Contiki-NG-Based IoT Networks Exposed to NSL-KDD Dataset

Dataset and Citation Format:

Please use the following citation if you use the generated dataset for academic research:  

J. Liu, B. Kantarci, C. Adams, "Machine Learning-Driven Intrusion Detection for Contiki-NG-Based IoT Networks Exposed to NSL-KDD Dataset," ACM Workshop on Wireless Security and Machine Learning (WiseML), Linz, Austria (Virtual Event), July 2020 [invited].

Dataset Link: Click here

System Settings

In the simulations, with the settings presented in Fig. 1, we first implement regular traffic by randomly distributing legitimate nodes. As shown in Table 1, three kinds of legitimate nodes exist in the simulation: nodes that use UDP, nodes that use TCP, and a sink node that serves as both a TCP and a UDP server. To simulate realistic WSN traffic, each node randomly sends data whose size varies according to the protocol in use (TCP or UDP); these nodes are deployed at random locations.

Network packets are collected at the sink node by filtering on IP and MAC addresses. Seven attacks from the NSL-KDD dataset are implemented in Contiki-NG [1] and simulated using Cooja [2]. Once the packets are collected, the PCAP files are fed into a feature extractor to form the intrusion dataset.
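The filtering step at the sink can be sketched as follows. This is a minimal illustration, not the code used to build the dataset: the `Packet` record, its field names, and the rule that a packet is kept only when both its destination IP and destination MAC match the sink are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Packet:
    """Minimal stand-in for one parsed packet (field names are hypothetical)."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    protocol: str


def filter_sink_packets(
    packets: Iterable[Packet], sink_ip: str, sink_mac: str
) -> List[Packet]:
    """Keep packets whose destination IP and MAC both match the sink.

    Matching on the destination fields is an assumed rule, not one
    stated in the dataset description.
    """
    # ip_address() normalizes different spellings of the same IPv6 address.
    want_ip = ipaddress.ip_address(sink_ip)
    want_mac = sink_mac.lower()
    return [
        p
        for p in packets
        if ipaddress.ip_address(p.dst_ip) == want_ip
        and p.dst_mac.lower() == want_mac
    ]
```

Comparing parsed addresses rather than raw strings means that a compressed IPv6 address and its expanded form are treated as the same sink.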

Fig. 1 System Architecture

Table 1: IoT Network Node Types

Node Type         | Num | Description
------------------+-----+------------------------------------------------------------
Sink Node         | 1   | Aggregates network traffic and can receive both UDP and TCP traffic
Sensor Node (TCP) | 10  | Simulates a sensor node that collects environmental data and transfers it via TCP
Sensor Node (UDP) | 10  | Transfers data via UDP
Malicious Node    | 1   | Simulates the attacker that launches the different attacks
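As a rough illustration of the random deployment described above, the sketch below draws a position for each node listed in Table 1. The node counts come from the table; the square area, its side length, the seed, and the choice to place the malicious node at random as well are assumptions made for the example and are not taken from the simulation settings.

```python
import random
from typing import Dict, List, Tuple

# Node counts taken from Table 1.
NODE_COUNTS: Dict[str, int] = {
    "sink": 1,
    "sensor_tcp": 10,
    "sensor_udp": 10,
    "malicious": 1,
}


def place_nodes(
    area_side_m: float = 100.0, seed: int = 0
) -> Dict[str, List[Tuple[float, float]]]:
    """Draw a uniform random (x, y) position for every node.

    area_side_m and seed are hypothetical parameters; the original
    simulation area is not stated in the text.
    """
    rng = random.Random(seed)  # seeded so a layout can be reproduced
    return {
        node_type: [
            (rng.uniform(0.0, area_side_m), rng.uniform(0.0, area_side_m))
            for _ in range(count)
        ]
        for node_type, count in NODE_COUNTS.items()
    }
```

Calling `place_nodes()` twice with the same seed returns the same layout, which makes it easier to rerun an experiment on an identical topology.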

Attack Types

The NSL-KDD [3] attack techniques that we introduced into this dataset include:

The distribution of this dataset is shown in Fig. 2.

Fig. 2 Dataset Distributions

 

Features

The Customizable Network Intrusion Dataset Creator [4] is used in this work to extract features. To make it suitable for the IPv6, ICMPv6, and 6LoWPAN protocols, we modify the feature extraction module and add features related to the ICMPv6 protocol, IP addresses, and MAC addresses. In total, twenty-eight features are extracted from the network flow, and each entry also carries a label denoting the type of attack introduced through the corresponding packets. The extracted features can be categorized as follows:

 


1 Contiki-NG GitHub: https://github.com/contiki-ng/contiki-ng

2 Cooja Simulator GitHub: https://github.com/contiki-ng/cooja/tree/63538bbb882ba06a7b8cf97c11ce2fe4d22e4f88

3 NSL-KDD Dataset: https://www.unb.ca/cic/datasets/nsl.html

4 Customizable Network Intrusion Dataset Creator GitHub: https://github.com/nrajasin/Network-intrusion-dataset-creator