Mobile CrowdSensing Dataset

Dataset and citation format:

Please use the following citation if you use the generated dataset for academic research:

Chen, Zhiyan, Murat Simsek, and Burak Kantarci. "Region-Aware Bagging and Deep Learning-Based Fake Task Detection in Mobile Crowdsensing Platforms." GLOBECOM 2020 - 2020 IEEE Global Communications Conference. IEEE, 2020. (bibtex)

Dataset link: dataset

Dataset introduction

The dataset is generated with the CrowdSenSim simulation tool [1] and contains both legitimate tasks and fake tasks [2]. Each task has the following attributes: {'ID', 'latitude', 'longitude', 'day', 'hour', 'minute', 'duration', 'remaining time', 'battery requirement %', 'Coverage', 'legitimacy', 'GridNumber', 'OnpeakHour'}. 'latitude' and 'longitude' together specify the location of a task. 'day', 'hour' and 'minute' describe the time at which the task is published. 'duration' denotes how long the task stays active, in minutes. 'remaining time' denotes the residual time of a sensing task until its completion. 'battery requirement %' is the percentage of battery required to complete a task. 'Coverage' denotes the sensing distance of the task. 'legitimacy' indicates whether a task is legitimate or illegitimate (fake). This feature is used only for training the machine learning models, because the mobile crowdsensing (MCS) platform does not know whether a task is legitimate when it is submitted. 'GridNumber' is obtained by splitting the map of the sensing city into small grids, numbered from 1. 'OnpeakHour' is a binary flag that indicates whether the task start time falls between 7am and 11am. For simplicity in the simulations, 7am to 11am is defined as the peak period and all other hours as non-peak. Based on the task generation configuration in Table 1, the dataset contains 14,484 tasks in total: 12,587 legitimate tasks and 1,897 fake tasks.
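The two derived attributes can be illustrated with a minimal Python sketch. It is not the procedure used to build the dataset. It assumes that the hour value 11 belongs to the peak window, and the grid origin, cell size and number of columns (`lat_min`, `lon_min`, `cell_deg`, `n_cols`) are hypothetical parameters that the description above does not specify.

```python
PEAK_START_HOUR = 7   # from the description: the peak window starts at 7am
PEAK_END_HOUR = 11    # assumption: hour value 11 is still inside the window


def on_peak_hour(hour: int) -> int:
    """Return 1 if the publish hour falls in the 7am-11am window, else 0."""
    return 1 if PEAK_START_HOUR <= hour <= PEAK_END_HOUR else 0


def grid_number(lat, lon, lat_min, lon_min, cell_deg, n_cols):
    """Map a coordinate to a 1-based, row-major grid cell index.

    The origin (lat_min, lon_min), the cell size in degrees and the
    number of columns are hypothetical; the dataset page does not
    state how the city map was actually partitioned.
    """
    row = int((lat - lat_min) // cell_deg)
    col = int((lon - lon_min) // cell_deg)
    return row * n_cols + col + 1


# Illustrative values only, not taken from the dataset.
task = {"latitude": 1.2, "longitude": 0.7, "hour": 9}
task["OnpeakHour"] = on_peak_hour(task["hour"])
task["GridNumber"] = grid_number(
    task["latitude"], task["longitude"],
    lat_min=0.0, lon_min=0.0, cell_deg=0.5, n_cols=4,
)
```

Numbering the cells from 1 matches the description of 'GridNumber'; the row-major ordering is only one possible convention.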

Table 1. Task generation configuration

| Parameter | Fake Tasks | Legitimate Tasks |
|---|---|---|
| Day | Uniformly distributed in [1, 6] | Uniformly distributed in [1, 6] |
| Hour | 80%: 7am to 11am; 20%: 12pm to 5pm | 8%: 0am to 5am; 92%: 6pm to 23pm |
| Duration (min) | 70% in {40, 50, 60}; 30% in {10, 20, 30} | Uniformly distributed over {10, 20, 30, 40, 50, 60} |
| Battery usage | 80% in {7%-10%}; 20% in {1%-6%} | Uniformly distributed in {1%-10%} |
| Recruitment Radius | Uniformly distributed in 30m to 100m | Uniformly distributed in 30m to 100m |
| Movement Radius | [10m, 80m] | [10m, 80m] |
| Number of Tasks | 1,897 | 12,587 |
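As a rough illustration of how the Day, Duration and Battery usage rows of Table 1 could be sampled, here is a minimal Python sketch. It is not the CrowdSenSim generator. It assumes integer days and integer battery percentages, uniform sampling inside each percentage bucket, and a string encoding of 'legitimacy'; the function names are hypothetical.

```python
import random

FAKE, LEGIT = "fake", "legitimate"  # hypothetical label encoding


def sample_duration(kind, rng):
    """Draw a duration in minutes following the Duration row."""
    if kind == FAKE:
        # 70% from the long durations, 30% from the short ones.
        pool = [40, 50, 60] if rng.random() < 0.7 else [10, 20, 30]
        return rng.choice(pool)
    return rng.choice([10, 20, 30, 40, 50, 60])


def sample_battery(kind, rng):
    """Draw a battery requirement (%) following the Battery usage row."""
    if kind == FAKE:
        # 80% in 7%-10%, 20% in 1%-6% (assumed uniform within a bucket).
        low, high = (7, 10) if rng.random() < 0.8 else (1, 6)
        return rng.randint(low, high)
    return rng.randint(1, 10)


def sample_task(kind, rng):
    """Assemble one synthetic task from the sampled attributes."""
    return {
        "day": rng.randint(1, 6),  # uniform in [1, 6] for both classes
        "duration": sample_duration(kind, rng),
        "battery requirement %": sample_battery(kind, rng),
        "legitimacy": kind,
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
fake_tasks = [sample_task(FAKE, rng) for _ in range(2000)]
legit_tasks = [sample_task(LEGIT, rng) for _ in range(2000)]
```

With enough samples, about 70% of the fake tasks should have a duration of 40 minutes or more, while the legitimate tasks should be spread evenly over the six duration values.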

 

Further References

Reference 1 describes the CrowdSenSim simulation platform; Reference 2 is the work in which the threat of fake tasks was first introduced.

  1. Fiandrino, Claudio, et al. "CrowdSenSim: A Simulation Platform for Mobile Crowdsensing in Realistic Urban Environments." IEEE Access 5 (2017): 3490-3503.
  2. Zhang, Y., and B. Kantarci. "AI-Based Security Design of Mobile Crowdsensing Systems: Review, Challenges and Case Studies." 13th IEEE International Conference on Service-Oriented System Engineering (SOSE), San Francisco East Bay, CA, USA, Apr. 2019.