Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/17480

Title: Empirical study of automatic dataset labelling
Authors: Aparicio-Navarro, Francisco J.
Kyriakopoulos, Konstantinos G.
Parish, David J.
Keywords: Automatic labelling
IEEE 802.11 datasets
Network traffic labelling
Unsupervised anomaly IDS
Issue Date: 2015
Publisher: © IEEE
Citation: APARICIO-NAVARRO, F.J., KYRIAKOPOULOS, K.G. and PARISH, D.J., 2015. Empirical study of automatic dataset labelling. IN: Proceedings of the 9th International Conference for Internet Technology and Secured Transactions, ICITST 2014, pp. 372 - 378.
Abstract: Correctly labelled datasets are commonly required. Three particular scenarios are highlighted, which showcase this need. One of these scenarios is when using supervised Intrusion Detection Systems (IDSs). These systems need labelled datasets for their training process. Also, the real nature of analysed datasets must be known when evaluating the efficiency of IDSs detecting intrusions. The third scenario is the use of feature selection that works only if the processed datasets are labelled. In normal conditions, collecting labelled datasets from real communication networks is impossible. In a previous work we developed a novel approach to automatically generate labelled network traffic datasets using an unsupervised anomaly based IDS. The approach was empirically proven to be an efficient unsupervised labelling approach. It was evaluated using a single dataset. This paper extends our previous work by using a greater number of datasets, gathered from a real IEEE 802.11 network testbed. The datasets are comprised of different wireless-specific attacks. This paper also proposes a new and more precise method to calculate the boundary threshold, used in the labelling process.
Description: © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Sponsor: This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) Grant number EP/K014307/1 and the MOD University Research Collaboration in Signal Processing.
Version: Accepted for publication
DOI: 10.1109/ICITST.2014.7038840
URI: https://dspace.lboro.ac.uk/2134/17480
Publisher Link: http://dx.doi.org/10.1109/ICITST.2014.7038840
ISBN: 9781908320391
Appears in Collections: Published Articles (Mechanical, Electrical and Manufacturing Engineering)

Files associated with this item:

File | Description | Size | Format
Empirical Study of Automatic Dataset Labelling.pdf | Accepted version | 1.47 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.