This work addresses the problem of ISPs selling network traffic data to streaming companies and, in response, anonymizes the counts of ports, protocols, and services for each connection flow. Anonymizing this information protects user privacy and prevents ISPs from exploiting or compromising their customers' data.
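As a minimal sketch of the idea, the snippet below perturbs per-category flow counts with two-sided geometric (discrete Laplace) noise, the mechanism suggested by baseline1-geometric.py; the epsilon value and the example counts are illustrative assumptions, not values used by the repository.

```python
import numpy as np

def two_sided_geometric_noise(epsilon, sensitivity=1.0, size=None, rng=None):
    """Sample two-sided geometric (discrete Laplace) noise for integer counts."""
    rng = np.random.default_rng() if rng is None else rng
    p = 1.0 - np.exp(-epsilon / sensitivity)   # success probability of each geometric draw
    # The difference of two i.i.d. geometric variables is two-sided geometric.
    return rng.geometric(p, size=size) - rng.geometric(p, size=size)

# Hypothetical per-flow counts (e.g., flows per destination port); values are illustrative only.
true_counts = np.array([120, 35, 7, 0, 981])
epsilon = 1.0                                   # privacy budget (assumed value)
noisy_counts = true_counts + two_sided_geometric_noise(epsilon, size=true_counts.shape)
noisy_counts = np.clip(noisy_counts, 0, None)   # post-processing: keep counts non-negative
print(noisy_counts)
```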
Use the package manager pip to install the dependencies listed in requirements.txt:
pip install -r requirements.txt
Datasets:
Canadian Institute for CyberSecurity
IP Network Traffic Flows Labeled
To run the scripts, follow these steps:
Make sure Python and the required libraries (installed above from requirements.txt) are available on your system.
Open a terminal or command prompt and navigate to the directory where the files are located.
Run the following file first:
preprocessing.py
This script performs the preprocessing steps required by every other script, so it must run first. Execute it with python preprocessing.py or python3 preprocessing.py in the terminal or command prompt (a sketch of the kind of aggregation this step might perform follows these instructions).
Monitor the execution process and check for any errors or output generated by the file.
After executing preprocessing.py, you can run the remaining files in any order based on your requirements.
baseline1-geometric.py
baseline2-log-laplace.py
baseline3-privbayes.py
baseline3-privbayes10.py
baseline3-privbayes11.py
baseline3-privbayes12.py
baseline3-privbayes2.py
baseline3-privbayes3.py
baseline3-privbayes4.py
baseline3-privbayes5.py
baseline3-privbayes6.py
baseline3-privbayes7.py
baseline3-privbayes8.py
baseline3-privbayes9.py
dpnettraffic.py
dpnettraffic_postprocessing.py
Note: The order of executing the remaining files doesn't matter after running preprocessing.py.
Depending on your environment, execute each file with python filename.py or python3 filename.py in the terminal or command prompt. The datasets can be found here.
Monitor the execution process and check for any errors or output generated by the files.
Following these steps ensures that preprocessing.py runs first, so the remaining scripts have the preprocessed data they need.
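As a rough, hedged illustration of the kind of aggregation the preprocessing step might perform (the file name and column names below are placeholders, not the actual schema used by preprocessing.py):

```python
import pandas as pd

# Placeholder file and column names; adjust to the actual dataset schema.
flows = pd.read_csv("flows.csv")

# Count how many connection flows use each destination port, protocol, and service.
port_counts = flows["dst_port"].value_counts()
protocol_counts = flows["protocol"].value_counts()
service_counts = flows["service"].value_counts()

# Save the aggregated counts so the privacy mechanisms can perturb them later.
port_counts.to_csv("port_counts.csv")
protocol_counts.to_csv("protocol_counts.csv")
service_counts.to_csv("service_counts.csv")
```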
To measure the processing time of each file, use the elapsed_times.py script:
elapsed_times.py
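The internals of that script are not reproduced here, but a minimal sketch of how per-script elapsed time could be measured looks like this (the timing approach and the chosen subset of scripts are assumptions, not necessarily what elapsed_times.py does):

```python
import subprocess
import sys
import time

# Subset of the repository's scripts, chosen for illustration.
scripts = ["baseline1-geometric.py", "baseline2-log-laplace.py", "dpnettraffic.py"]

for script in scripts:
    start = time.perf_counter()
    subprocess.run([sys.executable, script], check=True)   # run the script to completion
    elapsed = time.perf_counter() - start
    print(f"{script}: {elapsed:.2f} s")
```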
To process the results of our approach and the state-of-the-art algorithms (Mean Relative Error, Top-K, and Processing Time), run the following scripts:
results_elapsed_times.py
results_errors.py
results_pvalue.py
results_topk.py
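For reference, a hedged sketch of the two accuracy metrics (the formulas are the standard definitions of Mean Relative Error and Top-K overlap; the variable names and the sanity bound are assumptions and may differ from the result scripts):

```python
import numpy as np

def mean_relative_error(true_counts, noisy_counts, sanity_bound=1.0):
    """Mean Relative Error: |noisy - true| / max(true, sanity_bound), averaged over categories."""
    true_counts = np.asarray(true_counts, dtype=float)
    noisy_counts = np.asarray(noisy_counts, dtype=float)
    return np.mean(np.abs(noisy_counts - true_counts) / np.maximum(true_counts, sanity_bound))

def top_k_overlap(true_counts, noisy_counts, k=10):
    """Fraction of the true top-k categories that also appear in the noisy top-k."""
    true_top = set(np.argsort(true_counts)[::-1][:k])
    noisy_top = set(np.argsort(noisy_counts)[::-1][:k])
    return len(true_top & noisy_top) / k

# Illustrative values only.
true = np.array([981, 120, 35, 7, 3, 0])
noisy = np.array([975, 130, 30, 9, 1, 2])
print(mean_relative_error(true, noisy), top_k_overlap(true, noisy, k=3))
```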
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.