cesnet-datazoo

A toolkit for large network traffic datasets

  • 0.1.10
  • PyPI

The goal of this project is to provide tools for working with large network traffic datasets and to facilitate research in the traffic classification area. The core functions of the cesnet-datazoo package are:

  • A common API for downloading, configuring, and loading three public datasets of encrypted network traffic.
  • Extensive configuration options for:
    • Selection of train, validation, and test periods.
    • Selection of application classes and splitting classes between known and unknown.
    • Data transformations, such as feature scaling.
  • Data structures suited to experiments with large datasets, with several caching mechanisms that speed up repeated runs, for example when searching for the best model configuration.
  • Datasets are offered in multiple sizes so that users can start experiments at a smaller scale, with a faster download and lower disk usage. The default is the S size, which contains 25 million samples.
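
To make two of the configuration options above more concrete, here is a minimal, library-independent sketch of splitting application classes into known and unknown, and of scaling a feature with statistics fitted on the training split only. The function names, the toy data, and the "unknown" placeholder label are illustrative assumptions; none of them are part of the cesnet-datazoo API.

```python
from statistics import mean, pstdev

UNKNOWN_LABEL = "unknown"  # hypothetical placeholder label for unknown classes


def relabel_unknown(labels, known_classes):
    """Map every label outside known_classes to UNKNOWN_LABEL."""
    known = set(known_classes)
    return [label if label in known else UNKNOWN_LABEL for label in labels]


def fit_standard_scaler(train_values):
    """Return (mean, std) computed from the training values only."""
    mu = mean(train_values)
    sigma = pstdev(train_values)
    # Guard against constant features to avoid division by zero.
    if sigma == 0:
        sigma = 1.0
    return mu, sigma


def apply_standard_scaler(values, mu, sigma):
    """Standardize values with previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Toy data: application labels and one numeric feature.
train_labels = ["app_a", "app_b", "app_c", "app_a"]
train_feature = [100.0, 200.0, 300.0, 400.0]
test_feature = [250.0, 500.0]

relabeled = relabel_unknown(train_labels, known_classes=["app_a", "app_b"])
mu, sigma = fit_standard_scaler(train_feature)
scaled_test = apply_standard_scaler(test_feature, mu, sigma)

print(relabeled)
print(scaled_test)
```

Fitting the scaler on the training period and reusing the same statistics for validation and test data keeps information from later periods out of the model inputs.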

See the related project CESNET Models, which provides pre-trained neural networks for traffic classification.

Example Jupyter notebooks are available in the separate CESNET Traffic Classification Examples repository.

Datasets

The cesnet-datazoo package currently provides three datasets, detailed in the following table.

| Name | CESNET-TLS22 | CESNET-QUIC22 | CESNET-TLS-Year22 |
| --- | --- | --- | --- |
| Protocol | TLS | QUIC | TLS |
| Published in | 2022 | 2023 | 2023 |
| Collection duration | 2 weeks | 4 weeks | 1 year |
| Collection period | 4.10.2021 - 17.10.2021 | 31.10.2022 - 27.11.2022 | 1.1.2022 - 31.12.2022 |
| Application count | 191 | 102 | 180 |
| Available samples | 141392195 | 153226273 | 507739073 |
| Available dataset sizes | XS, S, M, L | XS, S, M, L | XS, S, M, L |
| Cite | https://doi.org/10.1016/j.comnet.2022.109467 | https://doi.org/10.1016/j.dib.2023.108888 | https://doi.org/10.1038/s41597-024-03927-4 |
| Zenodo URL | https://zenodo.org/record/7965515 | https://zenodo.org/record/7963302 | https://zenodo.org/records/10608607 |

Related papers: https://doi.org/10.23919/TMA58422.2023.10199052
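
As a rough guide to what starting small means, the sketch below computes the share of each dataset's available samples that the default S size would cover. It assumes the figure of 25 million samples stated above applies to the S size of every dataset; the sample counts are copied from the table.

```python
# Available samples per dataset, copied from the table above.
AVAILABLE_SAMPLES = {
    "CESNET-TLS22": 141_392_195,
    "CESNET-QUIC22": 153_226_273,
    "CESNET-TLS-Year22": 507_739_073,
}

# Assumption: the S size holds 25 million samples for every dataset.
S_SIZE_SAMPLES = 25_000_000


def s_size_fraction(total_samples, s_samples=S_SIZE_SAMPLES):
    """Return the fraction of the available samples covered by the S size."""
    return s_samples / total_samples


fractions = {name: s_size_fraction(total) for name, total in AVAILABLE_SAMPLES.items()}

for name, fraction in fractions.items():
    print(f"{name}: S size covers about {fraction:.1%} of available samples")
```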

Installation

Install the package from PyPI with pip:

pip install cesnet-datazoo

or, for an editable install from the Git repository, with:

pip install -e "git+https://github.com/CESNET/cesnet-datazoo#egg=cesnet-datazoo"
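
To confirm that the installation succeeded, one option is to query the installed distribution metadata with the standard library. This is a small sketch; the helper name `installed_version` is hypothetical, and the distribution name is assumed to match the name used in the pip command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


found = installed_version("cesnet-datazoo")
if found is None:
    print("cesnet-datazoo is not installed in this environment")
else:
    print(f"cesnet-datazoo version: {found}")
```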

Examples

Initialize dataset to create train, validation, and test dataframes

from cesnet_datazoo.datasets import CESNET_QUIC22
from cesnet_datazoo.config import DatasetConfig, AppSelection

# Point the dataset at a local data directory and choose the XS size.
dataset = CESNET_QUIC22("/datasets/CESNET-QUIC22/", size="XS")

# Treat all application classes as known and pick the train and test periods.
dataset_config = DatasetConfig(
    dataset=dataset,
    apps_selection=AppSelection.ALL_KNOWN,
    train_period_name="W-2022-44",
    test_period_name="W-2022-45",
)

# Apply the configuration and initialize the train, validation, and test sets.
dataset.set_dataset_config_and_initialize(dataset_config)

# Read each split into a pandas DataFrame.
train_dataframe = dataset.get_train_df()
val_dataframe = dataset.get_val_df()
test_dataframe = dataset.get_test_df()

The DatasetConfig class holds the dataset configuration, and calling set_dataset_config_and_initialize initializes the train, validation, and test sets with that configuration. Data can be read into pandas DataFrames, as shown here, or via PyTorch DataLoaders. See the CesnetDataset reference.
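
Once the splits are loaded, a common first check is how the samples are distributed across application classes. The sketch below is independent of cesnet-datazoo and works on any iterable of labels; the helper name `label_shares` and the toy labels are illustrative, and the column name "APP" mentioned in the comment is an assumption about the DataFrame layout.

```python
from collections import Counter


def label_shares(labels):
    """Return {label: fraction of samples}, most frequent label first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.most_common()}


# Toy labels standing in for a DataFrame column. With real data this might be
# label_shares(train_dataframe["APP"]), where "APP" is an assumed column name.
toy_labels = ["a", "b", "a", "c", "a", "b"]
shares = label_shares(toy_labels)

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

Comparing these shares between the train and test periods gives a quick view of how much the class balance shifts over time.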

See more examples in the documentation.

Papers

Acknowledgments

This project was supported by the Ministry of the Interior of the Czech Republic, grant No. VJ02010024: Flow-Based Encrypted Traffic Analysis.
