Orion is a machine learning library built for unsupervised time series anomaly detection.
An open source project from Data to AI Lab at MIT.
Important Links | |
---|---|
:computer: Website | Check out the Sintel Website for more information about the project. |
:book: Documentation | Quickstarts, User and Development Guides, and API Reference. |
:star: Tutorials | Check out our notebooks |
:octocat: Repository | The link to the GitHub Repository of this library. |
:scroll: License | The repository is published under the MIT License. |
Community | Join our Slack Workspace for announcements and discussions. |
Orion is a machine learning library built for unsupervised time series anomaly detection. Given time series data, we provide a number of "verified" ML pipelines (a.k.a. Orion pipelines) that identify rare patterns and flag them for expert review.
The library makes use of a number of automated machine learning tools developed under the Data to AI Lab at MIT.
Read about using an Orion pipeline on the NYC taxi dataset in a blog series:
Part 1: Learn about unsupervised time series anomaly detection | Part 2: Learn how we use GANs to solve the problem | Part 3: How does one evaluate anomaly detection pipelines? |
---|---|---|
Notebooks: Discover Orion through Colab by launching our notebooks!
The easiest and recommended way to install Orion is using pip:
pip install orion-ml
This will pull and install the latest stable release from PyPI.
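To confirm that the install step above succeeded, one option is to query the installed distribution's version from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is an assumption introduced for illustration and is not part of Orion.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "orion-ml" is the distribution name used in the pip command above.
    found = installed_version("orion-ml")
    print(found if found else "orion-ml is not installed in this environment")
```

The helper returns `None` rather than raising, so it can be used in a setup check without a try/except at the call site.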
In the following example we show how to use one of the Orion Pipelines.
We will load demo data for this example:
from orion.data import load_signal
train_data = load_signal('S-1-train')
train_data.head()
which should show a signal with `timestamp` and `value` columns.
timestamp value
0 1222819200 -0.366359
1 1222840800 -0.394108
2 1222862400 0.403625
3 1222884000 -0.362759
4 1222905600 -0.370746
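The spacing between consecutive timestamps tells you the sampling interval of the signal. The sketch below hard-codes the five rows printed above and assumes the `timestamp` column holds Unix epoch seconds, which the output suggests but this section does not state; `sampling_intervals` is a hypothetical helper, not an Orion function.

```python
# The five rows shown above, as (timestamp, value) pairs.
rows = [
    (1222819200, -0.366359),
    (1222840800, -0.394108),
    (1222862400, 0.403625),
    (1222884000, -0.362759),
    (1222905600, -0.370746),
]


def sampling_intervals(timestamps):
    """Return the differences between consecutive timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


timestamps = [timestamp for timestamp, _ in rows]
intervals = sampling_intervals(timestamps)

# Assumption: timestamps are in seconds, so dividing by 3600 gives hours.
hours = sorted({interval / 3600 for interval in intervals})

if __name__ == "__main__":
    print("intervals (s):", intervals)
    print("distinct intervals (h):", hours)
```

Under the epoch-seconds assumption, these five rows are evenly spaced six hours apart.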
In this example we use the `aer` pipeline and set some hyperparameters (in this case, 5 training epochs).
from orion import Orion
hyperparameters = {
    'orion.primitives.aer.AER#1': {
        'epochs': 5,
        'verbose': True
    }
}
orion = Orion(
    pipeline='aer',
    hyperparameters=hyperparameters
)
orion.fit(train_data)
Once it is fitted, we are ready to use it to detect anomalies in our incoming time series:
new_data = load_signal('S-1-new')
anomalies = orion.detect(new_data)
:warning: Depending on your system and the exact versions that you have installed, some WARNINGS may be printed. These can be safely ignored as they do not interfere with the proper behavior of the pipeline.
The output of the previous command will be a `pandas.DataFrame` containing a table of detected anomalies:
start end severity
0 1402012800 1403870400 0.122539
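For expert review, it can help to turn the `start` and `end` columns into readable dates. The following is a minimal sketch, assuming those columns are Unix epoch seconds in UTC (an assumption, not something stated here). It copies the single detected row above into a plain dictionary so that it runs without Orion installed; `describe_anomaly` is a hypothetical helper.

```python
from datetime import datetime, timezone

# The detected anomaly shown above, copied by hand.
anomalies = [
    {"start": 1402012800, "end": 1403870400, "severity": 0.122539},
]


def describe_anomaly(anomaly):
    """Convert epoch-second bounds to UTC datetimes and add the duration in hours."""
    start = datetime.fromtimestamp(anomaly["start"], tz=timezone.utc)
    end = datetime.fromtimestamp(anomaly["end"], tz=timezone.utc)
    return {
        "start": start,
        "end": end,
        "duration_hours": (end - start).total_seconds() / 3600,
        "severity": anomaly["severity"],
    }


described = [describe_anomaly(anomaly) for anomaly in anomalies]

if __name__ == "__main__":
    for item in described:
        print(item["start"].isoformat(), "to", item["end"].isoformat(),
              f"({item['duration_hours']:.0f} h, severity {item['severity']})")
```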
In every release, we run the Orion benchmark. We maintain an up-to-date leaderboard with the current scoring of the verified pipelines according to the benchmarking procedure.
We run the benchmark on 12 datasets with known ground truth and record the score of each pipeline on each dataset. To compute the leaderboard table, we count the number of datasets on which each pipeline outperforms the ARIMA pipeline.
Pipeline | Outperforms ARIMA |
---|---|
AER | 11 |
TadGAN | 7 |
LSTM Dynamic Thresholding | 9 |
LSTM Autoencoder | 6 |
Dense Autoencoder | 8 |
VAE | 6 |
AnomalyTransformer | 2 |
LNN | 7 |
Matrix Profile | 5 |
UniTS | 6 |
GANF | 5 |
Azure | 0 |
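The win count in the table above can be sketched as a per-dataset comparison against the ARIMA baseline. This is an illustration only: the pipeline names, dataset names, and scores below are made up, and it assumes that a higher score is better and that a tie does not count as a win. The actual benchmark metric and tie handling are not described in this section.

```python
def wins_over_baseline(scores, baseline="ARIMA"):
    """Count, per pipeline, the datasets where it scores strictly above the baseline."""
    baseline_scores = scores[baseline]
    wins = {}
    for pipeline, per_dataset in scores.items():
        if pipeline == baseline:
            continue
        wins[pipeline] = sum(
            1
            for dataset, score in per_dataset.items()
            if dataset in baseline_scores and score > baseline_scores[dataset]
        )
    return wins


# Hypothetical scores: pipeline -> dataset -> score (not real benchmark results).
example_scores = {
    "ARIMA": {"dataset_1": 0.50, "dataset_2": 0.40, "dataset_3": 0.70},
    "pipeline_a": {"dataset_1": 0.60, "dataset_2": 0.45, "dataset_3": 0.65},
    "pipeline_b": {"dataset_1": 0.50, "dataset_2": 0.30, "dataset_3": 0.80},
}

example_wins = wins_over_baseline(example_scores)

if __name__ == "__main__":
    for pipeline, count in example_wins.items():
        print(f"{pipeline} | {count} |")
```

With 12 datasets, as in the benchmark, the count for each pipeline would range from 0 to 12.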
You can find the scores of each pipeline on every signal recorded in the details Google Sheets document. The summarized results can also be browsed in the following summary Google Sheets document.
Additional resources that might be of interest:
If you use AER for your research, please consider citing the following paper:
Lawrence Wong, Dongyu Liu, Laure Berti-Equille, Sarah Alnegheimish, Kalyan Veeramachaneni. AER: Auto-Encoder with Regression for Time Series Anomaly Detection.
@inproceedings{wong2022aer,
title={AER: Auto-Encoder with Regression for Time Series Anomaly Detection},
author={Wong, Lawrence and Liu, Dongyu and Berti-Equille, Laure and Alnegheimish, Sarah and Veeramachaneni, Kalyan},
booktitle={2022 IEEE International Conference on Big Data (IEEE BigData)},
pages={1152-1161},
doi={10.1109/BigData55660.2022.10020857},
organization={IEEE},
year={2022}
}
If you use TadGAN for your research, please consider citing the following paper:
Alexander Geiger, Dongyu Liu, Sarah Alnegheimish, Alfredo Cuesta-Infante, Kalyan Veeramachaneni. TadGAN - Time Series Anomaly Detection Using Generative Adversarial Networks.
@inproceedings{geiger2020tadgan,
title={TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks},
author={Geiger, Alexander and Liu, Dongyu and Alnegheimish, Sarah and Cuesta-Infante, Alfredo and Veeramachaneni, Kalyan},
booktitle={2020 IEEE International Conference on Big Data (IEEE BigData)},
pages={33-43},
doi={10.1109/BigData50022.2020.9378139},
organization={IEEE},
year={2020}
}
If you use Orion, which is part of the Sintel ecosystem, for your research, please consider citing the following paper:
Sarah Alnegheimish, Dongyu Liu, Carles Sala, Laure Berti-Equille, Kalyan Veeramachaneni. Sintel: A Machine Learning Framework to Extract Insights from Signals.
@inproceedings{alnegheimish2022sintel,
title={Sintel: A Machine Learning Framework to Extract Insights from Signals},
author={Alnegheimish, Sarah and Liu, Dongyu and Sala, Carles and Berti-Equille, Laure and Veeramachaneni, Kalyan},
booktitle={Proceedings of the 2022 International Conference on Management of Data},
pages={1855–1865},
numpages={11},
publisher={Association for Computing Machinery},
doi={10.1145/3514221.3517910},
series={SIGMOD '22},
year={2022}
}
New `units` pipeline
`scikit-learn` and fix `MinMaxScaler` – Issue #596 by @sarahmish
`units` pipelines – Issue #595 by @sarahmish
Support for Python 3.10 and 3.11
`test_core` file – Issue #507 by @sarahmish
Support for Python 3.9 and new Matrix Profile pipeline
This version introduces a new dataset to the benchmark.
`azure` pipeline – Issue #436 by @sarahmish
This version uses the `ml-stars` package instead of `mlprimitives`.
`best_cost` in `find_anomalies` primitive – Issue #403 by @sarahmish
`lstm_dynamic_threshold_gpu` and `lstm_autoencoder_gpu` pipeline maintenance – Issue #373 by @sarahmish
`orion.evaluate` uses fails when fitting – Issue #384 by @sarahmish
`opencv` – Issue #372 by @sarahmish
`scikit-learn` – Issue #367 by @sarahmish
This version introduces several new enhancements:
`VAE`, a Variational AutoEncoder model.
`regression_errors` – Issue #352 by @dyuliu
`batch_size` cannot be changed – Issue #313 by @sarahmish
This version fixes some of the issues in the `aer`, `ae`, and `tadgan` pipelines.
`window_size` – Issue #300 by @sarahmish
This version introduces a new pipeline, namely `AER`, an AutoEncoder Regressor model.
This version deprecates support for `OrionDBExplorer`, which has been migrated to sintel. As a result, `Orion` no longer requires mongoDB as a dependency.
This version introduces improvements and more testing.
This version supports multivariate time series as input, in addition to minor improvements and maintenance.
`setuptools` no longer supports `lib2to3` breaking `mongoengine` - Issue #252 by @sarahmish
`window_size` - Issue #87 by @sarahmish
This version adds new features to the benchmark function: users can now save pipelines, view results as they are being calculated, and allow a single evaluation to be compared multiple times.
This version introduces two new pipelines: LSTM AE and Dense AE.
In addition to minor improvements, a bit of code refactoring took place to introduce a new primitive: `reconstruction_errors`.
`BENCHMARK.md` and the docs - Issue #148 by @sarahmish
This version includes the new style of documentation and a revamp of the `README.md`, in addition to some minor improvements in the benchmark code and primitives. This release includes the transfer of the `tadgan` pipeline to `verified`.
`timeseries_anomalies` unittests - Issue #136 by @sarahmish
`find_sequences` in converting series to arrays - Issue #135 by @sarahmish
Minor enhancements to benchmark
New benchmark and Azure primitive.
`find_anomalies` - Issue #101 by @sarahmish
New Evaluation sub-package and refactor TadGAN.
`README.md` and `HISTORY.md` - Issue #88 by @dyuliu
`score_anomaly` in Cyclegan primitive - Issue #86 by @dyuliu
`epoch` meaning in Cyclegan primitive - Issue #85 by @sarahmish
New class and function based interfaces.
First Orion release to PyPI: https://pypi.org/project/orion-ml/