anko
A toolkit for anomaly detection on 1D time series, built on numpy and scipy.
Conventional approaches based on statistical analysis have been implemented. Two main approaches are included:
- Normal Distribution
Data samples are assumed to be generated by a normal distribution, so anomalous data points can be identified by analysing the standard deviation.
- Fitting Ansatz
Data samples are fitted with several ansatzes, and anomalous data points are selected according to the residual.
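To make the first approach concrete, here is a minimal sketch of standard-deviation-based detection written with plain numpy. It is not anko's implementation: the function name `zscore_anomalies` and the default threshold of 3 standard deviations are assumptions chosen for illustration.

```python
import numpy as np


def zscore_anomalies(t, series, threshold=3.0):
    """Return (t, value) pairs lying more than `threshold` standard
    deviations away from the sample mean.

    Hypothetical helper for illustration only; not part of anko.
    """
    t = np.asarray(t, dtype=float)
    series = np.asarray(series, dtype=float)
    std = series.std()
    if std == 0:
        # A constant series has no spread, so nothing can be flagged.
        return []
    z = np.abs(series - series.mean()) / std
    mask = z > threshold
    return list(zip(t[mask].tolist(), series[mask].tolist()))


# Synthetic data: Gaussian noise around 10 with one injected outlier.
rng = np.random.RandomState(0)
t = np.arange(200)
series = rng.normal(loc=10.0, scale=1.0, size=200)
series[120] = 25.0

anomalies = zscore_anomalies(t, series)
```

The injected point at `t = 120` sits roughly ten standard deviations from the mean, so it is flagged regardless of the exact noise realisation.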
For model selection, models are chosen dynamically by performing a normality test and by computing the (Akaike/Bayesian) information criterion.
By default, the algorithm first tries to fit the data to a normal distribution, provided the data pass the normality test.
If this attempt fails to converge, or if the data did not pass the normality test in the first place,
the algorithm passes the data to the second method and tries each of the available fitting ansatzes.
The best-fitting ansatz is selected by the information criterion, and finally the algorithm picks out anomalous points according to the residual.
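The flow described above can be sketched with scipy as follows. This is an illustrative approximation, not anko's code: the two candidate models (`linear`, `quadratic`), the significance level `alpha`, the residual threshold, and the least-squares form of the AIC are all assumptions, and the convergence fallback from the normal-distribution branch is omitted for brevity.

```python
import numpy as np
from scipy import stats
from scipy.optimize import curve_fit


# Hypothetical candidate ansatzes; anko ships its own set of models.
def linear(x, a, b):
    return a * x + b


def quadratic(x, a, b, c):
    return a * x**2 + b * x + c


CANDIDATES = {"linear": linear, "quadratic": quadratic}


def aic(residuals, n_params):
    """AIC of a least-squares fit with Gaussian errors, up to a constant."""
    n = residuals.size
    rss = float(np.sum(residuals**2))
    return n * np.log(rss / n) + 2 * n_params


def detect(t, series, alpha=0.05, threshold=3.0):
    t = np.asarray(t, dtype=float)
    series = np.asarray(series, dtype=float)

    # Step 1: normality test (D'Agostino-Pearson).
    _, p_value = stats.normaltest(series)
    if p_value > alpha:
        # Normality not rejected: score points by distance from the mean.
        model = "normal_distribution"
        score = (series - series.mean()) / series.std()
    else:
        # Step 2: fit every candidate and keep the one with the lowest AIC.
        best = None
        for name, func in CANDIDATES.items():
            try:
                popt, _ = curve_fit(func, t, series)
            except RuntimeError:
                continue  # this candidate did not converge
            res = series - func(t, *popt)
            crit = aic(res, len(popt))
            if best is None or crit < best[0]:
                best = (crit, name, res)
        if best is None:
            raise RuntimeError("no candidate model converged")
        _, model, res = best
        score = res / res.std()

    # Step 3: flag points whose standardised score exceeds the threshold.
    mask = np.abs(score) > threshold
    return model, list(zip(t[mask].tolist(), series[mask].tolist()))


# Synthetic data: a noisy linear trend with one injected jump at t = 40.
rng = np.random.RandomState(1)
t = np.arange(100)
series = 2.0 * t + 1.0 + rng.normal(scale=0.5, size=100)
series[40] += 20.0

model, anomalies = detect(t, series)
```

Swapping `2 * n_params` for `n_params * np.log(n)` in `aic` gives the corresponding Bayesian information criterion, which penalises extra parameters more heavily on long series.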
click here to see all available methods.
Future development will also include methods based on machine learning techniques, such as isolation forest and support vector machine.
Requirements
- python >= 3.6.0
- numpy >= 1.16.4
- scipy >= 1.2.1
Installation
pip install anko
For the current release version, please refer to the anko page on PyPI.
Documentation
For details about the anko API, see the reference documentation.
Jupyter Notebook Tutorial (in dev)
Run anko_tutorial.ipynb in a local Jupyter Notebook, or host it on Google Colab.
Basic Usage
- Call AnomalyDetector
from anko.anomaly_detector import AnomalyDetector
agent = AnomalyDetector(t, series)
- Define policies and threshold values (optional)
agent.thres_params["linregress_res"] = 1.5
agent.apply_policies["z_normalization"] = True
agent.apply_policies["info_criterion"] = 'AIC'
For the use of AnomalyDetector.thres_params
and AnomalyDetector.apply_policies,
please refer to the documentation.
- Run check
check_result = agent.check()
The output check_result is of type CheckResult, which is essentially a dictionary with the following attributes:
model: 'increase_step_func'
popt: [220.3243250055105, 249.03846355234577, 74.00000107457113]
perr: [0.4247789247961187, 0.7166253174634686, 0.0]
anomalous_data: [(59, 209)]
residual: [10.050378152592119]
extra_info: ['Info: AnomalyDetector is using z normalization.', 'Info: There are more than 1 discontinuous points detected.']
- model (str): The best-fit model selected by the algorithm.
- popt (list): Estimated fitting parameters.
- perr (list): Corresponding errors of popt.
- anomalous_data (list[tuple(float, float)]): A list of anomalous data points (t, series(t)), or an empty list if all data points are in order.
- residual (list): Residuals of the anomalous data points.
- extra_info (list): All convergence errors, warnings, and informational messages raised during execution are stored here.
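As a rough illustration of how such a result might be consumed, the sketch below uses a plain dict filled with the sample values shown above as a stand-in for CheckResult. It assumes that CheckResult supports dictionary-style key access and that residual entries line up one-to-one with anomalous_data; neither is confirmed here, and `summarize` is a hypothetical helper, not part of anko.

```python
# Stand-in for a CheckResult, populated with the sample output above.
check_result = {
    "model": "increase_step_func",
    "popt": [220.3243250055105, 249.03846355234577, 74.00000107457113],
    "perr": [0.4247789247961187, 0.7166253174634686, 0.0],
    "anomalous_data": [(59, 209)],
    "residual": [10.050378152592119],
    "extra_info": [
        "Info: AnomalyDetector is using z normalization.",
        "Info: There are more than 1 discontinuous points detected.",
    ],
}


def summarize(result):
    """Build one human-readable line per anomalous point (hypothetical helper)."""
    lines = []
    for (t_i, value), res in zip(result["anomalous_data"], result["residual"]):
        lines.append(
            f"t={t_i}: value={value} (residual {res:.2f}, model {result['model']})"
        )
    return lines


report = summarize(check_result)
# report == ['t=59: value=209 (residual 10.05, model increase_step_func)']
```

An empty anomalous_data list yields an empty report, which matches the case where all data points are in order.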
Run Test
python -m unittest discover -s test -p '*_test.py'
or simply
make test