pycm

Multi-class confusion matrix library in Python

Maintainers: 3

PyCM: Python Confusion Matrix

Overview

PyCM is a multi-class confusion matrix library written in Python that accepts both input data vectors and a direct matrix. It is a tool for post-classification model evaluation that supports most class-level and overall statistics. PyCM is the Swiss Army knife of confusion matrices, aimed mainly at data scientists who need a broad array of metrics for predictive models and accurate evaluation of a wide variety of classifiers.

Fig1. ConfusionMatrix Block Diagram

Installation

⚠️ PyCM 3.9 is the last version to support Python 3.5

⚠️ PyCM 2.4 is the last version to support Python 2.7 & Python 3.4

⚠️ Plotting capability requires Matplotlib (>= 3.0.0) or Seaborn (>= 0.9.1)

PyPI

Check Python Packaging User Guide
Run pip install pycm==4.1

Source code

Download Version 4.1 or Latest Source
Run pip install .

Conda

Check Conda Managing Package
Update Conda using conda update conda
Run conda install -c sepandhaghighi pycm

MATLAB

Download and install MATLAB (>=8.5, 64/32 bit)
Download and install Python3.x (>=3.6, 64/32 bit)
- Select Add to PATH option
- Select Install pip option
Run pip install pycm
Configure Python interpreter

>> pyversion PYTHON_EXECUTABLE_FULL_PATH

Visit MATLAB Examples

Usage

From vector

>>> from pycm import *
>>> y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
>>> y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]
>>> cm = ConfusionMatrix(actual_vector=y_actu, predict_vector=y_pred)
>>> cm.classes
[0, 1, 2]
>>> cm.table
{0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}
>>> cm.print_matrix()
Predict 0       1       2       
Actual
0       3       0       0       

1       0       1       2       

2       2       1       3   

>>> cm.print_normalized_matrix()
Predict       0             1             2             
Actual
0             1.0           0.0           0.0           

1             0.0           0.33333       0.66667       

2             0.33333       0.16667       0.5          

>>> cm.stat(summary=True)
Overall Statistics : 

ACC Macro                                                         0.72222
F1 Macro                                                          0.56515
FPR Macro                                                         0.22222
Kappa                                                             0.35484
Overall ACC                                                       0.58333
PPV Macro                                                         0.56667
SOA1(Landis & Koch)                                               Fair
TPR Macro                                                         0.61111
Zero-one Loss                                                     5

Class Statistics :

Classes                                                           0             1             2             
ACC(Accuracy)                                                     0.83333       0.75          0.58333       
AUC(Area under the ROC curve)                                     0.88889       0.61111       0.58333       
AUCI(AUC value interpretation)                                    Very Good     Fair          Poor          
F1(F1 score - harmonic mean of precision and sensitivity)         0.75          0.4           0.54545       
FN(False negative/miss/type 2 error)                              0             2             3             
FP(False positive/type 1 error/false alarm)                       2             1             2             
FPR(Fall-out or false positive rate)                              0.22222       0.11111       0.33333       
N(Condition negative)                                             9             9             6             
P(Condition positive or support)                                  3             3             6             
POP(Population)                                                   12            12            12            
PPV(Precision or positive predictive value)                       0.6           0.5           0.6           
TN(True negative/correct rejection)                               7             8             4             
TON(Test outcome negative)                                        7             10            7             
TOP(Test outcome positive)                                        5             2             5             
TP(True positive/hit)                                             3             1             3             
TPR(Sensitivity, recall, hit rate, or true positive rate)         1.0           0.33333       0.5
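
The class and overall statistics printed above can be re-derived by hand from cm.table. The following is a plain-Python sketch, not PyCM's internal code; the helper names (class_counts, overall_accuracy, cohen_kappa) are illustrative assumptions, and the table is copied from the session above.

```python
# Sketch only: re-derive a few statistics from a table shaped like
# cm.table, i.e. table[actual][predicted] = count.
table = {0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}


def population(table):
    """Total number of observations in the table."""
    return sum(sum(row.values()) for row in table.values())


def class_counts(table, label):
    """Return (TP, FN, FP, TN) for one class in a one-vs-rest view."""
    tp = table[label][label]
    fn = sum(table[label].values()) - tp           # actual label, predicted other
    fp = sum(table[a][label] for a in table) - tp  # predicted label, actual other
    tn = population(table) - tp - fn - fp
    return tp, fn, fp, tn


def overall_accuracy(table):
    """Fraction of observations on the diagonal."""
    return sum(table[c][c] for c in table) / population(table)


def cohen_kappa(table):
    """Agreement corrected for chance: (observed - expected) / (1 - expected)."""
    observed = overall_accuracy(table)
    expected = sum(
        sum(table[c].values()) * sum(table[a][c] for a in table) for c in table
    ) / population(table) ** 2
    return (observed - expected) / (1 - expected)


tp, fn, fp, tn = class_counts(table, 1)
tpr_class_1 = tp / (tp + fn)  # sensitivity / recall
ppv_class_1 = tp / (tp + fp)  # precision
```

For class 1 this gives TP=1, FN=2, FP=1, TN=8, which agrees with the corresponding column in the class statistics above.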

Direct CM

>>> from pycm import *
>>> cm2 = ConfusionMatrix(matrix={"Class1": {"Class1": 1, "Class2": 2}, "Class2": {"Class1": 0, "Class2": 5}})
>>> cm2
pycm.ConfusionMatrix(classes: ['Class1', 'Class2'])
>>> cm2.classes
['Class1', 'Class2']
>>> cm2.print_matrix()
Predict      Class1       Class2       
Actual
Class1       1            2            

Class2       0            5            

>>> cm2.print_normalized_matrix()
Predict       Class1        Class2        
Actual
Class1        0.33333       0.66667       

Class2        0.0           1.0 

>>> cm2.stat(summary=True)
Overall Statistics : 

ACC Macro                                                         0.75
F1 Macro                                                          0.66667
FPR Macro                                                         0.33333
Kappa                                                             0.38462
Overall ACC                                                       0.75
PPV Macro                                                         0.85714
SOA1(Landis & Koch)                                               Fair
TPR Macro                                                         0.66667
Zero-one Loss                                                     2

Class Statistics :

Classes                                                           Class1        Class2        
ACC(Accuracy)                                                     0.75          0.75          
AUC(Area under the ROC curve)                                     0.66667       0.66667       
AUCI(AUC value interpretation)                                    Fair          Fair          
F1(F1 score - harmonic mean of precision and sensitivity)         0.5           0.83333       
FN(False negative/miss/type 2 error)                              2             0             
FP(False positive/type 1 error/false alarm)                       0             2             
FPR(Fall-out or false positive rate)                              0.0           0.66667       
N(Condition negative)                                             5             3             
P(Condition positive or support)                                  3             5             
POP(Population)                                                   8             8             
PPV(Precision or positive predictive value)                       1.0           0.71429       
TN(True negative/correct rejection)                               5             1             
TON(Test outcome negative)                                        7             1             
TOP(Test outcome positive)                                        1             7             
TP(True positive/hit)                                             1             5             
TPR(Sensitivity, recall, hit rate, or true positive rate)         0.33333       1.0

The matrix() and normalized_matrix() methods were renamed to print_matrix() and print_normalized_matrix() in version 1.5.

Activation threshold

The threshold parameter was added in version 0.9 for real-valued predictions. For more information, see Example3.
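
A minimal sketch of the idea, assuming that threshold is a function applied to every real-valued prediction before the matrix is built. The function name activation, the 0.7 cut-off, and the sample vectors are illustrative assumptions, not values from PyCM.

```python
# Sketch only: turn real-valued scores into class labels with an
# activation function, which is the role the threshold parameter plays.
def activation(value, cutoff=0.7):
    """Map a score to class 1 when it reaches the cut-off, else class 0."""
    return 1 if value >= cutoff else 0


actual = [0, 0, 1, 0]
scores = [0.87, 0.34, 0.9, 0.12]

# The labels a confusion matrix would be built from after thresholding.
predicted = [activation(score) for score in scores]
```

With PyCM installed, the same function would presumably be passed as the threshold argument of ConfusionMatrix together with the raw scores; see Example3 for the authoritative usage.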

Load from file

The file parameter was added in version 0.9.5 to load a saved confusion matrix in the .obj format generated by the save_obj method.

For more information visit Example4

Sample weights

The sample_weight parameter was added in version 1.2.

For more information visit Example5
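
Conceptually, a sample weight makes one observation count more (or less) than once in the table. The sketch below shows that idea in plain Python; it is an assumption about the general technique, not PyCM's implementation, and the helper name weighted_table and the data are hypothetical.

```python
from collections import defaultdict


def weighted_table(actual, predicted, weights):
    """Build table[actual][predicted], adding each observation's weight."""
    table = defaultdict(lambda: defaultdict(int))
    for a, p, w in zip(actual, predicted, weights):
        table[a][p] += w
    # Convert to plain dicts so the result prints and compares cleanly.
    return {a: dict(row) for a, row in table.items()}


example = weighted_table(actual=[0, 1, 1], predicted=[0, 1, 0], weights=[2, 1, 3])
```

Here the third observation (actual 1, predicted 0) contributes 3 to its cell instead of 1.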

Transpose

The transpose parameter was added in version 1.2 to transpose the input matrix (only in Direct CM mode).
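
Transposing is useful when a matrix arrives with predicted classes as the outer keys. A plain-Python sketch of the operation on the dict-of-dicts layout used in Direct CM mode follows; the helper name transpose_table is hypothetical.

```python
def transpose_table(table):
    """Swap the outer and inner keys of a square dict-of-dicts matrix."""
    labels = list(table)
    return {p: {a: table[a][p] for a in labels} for p in labels}


matrix = {"Class1": {"Class1": 1, "Class2": 2}, "Class2": {"Class1": 0, "Class2": 5}}
transposed = transpose_table(matrix)
```

Applying the function twice returns the original matrix.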

Relabel

The relabel method was added in version 1.5 to change the class names of a ConfusionMatrix.

>>> cm.relabel(mapping={0: "L1", 1: "L2", 2: "L3"})
>>> cm
pycm.ConfusionMatrix(classes: ['L1', 'L2', 'L3'])

Position

The position method was added in version 2.8 to find the indexes of the observations in predict_vector that produced TP, TN, FP, and FN.

>>> cm.position()
{0: {'FN': [], 'FP': [0, 7], 'TP': [1, 4, 9], 'TN': [2, 3, 5, 6, 8, 10, 11]}, 1: {'FN': [5, 10], 'FP': [3], 'TP': [6], 'TN': [0, 1, 2, 4, 7, 8, 9, 11]}, 2: {'FN': [0, 3, 7], 'FP': [5, 10], 'TP': [2, 8, 11], 'TN': [1, 4, 6, 9]}}
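
The grouping above can be reproduced with a short loop over the two vectors. This is a sketch of the idea rather than PyCM's code; the helper name positions is hypothetical, and the vectors are the y_actu and y_pred from the "From vector" example.

```python
def positions(actual, predicted, label):
    """Group observation indexes into TP, TN, FP, FN for one class."""
    result = {"TP": [], "TN": [], "FP": [], "FN": []}
    for index, (a, p) in enumerate(zip(actual, predicted)):
        if p == label:
            key = "TP" if a == label else "FP"
        else:
            key = "FN" if a == label else "TN"
        result[key].append(index)
    return result


y_actu = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]
class_0 = positions(y_actu, y_pred, 0)
```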

To array

The to_array method was added in version 2.9 to return the confusion matrix as a NumPy array. This is helpful for applying operations such as aggregation, normalization, and combination to the confusion matrix.

>>> cm.to_array()
array([[3, 0, 0],
       [0, 1, 2],
       [2, 1, 3]])
>>> cm.to_array(normalized=True)
array([[1.     , 0.     , 0.     ],
       [0.     , 0.33333, 0.66667],
       [0.33333, 0.16667, 0.5    ]])
>>> cm.to_array(normalized=True, one_vs_all=True, class_name="L1")
array([[1.     , 0.     ],
       [0.22222, 0.77778]])
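
The one-vs-all output above collapses the matrix to [[TP, FN], [FP, TN]] for the chosen class and then normalizes each row. The sketch below re-derives the numbers with plain lists instead of NumPy; the helper names are hypothetical and the table is the relabeled matrix from the earlier examples.

```python
def one_vs_all(table, label):
    """Collapse a multi-class table to [[TP, FN], [FP, TN]] for one class."""
    total = sum(sum(row.values()) for row in table.values())
    tp = table[label][label]
    fn = sum(table[label].values()) - tp
    fp = sum(table[a][label] for a in table) - tp
    tn = total - tp - fn - fp
    return [[tp, fn], [fp, tn]]


def normalize_rows(matrix, digits=5):
    """Divide every row by its sum, leaving all-zero rows untouched."""
    normalized = []
    for row in matrix:
        row_sum = sum(row)
        normalized.append([round(v / row_sum, digits) if row_sum else 0.0 for v in row])
    return normalized


table = {
    "L1": {"L1": 3, "L2": 0, "L3": 0},
    "L2": {"L1": 0, "L2": 1, "L3": 2},
    "L3": {"L1": 2, "L2": 1, "L3": 3},
}
binary = one_vs_all(table, "L1")
binary_normalized = normalize_rows(binary)
```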

Combine

The combine method was added in version 3.0 to merge two confusion matrices. This is useful in mini-batch learning.

>>> cm_combined = cm2.combine(cm3)
>>> cm_combined.print_matrix()
Predict      Class1       Class2       
Actual
Class1       2            4            

Class2       0            10
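
Merging amounts to an element-wise sum of the two tables. The snippet above does not define cm3; the sketch below assumes a second batch with the same counts as cm2 from the Direct CM example, which is consistent with the printed result. The helper name combine_tables is hypothetical.

```python
def combine_tables(first, second):
    """Element-wise sum of two tables that share the same class labels."""
    labels = list(first)
    if set(labels) != set(second):
        raise ValueError("both tables must use the same classes")
    return {a: {p: first[a][p] + second[a][p] for p in labels} for a in labels}


batch_1 = {"Class1": {"Class1": 1, "Class2": 2}, "Class2": {"Class1": 0, "Class2": 5}}
batch_2 = {"Class1": {"Class1": 1, "Class2": 2}, "Class2": {"Class1": 0, "Class2": 5}}  # assumed
combined = combine_tables(batch_1, batch_2)
```

In mini-batch learning, each batch's table can be folded into a running total this way.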

Plot

The plot method was added in version 3.0 to plot a confusion matrix using Matplotlib or Seaborn.

>>> cm.plot()

>>> from matplotlib import pyplot as plt
>>> cm.plot(cmap=plt.cm.Greens, number_label=True, plot_lib="matplotlib")

>>> cm.plot(cmap=plt.cm.Reds, normalized=True, number_label=True, plot_lib="seaborn")

ROC curve

ROCCurve, added in version 3.7, computes the Receiver Operating Characteristic (ROC) curve, a graphical representation of a binary classifier's performance. The Y axis represents the True Positive Rate and the X axis represents the False Positive Rate, so the ideal point is at the top left of the plot and a larger area under the curve indicates better performance. In PyCM, ROCCurve binarizes the output with the "One vs. Rest" strategy to extend ROC to multi-class classifiers. Given the vector of actual labels, the probability estimates for each class, and the ordered list of class labels, it computes and plots TPR-FPR pairs for different discrimination thresholds and computes the area under the ROC curve.

>>> import numpy as np
>>> crv = ROCCurve(actual_vector=np.array([1, 1, 2, 2]), probs=np.array([[0.1, 0.9], [0.4, 0.6], [0.35, 0.65], [0.8, 0.2]]), classes=[2, 1])
>>> crv.thresholds
[0.1, 0.2, 0.35, 0.4, 0.6, 0.65, 0.8, 0.9]
>>> auc_trp = crv.area()
>>> auc_trp[1]
0.75
>>> auc_trp[2]
0.75
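
To make the one-vs-rest idea concrete, the sketch below sweeps thresholds for a single class and integrates the resulting points with the trapezoidal rule. It is a simplified illustration, not PyCM's implementation: it uses only the scores of the chosen class as thresholds, and the helper names are hypothetical. The data is taken from the example above.

```python
def roc_points(actual, scores, positive):
    """Return (FPR, TPR) pairs, starting at (0, 0), for decreasing thresholds."""
    total_pos = sum(1 for a in actual if a == positive)
    total_neg = len(actual) - total_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == positive)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a != positive)
        points.append((fp / total_neg, tp / total_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as (x, y) pairs."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


actual = [1, 1, 2, 2]
probs = [[0.1, 0.9], [0.4, 0.6], [0.35, 0.65], [0.8, 0.2]]
classes = [2, 1]  # column order of probs

column = classes.index(1)                 # one-vs-rest: class 1 against the rest
scores = [row[column] for row in probs]
auc_class_1 = trapezoid_area(roc_points(actual, scores, positive=1))
```

For this data the sketch arrives at the same area for class 1 as the session above.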

Precision-Recall curve

PRCurve, added in version 3.7, computes the Precision-Recall curve, a graphical representation of a binary classifier's performance in which the Y axis represents the Precision and the X axis represents the Recall. The ideal point is therefore at the top right of the plot, and a larger area under the curve indicates better performance. In PyCM, PRCurve binarizes the output with the "One vs. Rest" strategy to extend this curve to multi-class classifiers. Given the vector of actual labels, the probability estimates for each class, and the ordered list of class labels, it computes and plots Precision-Recall pairs for different discrimination thresholds and computes the area under the curve.

>>> import numpy as np
>>> crv = PRCurve(actual_vector=np.array([1, 1, 2, 2]), probs=np.array([[0.1, 0.9], [0.4, 0.6], [0.35, 0.65], [0.8, 0.2]]), classes=[2, 1])
>>> crv.thresholds
[0.1, 0.2, 0.35, 0.4, 0.6, 0.65, 0.8, 0.9]
>>> auc_trp = crv.area()
>>> auc_trp[1]
0.29166666666666663
>>> auc_trp[2]
0.29166666666666663
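
The Precision-Recall pairs themselves can be sketched in a few lines for one class. This is an illustration of the general technique only: the helper name pr_points is hypothetical, and the sketch does not attempt to reproduce PyCM's threshold set or its area calculation.

```python
def pr_points(actual, scores, positive):
    """Return (recall, precision) pairs for decreasing thresholds."""
    total_pos = sum(1 for a in actual if a == positive)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == positive)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a != positive)
        # tp + fp is at least 1 because the threshold is one of the scores.
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


actual = [1, 1, 2, 2]
scores_class_1 = [0.9, 0.6, 0.65, 0.2]  # probability of class 1 for each sample
pairs = pr_points(actual, scores_class_1, positive=1)
```

Lowering the threshold never decreases recall, while precision can move in either direction.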

Parameter recommender

This option was added in version 1.9 to recommend the most relevant parameters given the characteristics of the input dataset, such as whether it is balanced or imbalanced and binary or multi-class. The suggestions fall into three main groups: imbalanced dataset, binary classification for a balanced dataset, and multi-class classification for a balanced dataset. The recommendation lists were compiled from the paper that introduced each parameter and the capabilities claimed in it.

>>> cm.imbalance
False
>>> cm.binary
False
>>> cm.recommended_list
['MCC', 'TPR Micro', 'ACC', 'PPV Macro', 'BCD', 'Overall MCC', 'Hamming Loss', 'TPR Macro', 'Zero-one Loss', 'ERR', 'PPV Micro', 'Overall ACC']

The is_imbalanced parameter was added in version 3.3 so that the user can indicate whether the dataset is imbalanced. If the user does not provide it, the automatic detection algorithm is used.

>>> cm = ConfusionMatrix(y_actu, y_pred, is_imbalanced=True)
>>> cm.imbalance
True
>>> cm = ConfusionMatrix(y_actu, y_pred, is_imbalanced=False)
>>> cm.imbalance
False
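
The override behavior described above can be sketched as follows. The detection rule shown here (largest class support divided by smallest, compared with a limit of 3) is a placeholder assumption for illustration and may differ from PyCM's actual algorithm; the function names are hypothetical as well.

```python
def looks_imbalanced(support, ratio_limit=3.0):
    """Placeholder heuristic: compare the largest and smallest class support."""
    counts = list(support.values())
    smallest, largest = min(counts), max(counts)
    ratio = largest / smallest if smallest else float("inf")
    return ratio > ratio_limit


def resolve_imbalance(support, is_imbalanced=None):
    """Use the user's flag when given, otherwise fall back to detection."""
    if is_imbalanced is None:
        return looks_imbalanced(support)
    return bool(is_imbalanced)


support = {0: 3, 1: 3, 2: 6}  # P (condition positive) per class from the example
auto = resolve_imbalance(support)
forced = resolve_imbalance(support, is_imbalanced=True)
```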

Compare

Version 2.0 introduced a method for comparing several confusion matrices. It combines several overall and class-based benchmarks. Each benchmark rates the performance of the classification algorithm from good to poor and assigns it a numeric score: 1 for good performance and 0 for poor performance.

Two scores are then calculated for each confusion matrix: overall and class-based. The overall score is the average of the scores of seven overall benchmarks: Landis & Koch, Cramer, Matthews, Goodman-Kruskal's Lambda A, Goodman-Kruskal's Lambda B, Krippendorff's Alpha, and Pearson's C. Likewise, the class-based score is the average of the scores of six class-based benchmarks: Positive Likelihood Ratio Interpretation, Negative Likelihood Ratio Interpretation, Discriminant Power Interpretation, AUC value Interpretation, Matthews Correlation Coefficient Interpretation, and Yule's Q Interpretation. Note that if a benchmark returns none for one of the classes, that benchmark is excluded from the averaging. If the user sets weights for the classes, the average of the class-based benchmark scores becomes a weighted average.

If the user sets the by_class boolean input to True, the best confusion matrix is the one with the maximum class-based score. Otherwise, a confusion matrix is reported as the best only if it obtains the maximum of both the overall and class-based scores; in any other case, the Compare object does not select a best confusion matrix.

>>> cm2 = ConfusionMatrix(matrix={0: {0: 2, 1: 50, 2: 6}, 1: {0: 5, 1: 50, 2: 3}, 2: {0: 1, 1: 7, 2: 50}})
>>> cm3 = ConfusionMatrix(matrix={0: {0: 50, 1: 2, 2: 6}, 1: {0: 50, 1: 5, 2: 3}, 2: {0: 1, 1: 55, 2: 2}})
>>> cp = Compare({"cm2": cm2, "cm3": cm3})
>>> print(cp)
Best : cm2

Rank  Name   Class-Score       Overall-Score
1     cm2    0.50278           0.58095
2     cm3    0.33611           0.52857

>>> cp.best
pycm.ConfusionMatrix(classes: [0, 1, 2])
>>> cp.sorted
['cm2', 'cm3']
>>> cp.best_name
'cm2'
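
The selection rule described in this section can be written down directly. The sketch below takes the two scores per matrix as given (here copied from the printed table) and ignores ties; the helper name pick_best is hypothetical, and this is not the Compare class itself.

```python
def pick_best(scores, by_class=False):
    """scores maps a name to (class_score, overall_score).

    Returns the best name, or None when no matrix wins on both scores
    and by_class is False.
    """
    best_class = max(scores, key=lambda name: scores[name][0])
    if by_class:
        return best_class
    best_overall = max(scores, key=lambda name: scores[name][1])
    return best_class if best_class == best_overall else None


reported = {"cm2": (0.50278, 0.58095), "cm3": (0.33611, 0.52857)}
best = pick_best(reported)

# A made-up case where the two scores disagree, so no best matrix is chosen.
split = {"a": (0.9, 0.1), "b": (0.2, 0.8)}
undecided = pick_best(split)
```

With by_class=True the disagreement no longer matters, and the matrix with the higher class-based score is returned.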

Multilabel confusion matrix

From version 4.0, MultiLabelCM has been added to calculate class-wise or sample-wise multilabel confusion matrices. In class-wise mode, confusion matrices are calculated for each class, and in sample-wise mode, they are generated per sample. All generated confusion matrices are binarized with a one-vs-rest transformation.

>>> mlcm = MultiLabelCM(actual_vector=[{"cat", "bird"}, {"dog"}], predict_vector=[{"cat"}, {"dog", "bird"}], classes=["cat", "dog", "bird"])
>>> mlcm.actual_vector_multihot
[[1, 0, 1], [0, 1, 0]]
>>> mlcm.predict_vector_multihot
[[1, 0, 0], [0, 1, 1]]
>>> mlcm.get_cm_by_class("cat").print_matrix()
Predict 0       1       
Actual
0       1       0       

1       0       1       

>>> mlcm.get_cm_by_sample(0).print_matrix()
Predict 0       1       
Actual
0       1       0       

1       1       1
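
Both modes reduce to counting pairs of 0/1 values taken from the multi-hot vectors: a column for class-wise mode, a row for sample-wise mode. The following plain-Python sketch reproduces the tables above; the helper names are hypothetical and this is not the MultiLabelCM implementation.

```python
def to_multihot(label_sets, classes):
    """Encode each set of labels as a 0/1 list ordered like classes."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def binary_table(actual_bits, predicted_bits):
    """Count (actual, predicted) pairs of 0/1 values into a 2x2 table."""
    table = {0: {0: 0, 1: 0}, 1: {0: 0, 1: 0}}
    for a, p in zip(actual_bits, predicted_bits):
        table[a][p] += 1
    return table


classes = ["cat", "dog", "bird"]
actual = to_multihot([{"cat", "bird"}, {"dog"}], classes)
predicted = to_multihot([{"cat"}, {"dog", "bird"}], classes)

# Class-wise: compare one column (one class) across all samples.
cat = classes.index("cat")
by_class_cat = binary_table([row[cat] for row in actual], [row[cat] for row in predicted])

# Sample-wise: compare one row (one sample) across all classes.
by_sample_0 = binary_table(actual[0], predicted[0])
```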

Online help

The online_help function was added in version 1.1 to open the definition of each statistic in a web browser.

>>> from pycm import online_help
>>> online_help("J")
>>> online_help("SOA1(Landis & Koch)")
>>> online_help(2)

The list of items is available by calling online_help() (without arguments).
If the PyCM website is not available, set alt_link = True (new in version 2.4).

Try PyCM in your browser!

PyCM can be used online in interactive Jupyter Notebooks via the Binder or Colab services.

Check Examples in Document folder

Issues & bug reports

File an issue and describe it. We'll check it ASAP!
- Please complete the issue template
Discord : https://discord.com/invite/zqpU2b3J3f
Website : https://www.pycm.io
Mailing List : https://mail.python.org/mailman3/lists/pycm.python.org/
Email : info@pycm.io

Acknowledgments

NLnet foundation has supported the PyCM project from version 3.6 to 4.0 through the NGI Assure Fund. This fund is set up by NLnet foundation with funding from the European Commission's Next Generation Internet program, administered by DG Communications Networks, Content, and Technology under grant agreement No 957073.

The Python Software Foundation (PSF) partially funded version 3.7 of the PyCM library. The PSF is the organization behind Python. Its mission is to promote, protect, and advance the Python programming language and to support and facilitate the growth of a diverse and international community of Python programmers.

Some parts of the infrastructure for this project are supported by:

Cite

If you use PyCM in your research, we would appreciate citations to the following paper:

Haghighi, S., Jasemi, M., Hessabi, S. and Zolanvari, A. (2018). PyCM: Multiclass confusion matrix library in Python. Journal of Open Source Software, 3(25), p.729.


@article{Haghighi2018,
  doi = {10.21105/joss.00729},
  url = {https://doi.org/10.21105/joss.00729},
  year  = {2018},
  month = {may},
  publisher = {The Open Journal},
  volume = {3},
  number = {25},
  pages = {729},
  author = {Sepand Haghighi and Masoomeh Jasemi and Shaahin Hessabi and Alireza Zolanvari},
  title = {{PyCM}: Multiclass confusion matrix library in Python},
  journal = {Journal of Open Source Software}
}

Download PyCM.bib

Show your support

Star this repo

Give a ⭐️ if this project helped you!

Donate to our project

If you like our project, and we hope that you do, please consider supporting us. Our project is not, and never will be, run for profit. We need the money only so that we can continue doing what we do ;-).

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

Unreleased

4.1 - 2024-10-17

Added

5 new distance/similarity
1. KoppenI
2. KoppenII
3. KuderRichardson
4. KuhnsI
5. KuhnsII
feature_request.yml template
config.yml for issue template
SECURITY.md

Changed

Bug report template modified
thresholds_calc function updated
__midpoint_numeric_integral__ function updated
__trapezoidal_numeric_integral__ function updated
Diagrams updated
Document modified
Document build system updated
AUTHORS.md updated
README.md modified
Test system modified
Python 3.12 added to test.yml
Python 3.13 added to test.yml
Warning and error messages updated
pycm_util.py renamed to utils.py
pycm_test.py renamed to basic_test.py
pycm_profile.py renamed to profile.py
pycm_param.py renamed to params.py
pycm_overall_func.py renamed to overall_funcs.py
pycm_output.py renamed to output.py
pycm_obj.py renamed to cm.py
pycm_multilabel_cm.py renamed to multilabel_cm.py
pycm_interpret.py renamed to interpret.py
pycm_handler.py renamed to handlers.py
pycm_error.py renamed to errors.py
pycm_distance.py renamed to distance.py
pycm_curve.py renamed to curve.py
pycm_compare.py renamed to compare.py
pycm_class_func.py renamed to class_funcs.py
pycm_ci.py renamed to ci.py

4.0 - 2023-06-07

Added

pycmMultiLabelError class
MultiLabelCM class
get_cm_by_class method
get_cm_by_sample method
__mlcm_vector_handler__ function
__mlcm_assign_classes__ function
__mlcm_vectors_filter__ function
__set_to_multihot__ function
deprecated function

Changed

Document modified
README.md modified
Example-4 modified
Test system modified
Python 3.5 support dropped

3.9 - 2023-05-01

Added

OVERALL_PARAMS dictionary
__imbalancement_handler__ function
vector_serializer function
NPV micro/macro
log_loss method
23 new distance/similarity
1. Dennis
2. Digby
3. Dispersion
4. Doolittle
5. Eyraud
6. Fager & McGowan
7. Faith
8. Fleiss-Levin-Paik
9. Forbes I
10. Forbes II
11. Fossum
12. Gilbert & Wells
13. Goodall
14. Goodman & Kruskal's Lambda
15. Goodman & Kruskal Lambda-r
16. Guttman's Lambda A
17. Guttman's Lambda B
18. Hamann
19. Harris & Lahey
20. Hawkins & Dotson
21. Kendall's Tau
22. Kent & Foster I
23. Kent & Foster II

Changed

metrics_off parameter added to ConfusionMatrix __init__ method
CLASS_PARAMS changed to a dictionary
Code style modified
sort parameter added to relabel method
Document modified
CONTRIBUTING.md updated
codecov removed from dev-requirements.txt
Test system modified

3.8 - 2023-02-01

Added

distance method
__contains__ method
__getitem__ method
Goodman-Kruskal's Lambda A benchmark
Goodman-Kruskal's Lambda B benchmark
Krippendorff's Alpha benchmark
Pearson's C benchmark
30 new distance/similarity
1. AMPLE
2. Anderberg's D
3. Andres & Marzo's Delta
4. Baroni-Urbani & Buser I
5. Baroni-Urbani & Buser II
6. Batagelj & Bren
7. Baulieu I
8. Baulieu II
9. Baulieu III
10. Baulieu IV
11. Baulieu V
12. Baulieu VI
13. Baulieu VII
14. Baulieu VIII
15. Baulieu IX
16. Baulieu X
17. Baulieu XI
18. Baulieu XII
19. Baulieu XIII
20. Baulieu XIV
21. Baulieu XV
22. Benini I
23. Benini II
24. Canberra
25. Clement
26. Consonni & Todeschini I
27. Consonni & Todeschini II
28. Consonni & Todeschini III
29. Consonni & Todeschini IV
30. Consonni & Todeschini V

Changed

relabel method sort bug fixed
README.md modified
Compare overall benchmarks default weights updated
Document modified
Test system modified

3.7 - 2022-12-15

Added

Curve class
ROCCurve class
PRCurve class
pycmCurveError class

Changed

CONTRIBUTING.md updated
matrix_params_calc function optimized
README.md modified
Document modified
Test system modified
Python 3.11 added to test.yml

3.6 - 2022-08-17

Added

Hamming distance
Braun-Blanquet similarity

Changed

classes parameter added to matrix_params_from_table function
Matrices with numpy.integer elements are now accepted
Arrays added to matrix parameter accepting formats
Website changed to http://www.pycm.io
Document modified
README.md modified

3.5 - 2022-04-27

Added

Anaconda workflow
Custom iterating setting
Custom casting setting

Changed

plot method updated
class_statistics function modified
overall_statistics function modified
BCD_calc function modified
CONTRIBUTING.md updated
CODE_OF_CONDUCT.md updated
Document modified

3.4 - 2022-01-26

Added

Colab badge
Discord badge
brier_score method

Changed

J (Jaccard index) section in Document.ipynb updated
save_obj method updated
Python 3.10 added to test.yml
Example-3 updated
Docstrings of the functions updated
CONTRIBUTING.md updated

3.3 - 2021-10-27

Added

__compare_weight_handler__ function

Changed

is_imbalanced parameter added to ConfusionMatrix __init__ method
class_benchmark_weight and overall_benchmark_weight parameters added to Compare __init__ method
statistic_recommend function modified
Compare weight parameter renamed to class_weight
Document modified
License updated
AUTHORS.md updated
README.md modified
Block diagrams updated

3.2 - 2021-08-11

Added

classes_filter function

Changed

classes parameter added to matrix_params_calc function
classes parameter added to __obj_vector_handler__ function
classes parameter added to ConfusionMatrix __init__ method
name parameter removed from html_init function
shortener parameter added to html_table function
shortener parameter added to save_html method
Document modified
HTML report modified

3.1 - 2021-03-11

Added

requirements-splitter.py
sensitivity_index method

Changed

Test system modified
overall_statistics function modified
HTML report modified
Document modified
References format updated
CONTRIBUTING.md updated

3.0 - 2020-10-26

Added

plot_test.py
axes_gen function
add_number_label function
plot method
combine method
matrix_combine function

Changed

Document modified
README.md modified
Example-2 deprecated
Example-7 deprecated
Error messages modified

2.9 - 2020-09-23

Added

notebook_check.py
to_array method
__copy__ method
copy method

Changed

average method refactored

2.8 - 2020-07-09

Added

label_map attribute
positions attribute
position method
Krippendorff's Alpha
Aickin's Alpha
weighted_alpha method

Changed

Single class bug fixed
CLASS_NUMBER_ERROR error type changed to pycmMatrixError
relabel method bug fixed
Document modified
README.md modified

2.7 - 2020-05-11

Added

average method
weighted_average method
weighted_kappa method
pycmAverageError class
Bangdiwala's B
MATLAB examples
Github action

Changed

Document modified
README.md modified
relabel method bug fixed
sparse_table_print function bug fixed
matrix_check function bug fixed
Minor bug in Compare class fixed
Class names mismatch bug fixed

2.6 - 2020-03-25

Added

custom_rounder function
complement function
sparse_matrix attribute
sparse_normalized_matrix attribute
Net benefit (NB)
Yule's Q interpretation (QI)
Adjusted Rand index (ARI)
TNR micro/macro
FPR micro/macro
FNR micro/macro

Changed

sparse parameter added to print_matrix,print_normalized_matrix and save_stat methods
header parameter added to save_csv method
Handler functions moved to pycm_handler.py
Error objects moved to pycm_error.py
Verified tests references updated
Verified tests moved to verified_test.py
Test system modified
CONTRIBUTING.md updated
Namespace optimized
README.md modified
Document modified
print_normalized_matrix method modified
normalized_table_calc function modified
setup.py modified
summary mode updated
Dockerfile updated
Python 3.8 added to .travis.yaml and appveyor.yml

Removed

PC_PI_calc function

2.5 - 2019-10-16

Added

__version__ variable
Individual classification success index (ICSI)
Classification success index (CSI)
Example-8 (Confidence interval)
install.sh
autopep8.sh
Dockerfile
CI method (supported statistics : ACC,AUC,Overall ACC,Kappa,TPR,TNR,PPV,NPV,PLR,NLR,PRE)

Changed

test.sh moved to .travis folder
Python 3.4 support dropped
Python 2.7 support dropped
AUTHORS.md updated
save_stat,save_csv and save_html methods Non-ASCII character bug fixed
Mixed type input vectors bug fixed
CONTRIBUTING.md updated
Example-3 updated
README.md modified
Document modified
CI attribute renamed to CI95
kappa_se_calc function renamed to kappa_SE_calc
se_calc function modified and renamed to SE_calc
CI/SE functions moved to pycm_ci.py
Minor bug in save_html method fixed

2.4 - 2019-07-31

Added

Tversky index (TI)
Area under the PR curve (AUPR)
FUNDING.yml

Changed

AUC_calc function modified
Document modified
summary parameter added to save_html,save_stat,save_csv and stat methods
sample_weight bug in numpy array format fixed
Inputs manipulation bug fixed
Test system modified
Warning system modified
alt_link parameter added to save_html method and online_help function
Compare class tests moved to compare_test.py
Warning tests moved to warning_test.py

2.3 - 2019-06-27

Added

Adjusted F-score (AGF)
Overlap coefficient (OC)
Otsuka-Ochiai coefficient (OOC)

Changed

save_stat and save_vector parameters added to save_obj method
Document modified
README.md modified
Parameters recommendation for imbalance dataset modified
Minor bug in Compare class fixed
pycm_help function modified
Benchmarks color modified

2.2 - 2019-05-30

Added

Negative likelihood ratio interpretation (NLRI)
Cramer's benchmark (SOA5)
Matthews correlation coefficient interpretation (MCCI)
Matthews's benchmark (SOA6)
F1 macro
F1 micro
Accuracy macro

Changed

Compare class score calculation modified
Parameters recommendation for multi-class dataset modified
Parameters recommendation for imbalance dataset modified
README.md modified
Document modified
Logo updated

2.1 - 2019-05-06

Added

Adjusted geometric mean (AGM)
Yule's Q (Q)
Compare class and parameters recommendation system block diagrams

Changed

Document links bug fixed
Document modified

2.0 - 2019-04-15

Added

G-Mean (GM)
Index of balanced accuracy (IBA)
Optimized precision (OP)
Pearson's C (C)
Compare class
Parameters recommendation warning
ConfusionMatrix equal method

Changed

Document modified
stat_print function bug fixed
table_print function bug fixed
Beta parameter renamed to beta (F_calc function & F_beta method)
Parameters recommendation for imbalance dataset modified
normalize parameter added to save_html method
pycm_func.py split into pycm_class_func.py and pycm_overall_func.py
vector_filter,vector_check,class_check and matrix_check functions moved to pycm_util.py
RACC_calc and RACCU_calc functions exception handler modified
Docstrings modified

1.9 - 2019-02-25

Added

Automatic/Manual (AM)
Bray-Curtis dissimilarity (BCD)
CODE_OF_CONDUCT.md
ISSUE_TEMPLATE.md
PULL_REQUEST_TEMPLATE.md
CONTRIBUTING.md
X11 color names support for save_html method
Parameters recommendation system
Warning message for high dimension matrix print
Interactive notebooks section (binder)

Changed

save_matrix and normalize parameters added to save_csv method
README.md modified
Document modified
ConfusionMatrix.__init__ optimized
Document and examples output files moved to different folders
Test system modified
relabel method bug fixed

1.8 - 2019-01-05

Added

Lift score (LS)
version_check.py

Changed

color parameter added to save_html method
Error messages modified
Document modified
Website changed to http://www.pycm.ir
Interpretation functions moved to pycm_interpret.py
Utility functions moved to pycm_util.py
Unnecessary else and elif removed
== changed to is

1.7 - 2018-12-18

Added

Gini index (GI)
Example-7
pycm_profile.py

Changed

class_name parameter added to stat,save_stat,save_csv and save_html methods
overall_param and class_param parameters empty list bug fixed
matrix_params_calc, matrix_params_from_table and vector_filter functions optimized
overall_MCC_calc, CEN_misclassification_calc and convex_combination functions optimized
Document modified

1.6 - 2018-12-06

Added

AUC value interpretation (AUCI)
Example-6
Anaconda cloud package

Changed

overall_param and class_param parameters added to stat,save_stat and save_html methods
class_param parameter added to save_csv method
_ removed from overall statistics names
README.md modified
Document modified

1.5 - 2018-11-26

Added

Relative classifier information (RCI)
Discriminator power (DP)
Youden's index (Y)
Discriminant power interpretation (DPI)
Positive likelihood ratio interpretation (PLRI)
__len__ method
relabel method
__class_stat_init__ function
__overall_stat_init__ function
matrix attribute as dict
normalized_matrix attribute as dict
normalized_table attribute as dict

Changed

README.md modified
Document modified
LR+ renamed to PLR
LR- renamed to NLR
normalized_matrix method renamed to print_normalized_matrix
matrix method renamed to print_matrix
entropy_calc fixed
cross_entropy_calc fixed
conditional_entropy_calc fixed
print_table bug for large numbers fixed
JSON key bug in save_obj fixed
transpose bug in save_obj fixed
Python 3.7 added to .travis.yml and appveyor.yml
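
For readers unfamiliar with the ratio statistics named in 1.5 (PLR, NLR and Youden's index), the following is a minimal plain-Python sketch of their textbook definitions, applied one-vs-rest to the table from the Usage example. It is not PyCM's own code: the helper names one_vs_rest_counts and ratio_stats are hypothetical, and returning None for an undefined ratio is a choice made here for illustration.

```python
# Illustrative sketch only; not PyCM's implementation.
# Table layout follows the README example: table[actual][predicted] = count.
table = {0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}


def one_vs_rest_counts(table, positive):
    """Collapse a multi-class table to (TP, FN, FP, TN) for one class."""
    classes = list(table)
    tp = table[positive][positive]
    fn = sum(table[positive][c] for c in classes if c != positive)
    fp = sum(table[c][positive] for c in classes if c != positive)
    tn = sum(
        table[a][p]
        for a in classes
        for p in classes
        if a != positive and p != positive
    )
    return tp, fn, fp, tn


def ratio_stats(tp, fn, fp, tn):
    """Return (PLR, NLR, Youden's index); None marks an undefined value."""
    if (tp + fn) == 0 or (tn + fp) == 0:
        return None, None, None
    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity
    fpr = 1 - tnr
    fnr = 1 - tpr
    plr = tpr / fpr if fpr else None  # positive likelihood ratio (formerly LR+)
    nlr = fnr / tnr if tnr else None  # negative likelihood ratio (formerly LR-)
    youden = tpr + tnr - 1            # Youden's index
    return plr, nlr, youden


counts = one_vs_rest_counts(table, positive=0)
plr, nlr, youden = ratio_stats(*counts)
print(counts, plr, nlr, youden)
```

The one-vs-rest collapse is what lets binary statistics such as these be reported per class for a multi-class matrix.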

1.4 - 2018-11-12

Added

Area under curve (AUC)
AUNU
AUNP
Class balance accuracy (CBA)
Global performance index (RR)
Overall MCC
Distance index (dInd)
Similarity index (sInd)
one_vs_all
dev-requirements.txt

Changed

README.md modified
Document modified
save_stat modified
requirements.txt modified

1.3 - 2018-10-10

Added

Confusion entropy (CEN)
Overall confusion entropy (Overall CEN)
Modified confusion entropy (MCEN)
Overall modified confusion entropy (Overall MCEN)
Information score (IS)

Changed

README.md modified

1.2 - 2018-10-01

Added

No information rate (NIR)
P-Value
sample_weight
transpose

Changed

README.md modified
Key error in some parameters fixed
OSX env added to .travis.yml

1.1 - 2018-09-08

Added

Zero-one loss
Support
online_help function

Changed

README.md modified
html_table function modified
table_print function modified
normalized_table_print function modified

1.0 - 2018-08-30

Added

Hamming loss

Changed

README.md modified

0.9.5 - 2018-07-08

Added

Obj load
Obj save
Example-4

Changed

README.md modified
Block diagram updated

0.9 - 2018-06-28

Added

Activation threshold
Example-3
Jaccard index
Overall Jaccard index

Changed

README.md modified
setup.py modified

0.8.6 - 2018-05-31

Added

Example section in document
Python 2.7 CI
JOSS paper pdf

Changed

Cite section
ConfusionMatrix docstring
round function changed to numpy.around
README.md modified

0.8.5 - 2018-05-21

Added

Example-1 (Comparison of three different classifiers)
Example-2 (How to plot via matplotlib)
JOSS paper
ConfusionMatrix docstring

Changed

Table size in HTML report
Test system
README.md modified

0.8.1 - 2018-03-22

Added

Goodman and Kruskal's lambda B
Goodman and Kruskal's lambda A
Cross entropy
Conditional entropy
Joint entropy
Reference entropy
Response entropy
Kullback-Leibler divergence
Direct ConfusionMatrix
Kappa unbiased
Kappa no prevalence
Random accuracy unbiased
pycmVectorError class
pycmMatrixError class
Mutual information
Support numpy arrays

Changed

Notebook file updated

Removed

pycmError class

0.7 - 2018-02-26

Added

Cramer's V
95% confidence interval
Chi-Squared
Phi-Squared
Chi-Squared DF
Standard error
Kappa standard error
Kappa 95% confidence interval
Cicchetti benchmark

Changed

Overall statistics color in HTML report
Parameters description link in HTML report

0.6 - 2018-02-21

Added

CSV report
Changelog
Output files
digit parameter to ConfusionMatrix object

Changed

Confusion matrix color in HTML report
Parameters description link in HTML report
Capitalize descriptions

0.5 - 2018-02-17

Added

Scott's pi
Gwet's AC1
Bennett S score
HTML report

0.4 - 2018-02-05

Added

TPR micro/macro
PPV micro/macro
Overall RACC
Error rate (ERR)
FBeta score
F0.5
F2
Fleiss benchmark
Altman benchmark
Output file (.pycm)

Changed

Class with zero item
Normalized matrix

Removed

Kappa and SOA for each class

0.3 - 2018-01-27

Added

Kappa
Random accuracy
Landis and Koch benchmark
overall_stat
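
As a rough illustration of how the Kappa and random accuracy entries in 0.3 relate, the sketch below computes Cohen's kappa from a table laid out as table[actual][predicted], using the standard definition kappa = (ACC - RACC) / (1 - RACC). It is a plain-Python sketch rather than PyCM's implementation, and the function name cohen_kappa is hypothetical. The input is the table from the Usage example.

```python
# Illustrative sketch only; not PyCM's implementation.
table = {0: {0: 3, 1: 0, 2: 0}, 1: {0: 0, 1: 1, 2: 2}, 2: {0: 2, 1: 1, 2: 3}}


def cohen_kappa(table):
    """Cohen's kappa from a table[actual][predicted] count dictionary."""
    classes = list(table)
    population = sum(table[a][p] for a in classes for p in classes)
    # Overall accuracy: share of items on the diagonal.
    accuracy = sum(table[c][c] for c in classes) / population
    # Random accuracy: chance agreement from the row and column totals.
    random_accuracy = sum(
        sum(table[c].values()) * sum(table[a][c] for a in classes)
        for c in classes
    ) / population ** 2
    # Undefined when random_accuracy equals 1; that case is not handled here.
    return (accuracy - random_accuracy) / (1 - random_accuracy)


kappa = cohen_kappa(table)
print(round(kappa, 5))
```

A benchmark such as Landis and Koch then maps the resulting value to a qualitative agreement label.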

0.2 - 2018-01-24

Added

Population
Condition positive
Condition negative
Test outcome positive
Test outcome negative
Prevalence
G-measure
Matrix method
Normalized matrix method
Params method

Changed

statistic_result renamed to class_stat
params renamed to stat

0.1 - 2018-01-22

Added

ACC
BM
DOR
F1-Score
FDR
FNR
FOR
FPR
LR+
LR-
MCC
MK
NPV
PPV
TNR
TPR
documents and README.md

Keywords

confusion-matrix python3 python machine_learning ML
