Pandas matrix confusion with plot features (matplotlib, seaborn...)
Project description
pandas_confusion
A Python Pandas implementation of confusion matrix.
WORK IN PROGRESS - Use it a your own risk
Usage
Confusion matrix
Import ConfusionMatrix
from pandas_confusion import ConfusionMatrix
Define actual values (y_actu) and predicted values (y_pred)
y_actu = ['rabbit', 'cat', 'rabbit', 'rabbit', 'cat', 'dog', 'dog', 'rabbit', 'rabbit', 'cat', 'dog', 'rabbit'] y_pred = ['cat', 'cat', 'rabbit', 'dog', 'cat', 'rabbit', 'dog', 'cat', 'rabbit', 'cat', 'rabbit', 'rabbit']
Let’s define a (non binary) confusion matrix
confusion_matrix = ConfusionMatrix(y_actu, y_pred) print("Confusion matrix:\n%s" % confusion_matrix)
You can see it
Predicted cat dog rabbit __all__ Actual cat 3 0 0 3 dog 0 1 2 3 rabbit 2 1 3 6 __all__ 5 2 5 12
Matplotlib plot of a confusion matrix
Inside a IPython notebook add this line as first cell
%matplotlib inline
You can plot confusion matrix using:
import matplotlib.pyplot as plt confusion_matrix.plot()
If you are not using inline mode, you need to use to show confusion matrix plot.
plt.show()
Matplotlib plot of a normalized confusion matrix
confusion_matrix.plot(normalized=True) plt.show()
Binary confusion matrix
Import BinaryConfusionMatrix and Backend
from pandas_confusion import BinaryConfusionMatrix, Backend
Define actual values (y_actu) and predicted values (y_pred)
y_actu = [ True, True, False, False, False, True, False, True, True, False, True, False, False, False, False, False, True, False, True, True, True, True, False, False, False, True, False, True, False, False, False, False, True, True, False, False, False, True, True, True, True, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, True, False, True, False, True, True, True, False, False, True, False, True, False, False, True, False, False, False, False, False, False, False, False, True, False, True, True, True, True, False, False, True, False, True, True, False, True, False, True, False, False, True, True, False, False, True, True, False, False, False, False, False, False, True, True, False] y_pred = [False, False, False, False, False, True, False, False, True, False, True, False, False, False, False, False, False, False, True, True, True, True, False, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, False, False, False, False, False, True, False, False, False, False, True, False, False, False, False, False, False, False, False, True, False, False, True, False, False, False, False, True, False, True, True, False, False, False, True, False, False, True, True, False, False, True, True, False, False, False, False, False, False, True, False, False]
Let’s define a binary confusion matrix
binary_confusion_matrix = BinaryConfusionMatrix(y_actu, y_pred) print("Binary confusion matrix:\n%s" % binary_confusion_matrix)
It display as a nicely labeled Pandas DataFrame
Binary confusion matrix: Predicted False True __all__ Actual False 67 0 67 True 21 24 45 __all__ 88 24 112
You can get useful attributes such as True Positive (TP), True Negative (TN) …
print binary_confusion_matrix.TP
Matplotlib plot of a binary confusion matrix
binary_confusion_matrix.plot() plt.show()
Matplotlib plot of a normalized binary confusion matrix
binary_confusion_matrix.plot(normalized=True) plt.show()
Seaborn plot of a binary confusion matrix (ToDo)
from pandas_confusion import Backend binary_confusion_matrix.plot(backend=Backend.Seaborn)
Confusion matrix and class statistics
Overall statistics and class statistics of confusion matrix can be easily displayed.
y_true = [600, 200, 200, 200, 200, 200, 200, 200, 500, 500, 500, 200, 200, 200, 200, 200, 200, 200, 200, 200] y_pred = [100, 200, 200, 100, 100, 200, 200, 200, 100, 200, 500, 100, 100, 100, 100, 100, 100, 100, 500, 200] cm = ConfusionMatrix(y_true, y_pred) cm.print_stats()
You should get:
Confusion Matrix: Classes 100 200 500 600 __all__ Actual 100 0 0 0 0 0 200 9 6 1 0 16 500 1 1 1 0 3 600 1 0 0 0 1 __all__ 11 7 2 0 20 Overall Statistics: Accuracy: 0.35 95% CI: (0.1539092047845412, 0.59218853453282805) No Information Rate: ToDo P-Value [Acc > NIR]: 0.978585644357 Kappa: 0.0780141843972 Mcnemar's Test P-Value: ToDo Class Statistics: Classes 100 200 500 600 Population 20 20 20 20 Condition positive 0 16 3 1 Condition negative 20 4 17 19 Test outcome positive 11 7 2 0 Test outcome negative 9 13 18 20 TP: True Positive 0 6 1 0 TN: True Negative 9 3 16 19 FP: False Positive 11 1 1 0 FN: False Negative 0 10 2 1 TPR: Sensivity NaN 0.375 0.3333333 0 TNR=SPC: Specificity 0.45 0.75 0.9411765 1 PPV: Pos Pred Value = Precision 0 0.8571429 0.5 NaN NPV: Neg Pred Value 1 0.2307692 0.8888889 0.95 FPR: False-out 0.55 0.25 0.05882353 0 FDR: False Discovery Rate 1 0.1428571 0.5 NaN FNR: Miss Rate NaN 0.625 0.6666667 1 ACC: Accuracy 0.45 0.45 0.85 0.95 F1 score 0 0.5217391 0.4 0 MCC: Matthews correlation coefficient NaN 0.1048285 0.326732 NaN Informedness NaN 0.125 0.2745098 0 Markedness 0 0.08791209 0.3888889 NaN Prevalence 0 0.8 0.15 0.05 LR+: Positive likelihood ratio NaN 1.5 5.666667 NaN LR-: Negative likelihood ratio NaN 0.8333333 0.7083333 1 DOR: Diagnostic odds ratio NaN 1.8 8 NaN FOR: False omission rate 0 0.7692308 0.1111111 0.05
Statistics are also available as an OrderedDict using:
cm.stats()
Install
$ conda install pandas scikit-learn scipy $ pip install pandas_confusion
Development
You can help to develop this library.
Issues
You can submit issues using https://github.com/scls19fr/pandas_confusion/issues
Clone
You can clone repository to try to fix issues yourself using:
$ git clone https://github.com/scls19fr/pandas_confusion.git
Run unit tests
Run all unit tests
$ nosetests -s -v
Run a given test
$ nosetests -s -v tests/test_pandas_confusion.py:test_pandas_confusion_normalized
Install development version
$ python setup.py install
or
$ sudo pip install git+git://github.com/scls19fr/pandas_confusion.git
Collaborating
Fork repository
Create a branch which fix a given issue
Submit pull requests
Done
Continuous integration (Travis)
Convert a confusion matrix to a binary confusion matrix
Python package
Unit tests (nose)
Fix missing column and missing row
Overall statistics: Accuracy, 95% CI, P-Value [Acc > NIR], Kappa
Project details
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pandas_confusion-0.0.6.tar.gz
.
File metadata
- Download URL: pandas_confusion-0.0.6.tar.gz
- Upload date:
- Size: 13.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 676e7b4f7e77d014352a21026ec8d1f42f397be9313e5f76fd62b87e2f509b83 |
|
MD5 | b1879f10935b80eb1e4c03a07ed5547e |
|
BLAKE2b-256 | 6c0a663e01667aa7959df15e64eab39a0b3608575de349c678894f0416530e7b |
File details
Details for the file pandas_confusion-0.0.6-py2.py3-none-any.whl
.
File metadata
- Download URL: pandas_confusion-0.0.6-py2.py3-none-any.whl
- Upload date:
- Size: 16.9 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5b5e03191fc7207a74175436f85b8402480be6b7b627429a034767e5ba9f5206 |
|
MD5 | be7ab7c1b7e2e0aa0a9b81edeb5e3239 |
|
BLAKE2b-256 | 008283c2d82e76280d4f27e2bc56cd27e2fce759f77fb505ecf681a580965ba4 |