You can then capture this data in Python by creating a pandas DataFrame with the following code:
import pandas as pd
data = {
    'y_Actual':    [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
    'y_Predicted': [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
}
df = pd.DataFrame(data, columns = ['y_Actual', 'y_Predicted'])
print(df)
To create the Confusion Matrix using pandas, apply pd.crosstab() as follows:
confusion_matrix = pd.crosstab(df['y_Actual'], df['y_Predicted'], rownames = ['Actual'], colnames = ['Predicted'])
print(confusion_matrix)
And here is the full Python code to create the Confusion Matrix:
import pandas as pd
data = {
    'y_Actual':    [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
    'y_Predicted': [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
}
df = pd.DataFrame(data, columns = ['y_Actual', 'y_Predicted'])
confusion_matrix = pd.crosstab(df['y_Actual'], df['y_Predicted'], rownames = ['Actual'], colnames = ['Predicted'])
print(confusion_matrix)
To display the Confusion Matrix as a heatmap with seaborn and matplotlib, your Python code would look like this (setting margins = True adds an 'All' row and column with the totals):
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt
data = {
    'y_Actual':    [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
    'y_Predicted': [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
}
df = pd.DataFrame(data, columns = ['y_Actual', 'y_Predicted'])
confusion_matrix = pd.crosstab(df['y_Actual'], df['y_Predicted'], rownames = ['Actual'], colnames = ['Predicted'], margins = True)
sn.heatmap(confusion_matrix, annot = True)
plt.show()
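If you would rather display rates than raw counts, pd.crosstab also accepts a normalize argument. The following variation is a sketch (not part of the original example) that continues from the df defined above and normalizes each row of actual values so it sums to 1:
# Normalize over the rows (actual values) so each row sums to 1
confusion_matrix_pct = pd.crosstab(df['y_Actual'], df['y_Predicted'],
                                   rownames = ['Actual'], colnames = ['Predicted'],
                                   normalize = 'index')
sn.heatmap(confusion_matrix_pct, annot = True, fmt = '.2f')
plt.show()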
You may print additional stats (such as the accuracy) using the pandas_ml package in Python. You can install the pandas_ml package using pip:
pip install pandas_ml
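As a sketch of how this could look (pandas_ml is an older package and may require an older pandas version to import, so treat that as an assumption about your environment), you would pass the actual and predicted columns to its ConfusionMatrix class and call print_stats() to get the accuracy along with other derived statistics:
import pandas as pd
from pandas_ml import ConfusionMatrix

data = {
    'y_Actual':    [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
    'y_Predicted': [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
}
df = pd.DataFrame(data)

# Build the confusion matrix and print accuracy and other derived statistics
confusion_matrix = ConfusionMatrix(df['y_Actual'], df['y_Predicted'])
confusion_matrix.print_stats()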
A confusion matrix is a tabular summary of the number of correct and incorrect predictions made by a classifier. It can be used to evaluate the performance of a classification model through the calculation of performance metrics like accuracy, precision, recall, and F1-score.

metrics.classification_report() takes in the list of actual labels, the list of predicted labels, and an optional argument to specify the order of the labels. It calculates performance metrics like precision, recall, and support.

metrics.confusion_matrix() takes in the list of actual labels, the list of predicted labels, and an optional argument to specify the order of the labels. It calculates the confusion matrix for the given inputs.

The following code snippet shows how to create a confusion matrix and calculate some important metrics using a Python library called scikit-learn (also known as sklearn):
# Importing the dependencies
from sklearn import metrics

# Predicted values
y_pred = ["a", "b", "c", "a", "b"]
# Actual values
y_act = ["a", "b", "c", "c", "a"]

# Printing the confusion matrix
# The columns will show the instances predicted for each label,
# and the rows will show the actual number of instances for each label.
print(metrics.confusion_matrix(y_act, y_pred, labels = ["a", "b", "c"]))

# Printing the precision and recall, among other metrics
print(metrics.classification_report(y_act, y_pred, labels = ["a", "b", "c"]))
Create a DataFrame
import pandas as pd
data = {
    'prediction': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
    'actual':     ['a', 'a', 'b', 'b', 'b', 'b', 'b', 'c', 'c']
}
df = pd.DataFrame(data)
print(df)
  prediction actual
0          a      a
1          a      a
2          a      b
3          b      b
4          b      b
5          b      b
6          c      b
7          c      c
8          c      c
Create a confusion table
contingency_matrix = pd.crosstab(df['prediction'], df['actual'])
print(contingency_matrix)
actual      a  b  c
prediction
a           2  1  0
b           0  3  0
c           0  1  2
Plot the confusion table
import matplotlib.pyplot as plt
import seaborn as sn

plt.clf()
# Create a figure and give the heatmap a square aspect ratio
fig = plt.figure()
ax = fig.add_subplot(111)
ax.set_aspect(1)
res = sn.heatmap(contingency_matrix.T, annot = True, fmt = '.2f', cmap = "YlGnBu", cbar = False)
plt.savefig("crosstab_pandas.png", bbox_inches = 'tight', dpi = 100)
plt.show()
For a larger, multi-class example, consider the following 10x10 confusion matrix stored as a NumPy array:
import numpy as np
cm = np.array(
[
[5825, 1, 49, 23, 7, 46, 30, 12, 21, 26],
[1, 6654, 48, 25, 10, 32, 19, 62, 111, 10],
[2, 20, 5561, 69, 13, 10, 2, 45, 18, 2],
[6, 26, 99, 5786, 5, 111, 1, 41, 110, 79],
[4, 10, 43, 6, 5533, 32, 11, 53, 34, 79],
[3, 1, 2, 56, 0, 4954, 23, 0, 12, 5],
[31, 4, 42, 22, 45, 103, 5806, 3, 34, 3],
[0, 4, 30, 29, 5, 6, 0, 5817, 2, 28],
[35, 6, 63, 58, 8, 59, 26, 13, 5394, 24],
[16, 16, 21, 57, 216, 68, 0, 219, 115, 5693]
])
def precision(label, confusion_matrix):
    # Precision for a label: the diagonal entry divided by the sum of its column
    col = confusion_matrix[:, label]
    return confusion_matrix[label, label] / col.sum()

def recall(label, confusion_matrix):
    # Recall for a label: the diagonal entry divided by the sum of its row
    row = confusion_matrix[label, :]
    return confusion_matrix[label, label] / row.sum()

def precision_macro_average(confusion_matrix):
    # Macro-averaged precision: the unweighted mean of the per-label precisions
    rows, columns = confusion_matrix.shape
    sum_of_precisions = 0
    for label in range(rows):
        sum_of_precisions += precision(label, confusion_matrix)
    return sum_of_precisions / rows

def recall_macro_average(confusion_matrix):
    # Macro-averaged recall: the unweighted mean of the per-label recalls
    rows, columns = confusion_matrix.shape
    sum_of_recalls = 0
    for label in range(columns):
        sum_of_recalls += recall(label, confusion_matrix)
    return sum_of_recalls / columns

print("label precision recall")
for label in range(10):
    print(f"{label:5d} {precision(label, cm):9.3f} {recall(label, cm):6.3f}")
label precision recall
    0     0.983  0.964
    1     0.987  0.954
    2     0.933  0.968
    3     0.944  0.924
    4     0.947  0.953
    5     0.914  0.980
    6     0.981  0.953
    7     0.928  0.982
    8     0.922  0.949
    9     0.957  0.887
print("precision total:", precision_macro_average(cm))
print("recall total:", recall_macro_average(cm))
precision total: 0.9496885564052286
recall total: 0.9514531547877969
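As a complementary sketch (not part of the original example), the overall accuracy and a macro-averaged F1-score can be derived from the same matrix: accuracy is the sum of the diagonal divided by the total number of samples, and one common definition of the macro F1 is the unweighted mean of the per-label F1 scores:
# Overall accuracy: correctly classified samples divided by all samples
accuracy = np.trace(cm) / cm.sum()

# Macro F1: the unweighted mean of the per-label F1 scores,
# where each F1 is the harmonic mean of that label's precision and recall
f1_scores = [
    2 * precision(label, cm) * recall(label, cm)
    / (precision(label, cm) + recall(label, cm))
    for label in range(cm.shape[0])
]
f1_macro = sum(f1_scores) / len(f1_scores)

print("accuracy:", accuracy)
print("macro F1:", f1_macro)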