Try:
from sklearn import preprocessing
y = preprocessing.label_binarize(y, classes = [0, 1, 2, 3])
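To see what this call produces, here is a small sketch on made-up labels (the label values are illustrative and not taken from the question). label_binarize returns one indicator column per entry in classes:

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Hypothetical labels for illustration; the real y comes from your dataset.
y = np.array([0, 2, 1, 3])

# Each row gets a 1 in the column of its class and 0 elsewhere.
y_bin = label_binarize(y, classes=[0, 1, 2, 3])

print(y_bin.shape)  # (4, 4)
print(y_bin)
```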
As has been pointed out, you must first binarize y
from sklearn.preprocessing import label_binarize
y = label_binarize(y, classes = [0, 1, 2, 3])
and then use a multiclass learning algorithm like OneVsRestClassifier or OneVsOneClassifier. For example:
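from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import LinearSVC
clf_SVM = OneVsRestClassifier(LinearSVC())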
params = {
'estimator__C': [0.5, 1.0, 1.5],
'estimator__tol': [1e-3, 1e-4, 1e-5],
}
gs = GridSearchCV(clf_SVM, params, cv = 5, scoring = 'roc_auc')
gs.fit(corpus1, y)
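The snippet above depends on corpus1 and y, which are not shown. As a rough, self-contained sketch of the same idea, the version below swaps in synthetic data from make_classification (an assumption, as are the names X, y_raw and y_bin) and a smaller parameter grid so that it runs on its own:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.svm import LinearSVC

# Synthetic 4-class data standing in for corpus1 / y (an assumption).
X, y_raw = make_classification(
    n_samples=400, n_features=20, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# One indicator column per class, so y_bin has shape (400, 4).
y_bin = label_binarize(y_raw, classes=[0, 1, 2, 3])

# One-vs-rest fits one LinearSVC per indicator column.
clf = OneVsRestClassifier(LinearSVC())
params = {"estimator__C": [0.5, 1.0, 1.5]}

# With an indicator matrix as y, 'roc_auc' is averaged over the columns.
gs = GridSearchCV(clf, params, cv=5, scoring="roc_auc")
gs.fit(X, y_bin)

print(gs.best_params_, round(gs.best_score_, 3))
```

The estimator__ prefix in the grid keys is how GridSearchCV reaches the parameters of the LinearSVC wrapped inside OneVsRestClassifier.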
Suggestion : 2
May 23, 2022 | Tags: python, roc, scikit-learn, spyder
This is my code:
import time
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import confusion_matrix, zero_one_loss
from sklearn.metrics import classification_report, matthews_corrcoef, accuracy_score
from sklearn.metrics import roc_auc_score, roc_curve, auc
dtc = DecisionTreeClassifier()
bc = BaggingClassifier(base_estimator = dtc, n_estimators = 10, random_state = 17)
bc.fit(train_x, train_Y)
pred_y = bc.predict(test_x)
fprate, tprate, thresholds = roc_curve(test_Y, pred_y)
results = confusion_matrix(test_Y, pred_y)
error = zero_one_loss(test_Y, pred_y)
roc_auc_score(test_Y, pred_y)
FP = results.sum(axis = 0) - np.diag(results)
FN = results.sum(axis = 1) - np.diag(results)
TP = np.diag(results)
TN = results.sum() - (FP + FN + TP)
print('\n Time Processing: \n', time.process_time())
print('\n Confusion Matrix: \n', results)
print('\n Zero-one classification loss: \n', error)
print('\n True Positive: \n', TP)
print('\n True Negative: \n', TN)
print('\n False Positive: \n', FP)
print('\n False Negative: \n', FN)
print('\n The Classification report:\n', classification_report(test_Y, pred_y, digits = 6))
print('MCC:', matthews_corrcoef(test_Y, pred_y))
print('Accuracy:', accuracy_score(test_Y, pred_y))
print(auc(fprate, tprate))
print('ROC Score:', roc_auc_score(test_Y, pred_y))
Are your label classes (y) either 1 or 0? If not, I think you have to add the pos_label parameter to your roc_curve call.
fprate, tprate, thresholds = roc_curve(test_Y, pred_y, pos_label = 'your_label')
Or:
test_Y = your_test_y_array # these are either 1's or 0's
fprate, tprate, thresholds = roc_curve(test_Y, pred_y)
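As a minimal illustration of pos_label, the sketch below uses toy arrays (y_true and y_score are made-up values, not the asker's data) where the two labels are 1 and 2 instead of 0 and 1. It also passes continuous scores rather than hard predictions, since roc_curve builds the curve by sweeping a threshold over the scores:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Toy data (illustrative only): labels are 1 and 2 rather than 0 and 1.
y_true = np.array([1, 1, 2, 2])
# Continuous scores for the positive class, e.g. a column of predict_proba.
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# pos_label tells roc_curve which label counts as the positive class.
fprate, tprate, thresholds = roc_curve(y_true, y_score, pos_label=2)
roc_auc = auc(fprate, tprate)

print(roc_auc)  # 0.75
```

Note that pos_label only helps when there are exactly two classes; with more than two, the labels need to be binarized or scored one class at a time.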
Suggestion : 3
How to fix ValueError: multiclass format is not supported
from sklearn import preprocessing
y = preprocessing.label_binarize(y, classes = [0, 1, 2, 3])
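For the bagging classifier in the question, one possible way to apply this is sketched below. It is not the asker's actual pipeline: the data is synthetic (make_classification), the split is an assumption, and BaggingClassifier is left on its default base estimator, which is a decision tree. The idea is to binarize the true labels, take class probabilities from predict_proba, and compute one ROC curve per class. roc_auc_score with multi_class="ovr" gives the macro average of the same per-class values.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

# Synthetic 4-class data standing in for the asker's dataset (an assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=4, n_clusters_per_class=1, random_state=17,
)
train_x, test_x, train_Y, test_Y = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=17,
)

# Default base estimator is a decision tree, as in the question.
bc = BaggingClassifier(n_estimators=10, random_state=17)
bc.fit(train_x, train_Y)

# Class probabilities, one column per class, instead of hard predictions.
proba = bc.predict_proba(test_x)

classes = [0, 1, 2, 3]
test_Y_bin = label_binarize(test_Y, classes=classes)

# One binary ROC curve per class: that class versus the rest.
per_class_auc = {}
for i, label in enumerate(classes):
    fprate, tprate, _ = roc_curve(test_Y_bin[:, i], proba[:, i])
    per_class_auc[label] = auc(fprate, tprate)

# Macro-averaged one-vs-rest AUC computed directly on the multiclass labels.
macro_auc = roc_auc_score(test_Y, proba, multi_class="ovr")

print(per_class_auc)
print("Macro OvR AUC:", macro_auc)
```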