Because your y_train has shape (301, 1) and not (301,), NumPy broadcasts the comparison, so
(y_train == model.predict(X_train)).shape == (301, 301)
which is not what you intended. The correct version of your code would be
np.mean(y_train.ravel() == model.predict(X_train))
which will give the same result as
model.score(X_train, y_train)
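To see the effect on a small scale, here is a minimal sketch with made-up arrays. The values of y_train and y_pred below are hypothetical stand-ins for your labels and for the output of model.predict(X_train); only the shapes matter.

```python
import numpy as np

# Hypothetical stand-ins: labels stored as a column vector,
# predictions as a flat array (as returned by model.predict).
y_train = np.array([[0], [1], [1], [0]])   # shape (4, 1)
y_pred = np.array([0, 1, 0, 0])            # shape (4,)

# (4, 1) == (4,) broadcasts to a (4, 4) matrix of pairwise comparisons.
pairwise = (y_train == y_pred)
print(pairwise.shape)

# Flattening the labels first gives the intended elementwise comparison.
elementwise = (y_train.ravel() == y_pred)
print(elementwise.shape)

# The mean of the pairwise matrix is not the accuracy.
wrong_accuracy = np.mean(pairwise)
accuracy = np.mean(elementwise)
print(wrong_accuracy, accuracy)
```

The mean over the pairwise matrix counts every label against every prediction, which is why it differs from the accuracy you expected.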
model.predict(): given a trained model, predict the label of a new set of data. This method accepts one argument, the new data X_new (e.g. model.predict(X_new)), and returns the learned label for each object in the array.

model.transform(): given an unsupervised model, transform new data into the new basis. This also accepts one argument X_new, and returns the new representation of the data based on the unsupervised model.

model.fit(): fit training data. For supervised learning applications, this accepts two arguments: the data X and the labels y (e.g. model.fit(X, y)). For unsupervised learning applications, this accepts only a single argument, the data X (e.g. model.fit(X)).

Learning the parameters of a prediction function and testing it on the same data is a methodological mistake: a model that would just repeat the labels of the samples that it has just seen would have a perfect score but would fail to predict anything useful on yet-unseen data.
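The methods described above can be sketched together in one short script. This is only an illustration under assumptions: the choice of SVC and PCA as the supervised and unsupervised estimators, the 25% held-out split, and the random_state value are mine, not something prescribed by the text.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hold out part of the samples so the score is measured on unseen data
# (split size and random_state are arbitrary choices for this sketch).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Supervised: fit(X, y) takes data and labels, predict(X_new) returns labels.
clf = SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
test_accuracy = clf.score(X_test, y_test)

# Unsupervised: fit(X) takes only the data, transform(X_new) re-expresses it.
pca = PCA(n_components=2)
pca.fit(X_train)
X_test_2d = pca.transform(X_test)

print(y_pred.shape, X_test_2d.shape, test_accuracy)
```

Scoring on X_test rather than X_train is what avoids the methodological mistake of evaluating a model on the samples it has already seen.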
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> print(iris.data.shape)
(150, 4)
>>> n_samples, n_features = iris.data.shape
>>> print(n_samples)
150
>>> print(n_features)
4
>>> print(iris.data[0])
[5.1 3.5 1.4 0.2]
>>> print(iris.target.shape)
(150,)
>>> print(iris.target)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
>>> print(iris.target_names)
['setosa' 'versicolor' 'virginica']
>>> from sklearn.linear_model import LinearRegression
>>> model = LinearRegression(normalize=True)
>>> print(model.normalize)
True
>>> print(model)
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=True)
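The transcript above comes from an older scikit-learn release; the normalize parameter was removed in version 1.2, so that call fails on current installs. The same idea (hyperparameters passed at instantiation are stored as attributes on the estimator) can be sketched with fit_intercept instead. The choice of fit_intercept=False here is only for illustration.

```python
from sklearn.linear_model import LinearRegression

# Hyperparameters are set when the estimator is instantiated...
model = LinearRegression(fit_intercept=False)

# ...and are stored as attributes of the same name.
print(model.fit_intercept)

# get_params() returns all hyperparameters as a dictionary.
params = model.get_params()
print(params["fit_intercept"])
```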