Get some data to play with
In [ ]:
from sklearn.datasets import load_digits
digits = load_digits()
digits.keys()
In [ ]:
digits.images.shape
In [ ]:
print(digits.images[0])
In [ ]:
import matplotlib.pyplot as plt
%matplotlib notebook
plt.matshow(digits.images[0], cmap=plt.cm.Greys)
In [ ]:
digits.data.shape
In [ ]:
digits.target.shape
In [ ]:
digits.target
Data is always a NumPy array (or SciPy sparse matrix) of shape (n_samples, n_features)
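To make the (n_samples, n_features) layout concrete, the sketch below checks that each row of digits.data is simply the matching 8x8 image flattened into 64 features. The variable names n_samples and flattened are illustrative and not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_digits

digits = load_digits()

# Flatten every 8x8 image into a single row of 64 pixel features.
n_samples = digits.images.shape[0]
flattened = digits.images.reshape(n_samples, -1)

# The flattened images should match the feature matrix row for row.
print(digits.data.shape)
print(np.array_equal(flattened, digits.data))
```

Estimators expect the two-dimensional digits.data form, while digits.images is only convenient for plotting.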
Split the data to get going
In [ ]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(digits.data,
                                                    digits.target)
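With its default settings, train_test_split shuffles the samples and holds out a quarter of them for testing, so the split changes on every run. A minimal sketch of a reproducible split follows; random_state=0 is an arbitrary choice made here for illustration and does not come from the original notebook.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

digits = load_digits()

# Fixing random_state makes the shuffle, and therefore the split, repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

# Training and test sets together cover every sample exactly once.
print(X_train.shape, X_test.shape)
print(y_train.shape, y_test.shape)
```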
In [ ]:
from sklearn.svm import LinearSVC
1) Instantiate an object and set the parameters
In [ ]:
svm = LinearSVC(C=0.1)
2) Fit the model
In [ ]:
svm.fit(X_train, y_train)
3) Apply / evaluate
In [ ]:
print(svm.predict(X_train))
print(y_train)
In [ ]:
svm.score(X_train, y_train)
In [ ]:
svm.score(X_test, y_test)
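For classifiers, score reports mean accuracy, that is, the fraction of samples whose predicted label matches the true one. The sketch below recomputes that number by hand from predict as a sanity check. It rebuilds the split with an assumed random_state=0 so that it runs on its own; manual_accuracy is an illustrative name.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

svm = LinearSVC(C=0.1)
svm.fit(X_train, y_train)

# Accuracy by hand: the share of test samples predicted correctly.
manual_accuracy = np.mean(svm.predict(X_test) == y_test)

# Both numbers should agree.
print(manual_accuracy, svm.score(X_test, y_test))
```

Comparing the training score with the test score, as the two cells above do, gives a rough indication of how much the model overfits.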
In [ ]:
from sklearn.ensemble import RandomForestClassifier
In [ ]:
rf = RandomForestClassifier(n_estimators=50)
In [ ]:
rf.fit(X_train, y_train)
In [ ]:
rf.score(X_test, y_test)
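Because both estimators follow the same instantiate, fit, score pattern, they can be swapped in a loop without changing any other code. The following is a sketch under assumptions: the models dictionary, the scores dictionary, and random_state=0 are introduced here for illustration and do not appear in the original notebook.

```python
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=0)

# Hypothetical container: any estimator with fit and score would fit here.
models = {
    "LinearSVC": LinearSVC(C=0.1),
    "RandomForest": RandomForestClassifier(n_estimators=50, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                   # 2) fit the model
    scores[name] = model.score(X_test, y_test)    # 3) evaluate on held-out data
    print(name, scores[name])
```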