Playing around with the digits dataset, one of the standard datasets bundled with scikit-learn. Each sample is an 8x8 grayscale image of a handwritten digit, flattened into 64 features in digits.data, with the true label stored in digits.target.
In [4]:
from sklearn import datasets
digits = datasets.load_digits()
print(digits.data)
In [5]:
digits.target
Out[5]:
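Printing the raw arrays does not say much about their layout, so here is a minimal sketch that inspects the shapes instead. It only uses attributes of the object returned by load_digits; the variable names n_samples, n_features and row_matches_image are chosen here for illustration.

```python
from sklearn import datasets

# Load the bundled handwritten-digits dataset.
digits = datasets.load_digits()

# digits.data holds one flattened image per row.
n_samples, n_features = digits.data.shape
print(n_samples, n_features)      # 1797 64

# digits.images holds the same pixels as 8x8 arrays.
print(digits.images.shape)        # (1797, 8, 8)

# Check that a row of digits.data is the matching image, flattened.
row_matches_image = bool((digits.data[0] == digits.images[0].reshape(-1)).all())
print(row_matches_image)

# The labels are the ten digit classes.
print(sorted(set(digits.target.tolist())))
```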
Try a support vector classifier (svm.SVC) as the estimator, fitting it on every sample except the last one.
In [7]:
from sklearn import svm
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(digits.data[:-1], digits.target[:-1])
Out[7]:
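Predict the digit in the last image, which was left out of training. The expected answer is 8.
In [12]:
# predict expects a 2D array (n_samples, n_features), so keep the last row as a one-row slice
clf.predict(digits.data[-1:])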
Out[12]:
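To make the prediction step easier to check, the following sketch refits the same classifier and compares its answer with the stored label of the held-out sample. The names last_sample, predicted and true_label are illustrative and not part of the original notebook.

```python
from sklearn import datasets, svm

digits = datasets.load_digits()

# Same estimator and training split as above: everything except the last sample.
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(digits.data[:-1], digits.target[:-1])

# Slicing with [-1:] keeps a 2D array of shape (1, 64), which predict requires.
last_sample = digits.data[-1:]
predicted = clf.predict(last_sample)   # array with one predicted label
true_label = digits.target[-1]         # label stored in the dataset

print("predicted:", predicted[0], "true:", true_label)
```

Passing digits.data[-1] directly gives a 1D array of 64 values, which recent scikit-learn releases reject; the one-row slice avoids that.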
Save the model with pickle, Python’s built-in persistence module. pickle.dumps serializes the fitted classifier to a bytes string and pickle.loads rebuilds it.
In [13]:
import pickle
s = pickle.dumps(clf)
clf2 = pickle.loads(s)
clf2.predict(digits.data[-1:])
Out[13]:
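One way to gain some confidence in the round trip is to check that the restored classifier predicts the same labels as the original. This is a sketch under the same setup as above; restored and same are names introduced here.

```python
import pickle

import numpy as np
from sklearn import datasets, svm

digits = datasets.load_digits()
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(digits.data[:-1], digits.target[:-1])

# Serialize to bytes in memory, then rebuild an independent copy.
restored = pickle.loads(pickle.dumps(clf))

# Both objects should give identical predictions on every sample.
same = np.array_equal(clf.predict(digits.data), restored.predict(digits.data))
print(same)
```

Pickled models should only be loaded from trusted sources, and they are not guaranteed to load under a different scikit-learn version than the one that produced them.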
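joblib can be used as a replacement for pickle (joblib.dump & joblib.load). Older scikit-learn releases bundled it as sklearn.externals.joblib; that alias has since been removed, so import the standalone joblib package instead. As used here it writes to a file on disk rather than returning a string, and it is more efficient for objects that carry large NumPy arrays, as fitted scikit-learn estimators often do.
In [14]:
import joblib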
joblib.dump(clf, 'clfdump.pkl')
Out[14]:
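The dump above is only half of the round trip. The sketch below loads the file back with joblib.load and compares predictions. It writes into a temporary directory so nothing is left behind; that choice, and the names tmp_dir, path and clf3, are assumptions made for this example.

```python
import os
import tempfile

import joblib
from sklearn import datasets, svm

digits = datasets.load_digits()
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(digits.data[:-1], digits.target[:-1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'clfdump.pkl')
    joblib.dump(clf, path)     # write the fitted model to disk
    clf3 = joblib.load(path)   # read it back as a new object

# The reloaded model should agree with the original on the held-out sample.
reloaded_prediction = clf3.predict(digits.data[-1:])
original_prediction = clf.predict(digits.data[-1:])
print(reloaded_prediction, original_prediction)
```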