Cross-Validation


In [1]:
from sklearn.datasets import load_digits

In [2]:
digits = load_digits()
X = digits.data
y = digits.target
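
As a quick sanity check (not part of the original session), the digits data has 1797 samples with 64 features each (8x8 pixel images flattened), and the targets are the digit classes 0-9:

# confirm the dataset dimensions
print(X.shape)   # (1797, 64): 1797 images, 64 pixel features each
print(y.shape)   # (1797,): one digit label (0-9) per image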

In [3]:
from sklearn.cross_validation import cross_val_score
from sklearn.svm import LinearSVC

In [6]:
cross_val_score(LinearSVC(), X, y)


Out[6]:
array([ 0.88538206,  0.93989983,  0.88758389])
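
Without a cv argument, cross_val_score defaults to three-fold cross-validation (stratified by class for classifiers), which is why three scores come back. A minimal sketch of summarizing them, assuming numpy is imported as np:

import numpy as np

scores = cross_val_score(LinearSVC(), X, y)
# report mean accuracy and its spread across the three folds
print("accuracy: %0.3f +/- %0.3f" % (np.mean(scores), np.std(scores)))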

In [7]:
cross_val_score(LinearSVC(), X, y, cv=5, scoring="f1_macro")


Out[7]:
array([ 0.91827605,  0.87967486,  0.93430165,  0.95196865,  0.87637815])
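
The scoring argument accepts either a named scorer (like "f1_macro" above) or a callable built with make_scorer. A hedged sketch of the equivalent explicit scorer, assuming sklearn.metrics.f1_score:

from sklearn.metrics import make_scorer, f1_score

# "f1_macro" is shorthand for a macro-averaged F1 scorer;
# make_scorer builds the same scorer explicitly
macro_f1 = make_scorer(f1_score, average="macro")
cross_val_score(LinearSVC(), X, y, cv=5, scoring=macro_f1)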

Let's switch to a binary task for a moment (even vs. odd digits).


In [8]:
y % 2


Out[8]:
array([0, 1, 0, ..., 0, 1, 0])

In [9]:
cross_val_score(LinearSVC(), X, y % 2, scoring="average_precision")


Out[9]:
array([ 0.97282063,  0.96601434,  0.95505131])

In [10]:
cross_val_score(LinearSVC(), X, y % 2, scoring="roc_auc")


Out[10]:
array([ 0.96991995,  0.96051018,  0.95219301])
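
Both average_precision and roc_auc are ranking metrics: they are computed from the continuous output of decision_function (or predict_proba), not from the hard 0/1 predictions. A rough sketch of what happens on a single split, assuming train_test_split from the same module:

from sklearn.cross_validation import train_test_split
from sklearn.metrics import average_precision_score, roc_auc_score

X_train, X_test, y_train, y_test = train_test_split(X, y % 2, random_state=0)
svm = LinearSVC().fit(X_train, y_train)
# ranking metrics score the continuous decision values, not the class predictions
scores = svm.decision_function(X_test)
print(average_precision_score(y_test, scores))
print(roc_auc_score(y_test, scores))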

There are other ways to generate the cross-validation splits. ShuffleSplit, for example, draws a new random train/test partition for each iteration:


In [11]:
from sklearn.cross_validation import ShuffleSplit
# 10 random splits, each holding out 40% of the data for testing
shuffle_split = ShuffleSplit(len(X), 10, test_size=.4)
cross_val_score(LinearSVC(), X, y, cv=shuffle_split)


Out[11]:
array([ 0.94297636,  0.92489569,  0.94436718,  0.94297636,  0.93602225,
        0.94436718,  0.94158554,  0.92350487,  0.94158554,  0.92350487])
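
ShuffleSplit is just one of several cross-validation generators; KFold and StratifiedKFold from the same module can be passed to cv in the same way (in scikit-learn 0.18 and later these all moved to sklearn.model_selection with a slightly different interface). A hedged sketch using the older sklearn.cross_validation signatures:

from sklearn.cross_validation import KFold, StratifiedKFold

# 5 consecutive folds without shuffling
kfold = KFold(len(X), n_folds=5)
print(cross_val_score(LinearSVC(), X, y, cv=kfold))

# 5 folds that preserve the class proportions in each fold
stratified = StratifiedKFold(y, n_folds=5)
print(cross_val_score(LinearSVC(), X, y, cv=stratified))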
