Introduction to Regression & Classification

Note: This notebook requires GraphLab Create 1.2 or higher.

Creating regression models is easy with GraphLab Create! The regression/classification toolkit contains several models, including (but not restricted to) linear regression, logistic regression, and gradient boosted trees. All models are built to work with millions of features and billions of examples. The models differ in how they make predictions, but they conform to the same API: as with all GraphLab Create toolkits, you call create() to build a model, predict() to make predictions with the returned model, and evaluate() to measure the quality of those predictions.

Be sure to check out our notebook on feature engineering which discusses advanced features including the use of categorical variables, dictionary features, and text data. All of these feature types are easy and intuitive to use with GraphLab Create.

Overview

In this notebook, we will go over how GraphLab Create can be used for basic tasks in regression analysis. Specifically, we will go over:

  • Preparing the Yelp dataset for modeling
  • Training, predicting with, and evaluating a linear regression model
  • Interpreting model coefficients
  • Binary and multiclass classification with logistic regression
  • Handling imbalanced datasets with class weights

We will start by importing GraphLab Create!


In [1]:
import graphlab as gl
gl.canvas.set_target('ipynb')

Data Overview

In this notebook, we will use a subset of the data from the Yelp Dataset Challenge. The task is to predict the 'star rating' a given user assigns to a restaurant. The dataset comprises three tables that cover 11,537 businesses, 8,282 check-ins, 43,873 users, and 229,907 reviews. The entire dataset, as well as details about the dataset, is available on the Yelp website.

Review Data

The review table includes information about each review. Specifically, it contains:

  • business_id: An encrypted business ID for the business being reviewed.
  • user_id: An encrypted user ID for the user who provided the review.
  • stars: A star rating (on a scale of 1-5).
  • text: The raw review text.
  • date: The date of the review, formatted like '2012-03-14'.
  • votes: The number of 'useful', 'funny', or 'cool' votes provided by other users for this review.

User Data

The user table consists of details about each user:

  • user_id: The encrypted user ID (cross referenced in the Review table)
  • name: First name
  • review_count: Total number of reviews made by the user.
  • average_stars: Average rating (on a scale of 1-5) made by the user.
  • votes: For each vote type ('useful', 'funny', 'cool'), the total number of votes across all reviews made by this user.

Business Data

The business table contains details about each business:

  • business_id: Encrypted business ID (cross referenced in the Review table)
  • name: Business name.
  • neighborhoods: Neighborhoods served by the business.
  • full_address: Address (text format)
  • city: City where the business is located.
  • state: State where the business is located.
  • latitude: Latitude of the business.
  • longitude: Longitude of the business.
  • stars: A star rating (rounded to half-stars) for this business.
  • review_count: The total number of reviews about this business.
  • categories: Category tags for this business.
  • open: Is this business still open? (True/False)

Let us take a closer look at the data.


In [2]:
business = gl.SFrame('https://static.turi.com/datasets/regression/business.csv')
user = gl.SFrame('https://static.turi.com/datasets/regression/user.csv')
review = gl.SFrame('https://static.turi.com/datasets/regression/review.csv')


PROGRESS: Downloading https://static.turi.com/datasets/regression/business.csv to /var/tmp/graphlab-roman/83612/1f7c7329-3886-46eb-9956-da43084c959b.csv
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/business.csv
PROGRESS: Parsing completed. Parsed 100 lines in 0.069744 secs.
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/business.csv
PROGRESS: Parsing completed. Parsed 11537 lines in 0.065551 secs.
PROGRESS: Downloading https://static.turi.com/datasets/regression/user.csv to /var/tmp/graphlab-roman/83612/1d0f9518-b850-481e-8a8f-cd7bf7a6b6e2.csv
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/user.csv
PROGRESS: Parsing completed. Parsed 100 lines in 0.080215 secs.
------------------------------------------------------
Inferred types from first line of file as 
column_type_hints=[str,list,str,str,float,float,str,int,int,float,str,str]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
------------------------------------------------------
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/user.csv
PROGRESS: Parsing completed. Parsed 43873 lines in 0.088443 secs.
PROGRESS: Downloading https://static.turi.com/datasets/regression/review.csv to /var/tmp/graphlab-roman/83612/5d90f57f-3390-46b1-9451-58d73f9114e6.csv
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/review.csv
PROGRESS: Parsing completed. Parsed 100 lines in 0.804394 secs.
Inferred types from first line of file as 
column_type_hints=[float,str,int,str,str,int,int,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
------------------------------------------------------
PROGRESS: Read 61212 lines. Lines per second: 49268.1
PROGRESS: Finished parsing file https://static.turi.com/datasets/regression/review.csv
PROGRESS: Parsing completed. Parsed 229907 lines in 3.22541 secs.
Inferred types from first line of file as 
column_type_hints=[str,str,str,int,str,str,str,dict,int,int,int]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------

The schema and the first few entries of the review table are shown below. For the sake of brevity, we will skip the business and user tables.


In [3]:
review.show()


Preparing the data

In this section, we will go through some basic steps to prepare the dataset for regression models.

First, we use an SFrame join operation to merge the business and review tables, using the business_id column to "match" the rows of the two tables. The output of the join is a single table with both business and review information. For clarity we rename some of the business columns to have more meaningful descriptions.


In [4]:
review_business_table = review.join(business, how='inner', on='business_id')
review_business_table = review_business_table.rename({'stars.1': 'business_avg_stars', 
                              'type.1': 'business_type',
                              'review_count': 'business_review_count'})
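Conceptually, an inner join keeps only the rows whose key appears in both tables and concatenates their columns. A minimal pure-Python sketch of that matching logic, using hypothetical miniature review and business tables, looks like this:

```python
# Hypothetical miniature versions of the review and business tables.
reviews = [
    {'business_id': 'b1', 'stars': 5},
    {'business_id': 'b2', 'stars': 3},
    {'business_id': 'b9', 'stars': 4},   # no matching business: dropped
]
businesses = [
    {'business_id': 'b1', 'business_avg_stars': 4.0},
    {'business_id': 'b2', 'business_avg_stars': 4.5},
]

# Index the right table by the join key, then match each left row against it.
by_id = {b['business_id']: b for b in businesses}
joined = [dict(r, **by_id[r['business_id']])
          for r in reviews if r['business_id'] in by_id]
```

The review for 'b9' is dropped because an inner join discards rows with no match on the other side.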

Now, join the user table to the result, using the user_id column to match rows. This gives us review, business, and user information in a single table.


In [5]:
user_business_review_table = review_business_table.join(user, how='inner', on="user_id")
user_business_review_table = user_business_review_table.rename({'name.1': 'user_name', 
                                   'type.1': 'user_type', 
                                   'average_stars': 'user_avg_stars',
                                   'review_count': 'user_review_count'})

Now we're good to go! Let's take a look at what the final dataset looks like:


In [6]:
user_business_review_table.head(5)


Out[6]:
+------------------------+------------+------------------------+-------+-----------------------------------------------------+--------+
| business_id            | date       | review_id              | stars | text                                                | type   |
+------------------------+------------+------------------------+-------+-----------------------------------------------------+--------+
| 9yKzy9PApeiPPOUJEtnvkg | 2011-01-26 | fWKvX83p0-ka4JS3dc6E5A |   5   | My wife took me here on my birthday for break ...   | review |
| ZRJwVLyzEJq1VAihDhYiow | 2011-07-27 | IjZ33sJrzXqU-0X6U8NwyA |   5   | I have no idea why some people give bad reviews ... | review |
| 6oRAC4uyJCsJl1X0WZpVSA | 2012-06-14 | IESLBzqUCLdSzSqm0eCSxQ |   4   | love the gyro plate. Rice is so good and I also ... | review |
| _1QQZuf4zZOyFCvXc0o6Vg | 2010-05-27 | G-WvGaISbqqaMHlNnByodA |   5   | Rosie, Dakota, and I LOVE Chaparral Dog Park!!! ... | review |
| 6ozycU1RpktNG2-1BroVtw | 2012-01-05 | 1uJFq2r5QfJG_6ExMRCaGw |   5   | General Manager Scott Petello is a good egg!!! ...  | review |
+------------------------+------------+------------------------+-------+-----------------------------------------------------+--------+
+------------------------+--------------------------------------+------+-------+-----+-----------------------------------+------------+
| user_id                | votes                                | year | month | day | categories                        | city       |
+------------------------+--------------------------------------+------+-------+-----+-----------------------------------+------------+
| rLtl8ZkDX5vH5nAx9C3q5Q | {'funny': 0, 'useful': 5, 'cool': 2} | 2011 |   1   |  26 | [Breakfast & Brunch, Restaurants] | Phoenix    |
| 0a2KyEL0d3Yb1V6aivbIuQ | {'funny': 0, 'useful': 0, 'cool': 0} | 2011 |   7   |  27 | [Italian, Pizza, Restaurants]     | Phoenix    |
| 0hT2KtfLiobPvh6cDC8JQg | {'funny': 0, 'useful': 1, 'cool': 0} | 2012 |   6   |  14 | [Middle Eastern, Restaurants]     | Tempe      |
| uZetl9T0NcROGOyFfughhg | {'funny': 0, 'useful': 2, 'cool': 1} | 2010 |   5   |  27 | [Active Life, Dog Parks, Parks]   | Scottsdale |
| vYmM4KTsC8ZfQBg-j5MWkw | {'funny': 0, 'useful': 0, 'cool': 0} | 2012 |   1   |   5 | [Tires, Automotive]               | Mesa       |
+------------------------+--------------------------------------+------+-------+-----+-----------------------------------+------------+
+---------------------------------------------+----------+-----------+--------------------+------+-----------------------+--------------------+
| full_address                                | latitude | longitude | name               | open | business_review_count | business_avg_stars |
+---------------------------------------------+----------+-----------+--------------------+------+-----------------------+--------------------+
| 6106 S 32nd St\nPhoenix, AZ 85042 ...       | 33.3908  | -112.013  | Morning Glory Cafe |  1   |          116          |        4.0         |
| 4848 E Chandler Blvd\nPhoenix, AZ 85044 ... | 33.3056  | -111.979  | Spinato's Pizzeria |  1   |          102          |        4.0         |
| 1513 E Apache Blvd\nTempe, AZ 85281 ...     | 33.4143  | -111.913  | Haji-Baba          |  1   |          265          |        4.5         |
| 5401 N Hayden Rd\nScottsdale, AZ 85250 ...  | 33.5229  | -111.908  | Chaparral Dog Park |  1   |           88          |        4.5         |
| 1357 S Power Road\nMesa, AZ 85206 ...       | 33.391   | -111.684  | Discount Tire      |  1   |            5          |        4.5         |
+---------------------------------------------+----------+-----------+--------------------+------+-----------------------+--------------------+
+-------+---------------+----------------+-----------+-------------------+-----------+-------------+------------+--------------+
| state | business_type | user_avg_stars | user_name | user_review_count | user_type | votes_funny | votes_cool | votes_useful |
+-------+---------------+----------------+-----------+-------------------+-----------+-------------+------------+--------------+
| AZ    | business      | 3.72           | Jason     | 376               | user      | 331         | 322        | 1034         |
| AZ    | business      | 5.0            | Paul      | 2                 | user      | 2           | 0          | 0            |
| AZ    | business      | 4.33           | Nicole    | 3                 | user      | 0           | 0          | 3            |
| AZ    | business      | 4.29           | lindsey   | 31                | user      | 18          | 36         | 75           |
| AZ    | business      | 3.25           | Roger     | 28                | user      | 3           | 8          | 32           |
+-------+---------------+----------------+-----------+-------------------+-----------+-------------+------------+--------------+
[5 rows x 29 columns]

Training, Predicting, and Evaluating Models

It's now time to do some data science! First, let us split our data into training and testing sets, using SFrame's random_split function.


In [7]:
train_set, test_set = user_business_review_table.random_split(0.8, seed=1)

Let's start out with a simple model. The target is the star rating for each review and the features are:

  • Average rating of a given business
  • Average rating made by a user
  • Number of reviews made by a user
  • Number of reviews that concern a business

In [8]:
model = gl.linear_regression.create(train_set, target='stars', 
                                    features = ['user_avg_stars','business_avg_stars', 
                                                'user_review_count', 'business_review_count'])


PROGRESS: Linear regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 163976
PROGRESS: Number of features          : 4
PROGRESS: Number of unpacked features : 4
PROGRESS: Number of coefficients    : 5
PROGRESS: Starting Newton Method
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+--------------+--------------------+----------------------+---------------+-----------------+
PROGRESS: | Iteration | Passes   | Elapsed Time | Training-max_error | Validation-max_error | Training-rmse | Validation-rmse |
PROGRESS: +-----------+----------+--------------+--------------------+----------------------+---------------+-----------------+
PROGRESS: | 1         | 2        | 1.061012     | 3.974238           | 3.731815             | 0.971716      | 0.953531        |
PROGRESS: +-----------+----------+--------------+--------------------+----------------------+---------------+-----------------+
PROGRESS: SUCCESS: Optimal solution found.
PROGRESS:
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

Much of the summary output is self-explanatory. We will explain below what the terms 'coefficients' and 'errors' mean.

Making Predictions

GraphLab Create makes it easy to generate predictions from a trained model with the predict function, which returns an SArray containing a prediction for each example in the test dataset.


In [9]:
predictions = model.predict(test_set)
predictions.head(5)


Out[9]:
dtype: float
Rows: 5
[3.008125182034785, 4.6755074163872745, 4.607010044148817, 3.767029472049021, 4.81813909560203]

Evaluating Results

We can also evaluate our predictions by comparing them to known ratings. The results are evaluated using two metrics: root-mean-square error (RMSE), a global summary of the differences between predicted and observed values, and max-error, which measures the worst-case error of the model on a single observation. In this example, our model's predictions were about 1 star away from the true rating on average, but in a few cases they were off by almost 4 stars.


In [10]:
model.evaluate(test_set)


Out[10]:
{'max_error': 4.0190816636474285, 'rmse': 0.9710269452058631}
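Both metrics can be reproduced by hand. Here is a small pure-Python sketch on hypothetical toy data:

```python
import math

# Hand-computed versions of the two evaluation metrics, on toy data.
actual    = [5.0, 3.0, 4.0, 1.0]
predicted = [4.5, 3.2, 3.8, 4.0]

residuals = [p - a for p, a in zip(predicted, actual)]

# RMSE: square root of the mean squared residual.
rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))

# Max-error: largest absolute residual on any single observation.
max_error = max(abs(r) for r in residuals)
```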

Let's go further in analyzing how well our model performed at predicting ratings. We perform a groupby-aggregate to calculate the average predicted rating (on the test set) for each value of the actual rating (1-5). This will help us understand when the model performs well and when it does not.


In [11]:
sf = gl.SFrame()
sf['Predicted-Rating'] = predictions
sf['Actual-Rating'] = test_set['stars']
predict_count = sf.groupby('Actual-Rating', [gl.aggregate.COUNT('Actual-Rating'), gl.aggregate.AVG('Predicted-Rating')])
predict_count.topk('Actual-Rating', k=5, reverse=True)


Out[11]:
Actual-Rating Count Avg of Predicted-Rating
1 3280 2.64863703655
2 4003 3.27077094185
3 6455 3.5406504061
4 15150 3.815774915
5 14383 4.23685172744
[5 rows x 3 columns]

It looks like our model does well on ratings that were between 3 and 5 but not too well on ratings 1 and 2. One reason why this could happen is that we have a lot more reviews with 4 and 5 star ratings. In fact, the number of 4 and 5 star reviews is more than twice the number of reviews with 1-3 stars.

Interpreting Results

In addition to making predictions about new data, GraphLab's regression toolkit can provide valuable insight about the relationships between the target and feature columns in your data, revealing why your model returns the predictions that it does. Let's briefly venture into some mathematical details to explain. Linear regression models the target $Y$ as a linear combination of the feature variables $X_j$, random noise $\epsilon$, and a bias term ($\alpha_0$) (also known as the intercept or global offset):

$$Y = \alpha_0 + \sum_{j} \alpha_j X_j + \epsilon$$

The coefficients ($\alpha_j$) are what the training procedure learns. Each model coefficient describes the expected change in the target variable associated with a unit change in the feature. The bias term indicates the "inherent" or "average" target value if all feature values were set to zero.

The coefficients often tell an interesting story of how much each feature matters in predicting target values. The magnitude (absolute value) of the coefficient for each feature indicates the strength of the feature's association to the target variable, holding all other features constant. The sign on the coefficient (positive or negative) gives the direction of the association.

For a trained model, we can access the coefficients as follows. The name is the name of the feature, the index refers to a category for categorical variables, and the value is the value of the coefficient.


In [12]:
coefs = model['coefficients']
coefs


Out[12]:
name index value
(intercept) None -2.22960549033
user_avg_stars None 0.810357163135
business_avg_stars None 0.781279217743
user_review_count None 1.97346316063e-05
business_review_count None 5.31674540185e-05
[5 rows x 3 columns]

Not surprisingly, high ratings are associated with (i) users who give a lot of high ratings on average, and (ii) businesses that receive high ratings on average. More interestingly, the number of reviews submitted by a user or received by a business appears to have a very weak association with ratings.
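A linear model's prediction is simply the intercept plus the weighted sum of the features. As a sketch, we can reconstruct a prediction from the coefficient values printed above; the feature values below are hypothetical:

```python
# Coefficient values copied from the table above.
coefs = {
    '(intercept)':           -2.22960549033,
    'user_avg_stars':         0.810357163135,
    'business_avg_stars':     0.781279217743,
    'user_review_count':      1.97346316063e-05,
    'business_review_count':  5.31674540185e-05,
}

# A hypothetical example: a user averaging 3.72 stars over 376 reviews,
# reviewing a business averaging 4.0 stars over 116 reviews.
example = {'user_avg_stars': 3.72, 'business_avg_stars': 4.0,
           'user_review_count': 376, 'business_review_count': 116}

prediction = coefs['(intercept)'] + sum(
    coefs[name] * value for name, value in example.items())  # about 3.92 stars
```

Note how the tiny review-count coefficients barely move the prediction, consistent with the weak association observed above.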

Binary Classification

Logistic regression is a model that is widely used for classification tasks. In logistic regression, the probability that a binary target is True is modeled as a logistic function of the features.

First, let's construct a binary target variable. In this example, we will predict if a restaurant is good or bad, with 1 and 2 star ratings indicating a bad business and 3-5 star ratings indicating a good one.


In [13]:
user_business_review_table['is_good'] = user_business_review_table['stars'] >= 3

Next, let's create a train-test split:


In [14]:
train_set, test_set = user_business_review_table.random_split(0.8, seed=1)

We will use the same set of features that we used for the linear regression model. Note that the API is very similar to the linear regression API.


In [15]:
model = gl.logistic_classifier.create(train_set, target="is_good", 
                                      features = ['user_avg_stars','business_avg_stars', 
                                                'user_review_count', 'business_review_count'])


PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 163868
PROGRESS: Number of classes           : 2
PROGRESS: Number of feature columns   : 4
PROGRESS: Number of unpacked features : 4
PROGRESS: Number of coefficients    : 5
PROGRESS: Starting Newton Method
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 2        | 0.265869     | 0.863842          | 0.863730            |
PROGRESS: | 2         | 3        | 0.422368     | 0.866545          | 0.867620            |
PROGRESS: | 3         | 4        | 0.578074     | 0.866807          | 0.867963            |
PROGRESS: | 4         | 5        | 0.733239     | 0.867009          | 0.868192            |
PROGRESS: | 5         | 6        | 0.880150     | 0.867009          | 0.868192            |
PROGRESS: | 6         | 7        | 1.030508     | 0.867009          | 0.868192            |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

PROGRESS: SUCCESS: Optimal solution found.
PROGRESS:

Making Predictions (Probabilities, Classes, or Margins)

Logistic regression predictions can take one of three forms:

  • Classes (default): Thresholds the probability estimate at 0.5 to predict a class label, i.e. True/False.
  • Probabilities: A probability estimate (in the range [0,1]) that the example belongs to the True class.
  • Margins: Distance to the linear decision boundary learned by the model. The larger the distance, the more confident we are that the example belongs to one class or the other.

GraphLab's logistic regression model can return predictions for any of these types:


In [16]:
# Class predictions (the default)
predictions = model.predict(test_set)
predictions.head(5)


Out[16]:
dtype: int
Rows: 5
[1, 1, 1, 1, 1]

In [17]:
predictions = model.predict(test_set, output_type = "margin")
predictions.head(5)


Out[17]:
dtype: float
Rows: 5
[0.4689005631839578, 3.9376949950820297, 3.8817190122552834, 2.206682186278641, 4.370152167477832]

In [18]:
predictions = model.predict(test_set, output_type = "probability")
predictions.head(5)


Out[18]:
dtype: float
Rows: 5
[0.6151235014526506, 0.9808796206839487, 0.9798010518655392, 0.9008479705745347, 0.9875086909040025]
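Probabilities and margins are linked through the logistic (sigmoid) function: each probability is the sigmoid of the corresponding margin. We can verify this with the first margin and probability from the outputs above:

```python
import math

def sigmoid(margin):
    # The logistic function maps a margin to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-margin))

# First margin and first probability from the outputs above.
margin = 0.4689005631839578
prob   = 0.6151235014526506
```

A margin of 0 corresponds to a probability of exactly 0.5, which is why thresholding the probability at 0.5 is equivalent to taking the sign of the margin.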

Evaluating Results

We can evaluate our predictions by comparing them to known ratings. The results are evaluated using two metrics: accuracy and a confusion matrix.


In [19]:
result = model.evaluate(test_set)
print "Accuracy         : %s" % result['accuracy']
print "Confusion Matrix : \n%s" % result['confusion_matrix']


Accuracy         : 0.865036629613
Confusion Matrix : 
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      0       |        0        |  2379 |
|      0       |        1        |  4904 |
|      1       |        1        | 35052 |
|      1       |        0        |  936  |
+--------------+-----------------+-------+
[4 rows x 3 columns]
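Accuracy can be recovered directly from the confusion matrix: it is the fraction of examples for which the target and predicted labels agree. As a pure-Python sanity check using the counts printed above:

```python
# Confusion matrix counts copied from the output above, keyed by
# (target_label, predicted_label).
confusion = {
    (0, 0): 2379,   # true negatives
    (0, 1): 4904,   # false positives
    (1, 1): 35052,  # true positives
    (1, 0): 936,    # false negatives
}

correct  = sum(c for (t, p), c in confusion.items() if t == p)
total    = sum(confusion.values())
accuracy = float(correct) / total   # about 0.86504, matching the output above
```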

GraphLab Create's evaluation toolkit contains more detail on evaluation metrics for both regression and classification. You are now good to go with regression! Be sure to check out our notebook on feature engineering to learn new tricks that can help you make better classifiers and predictors!

Multiclass Classification

Logistic regression can also be used for multiclass classification. Multiclass classification allows each observation to be assigned one of many categories (for example, ratings may be 1, 2, 3, 4, or 5). In this example, we will predict the star rating of each review.


In [20]:
model = gl.logistic_classifier.create(train_set, target="stars", 
                                      features = ['user_avg_stars','business_avg_stars', 
                                                'user_review_count', 'business_review_count'])


PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 163993
PROGRESS: Number of classes           : 5
PROGRESS: Number of feature columns   : 4
PROGRESS: Number of unpacked features : 4
PROGRESS: Number of coefficients    : 20
PROGRESS: Starting Newton Method
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 2        | 0.385790     | 0.450105          | 0.453976            |
PROGRESS: | 2         | 3        | 0.636195     | 0.476514          | 0.475566            |
PROGRESS: | 3         | 4        | 0.899109     | 0.476819          | 0.476146            |
PROGRESS: | 4         | 5        | 1.163658     | 0.476874          | 0.475914            |
PROGRESS: | 5         | 6        | 1.430248     | 0.476850          | 0.475914            |
PROGRESS: | 6         | 7        | 1.688117     | 0.476850          | 0.475914            |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS:

Statistics about the training data including the number of classes, the set of classes registered in the dataset, as well as the number of examples in each class are stored in the model.


In [21]:
print "This model has %s classes" % model['num_classes']
print "The set of classes in the training set are %s" % model['classes']


This model has 5 classes
The set of classes in the training set are [1, 2, 3, 4, 5]

Top-k predictions with multiclass classification

For multiclass classification models, the top-k predictions can be of the following types:

  • Probabilities (default): A probability estimate (in the range [0,1]) that the example belongs to the predicted class.
  • Margins: A score reflecting the confidence that the example belongs to the predicted class. The larger the score, the greater the confidence.
  • Rank: The rank (from 0 to k-1) of the predicted class for the example, where rank 0 is the most likely class.

In the following example, we calculate the top-2 probabilities, margins, and ranks of predictions.


In [22]:
predictions = model.predict_topk(test_set, output_type = 'probability', k = 2)
predictions.head(5)


Out[22]:
id class probability
0 4 0.290599734464
0 3 0.245391286641
1 5 0.708112474661
1 4 0.244961580442
2 5 0.673094718076
[5 rows x 3 columns]


In [23]:
predictions = model.predict_topk(test_set, output_type = 'margin', k = 2)
predictions.head(5)


Out[23]:
id class margin
0 4 0.43957010408
0 3 0.27047729158
1 5 5.9099428457
1 4 4.84844128583
2 5 5.64011549718
[5 rows x 3 columns]


In [24]:
predictions = model.predict_topk(test_set, output_type = 'rank', k = 2)
predictions.head(5)


Out[24]:
id class rank
0 4 0
0 3 1
1 5 0
1 4 1
2 5 0
[5 rows x 3 columns]
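Conceptually, the rank output orders each example's classes by predicted probability and keeps the top k. A pure-Python sketch of that ordering, using hypothetical class probabilities for a single example:

```python
# Hypothetical per-class probabilities for one example.
probs = {1: 0.05, 2: 0.10, 3: 0.15, 4: 0.30, 5: 0.40}
k = 2

# Sort classes by descending probability; rank 0 is the most likely class.
top_k = sorted(probs, key=probs.get, reverse=True)[:k]
ranked = [(cls, rank) for rank, cls in enumerate(top_k)]
```

Here class 5 gets rank 0 and class 4 gets rank 1, mirroring the (class, rank) pairs in the output above.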

Evaluation

Similar to binary classification, we can evaluate our predictions by comparing them to known ratings; here we summarize the results with the confusion matrix.


In [25]:
result = model.evaluate(test_set)
print "Confusion Matrix : \n%s" % result['confusion_matrix']


Confusion Matrix : 
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      1       |        3        |  127  |
|      4       |        2        |   18  |
|      4       |        1        |  391  |
|      2       |        1        |  778  |
|      1       |        4        |  1374 |
|      5       |        2        |   6   |
|      1       |        1        |  1604 |
|      2       |        4        |  2620 |
|      3       |        2        |   10  |
|      5       |        3        |   44  |
+--------------+-----------------+-------+
[25 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

Imbalanced Datasets

Many difficult real-world problems have imbalanced data, where at least one class is under-represented. GraphLab Create models can improve prediction quality in some imbalanced scenarios by assigning different costs to misclassification errors for different classes.
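The exact weighting formula behind the 'auto' setting is not spelled out here, but a common convention, and the one we assume in this sketch, makes each class's weight inversely proportional to its frequency in the training data:

```python
from collections import Counter

# Hypothetical star-rating labels with a heavy skew toward 4 and 5 stars.
labels = [5] * 50 + [4] * 30 + [3] * 10 + [2] * 6 + [1] * 4

counts = Counter(labels)
n, k = len(labels), len(counts)

# Each class gets weight n / (k * count): rare classes count for more,
# and the weights average to 1 across examples.
weights = {cls: n / float(k * c) for cls, c in counts.items()}
```

Under this scheme a misclassified 1-star review costs much more than a misclassified 5-star review, which pushes the model to pay attention to the rare classes.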

Let us see the distribution of examples for each class in the dataset.


In [26]:
review['stars'].astype(str).show()



In [27]:
model = gl.logistic_classifier.create(train_set, target="stars", 
                                      features = ['user_avg_stars','business_avg_stars', 
                                                'user_review_count', 'business_review_count'], 
                                      class_weights = 'auto')


PROGRESS: Logistic regression:
PROGRESS: --------------------------------------------------------
PROGRESS: Number of examples          : 164044
PROGRESS: Number of classes           : 5
PROGRESS: Number of feature columns   : 4
PROGRESS: Number of unpacked features : 4
PROGRESS: Number of coefficients    : 20
PROGRESS: Starting Newton Method
PROGRESS: --------------------------------------------------------
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | Iteration | Passes   | Elapsed Time | Training-accuracy | Validation-accuracy |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: | 1         | 2        | 0.336499     | 0.381794          | 0.383699            |
PROGRESS: | 2         | 3        | 0.582315     | 0.419692          | 0.416978            |
PROGRESS: | 3         | 4        | 0.834512     | 0.425764          | 0.426553            |
PROGRESS: | 4         | 5        | 1.088540     | 0.426422          | 0.429472            |
PROGRESS: | 5         | 6        | 1.328803     | 0.426477          | 0.429239            |
PROGRESS: +-----------+----------+--------------+-------------------+---------------------+
PROGRESS: SUCCESS: Optimal solution found.

In [28]:
result = model.evaluate(test_set)
print "Confusion Matrix : \n%s" % result['confusion_matrix']


Confusion Matrix : 
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|      5       |        3        |  821  |
|      1       |        3        |  263  |
|      3       |        2        |  1448 |
|      2       |        4        |  726  |
|      1       |        1        |  1971 |
|      5       |        4        |  2899 |
|      2       |        2        |  1007 |
|      2       |        3        |  676  |
|      5       |        2        |  963  |
|      1       |        2        |  616  |
+--------------+-----------------+-------+
[25 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.