pyLogit Example

The purpose of this notebook is to demonstrate the key functionalities of pyLogit:

  1. Converting data between 'wide' and 'long' formats.
  2. Estimating conditional logit models.

The dataset being used for this example is the "Swissmetro" dataset used in the Python Biogeme examples. The data can be downloaded at http://biogeme.epfl.ch/examples_swissmetro.html, and a detailed explanation of the variables and data-collection procedure can be found at http://www.strc.ch/conferences/2001/bierlaire1.pdf.

This dataset comes from a stated preference survey about whether or not individuals would use a new underground magnetic-levitation train system called the Swissmetro.

The overall set of possible choices in this dataset was "Train", "Swissmetro", and "Car." However, the choice set faced by each individual is not constant. An individual's choice set was partially based on the alternatives that he/she was capable of using at the moment. For instance, people who did not own cars did not receive a stated preference question where car was an alternative that they could choose. Note that because the choice set varies across choice situations, mlogit and statsmodels could not be used with this dataset.

Also, each individual responded to multiple choice situations. Thus the choice observations are not truly independent of all other choice observations (they are correlated across choices made by the same individual). However, for the purposes of this example, the effect of repeat-observations on the typical i.i.d. assumptions will be ignored.

Based on the Swissmetro data, we will build a travel mode choice model for individuals who are commuting or going on a business trip.


In [1]:
from collections import OrderedDict    # For recording the model specification 

import pandas as pd                    # For file input/output
import numpy as np                     # For vectorized math operations

import pylogit as pl                   # For MNL model estimation and
                                       # conversion from wide to long format

Load and filter the raw Swiss Metro data


In [2]:
# Note that the .dat files used by python biogeme are tab delimited text files
wide_swiss_metro = pd.read_table("../data/swissmetro.dat", sep="\t")

# Select observations whose choice is known (i.e. CHOICE != 0)
# **AND** whose PURPOSE is either 1 (commute) or 3 (business)
include_criteria = (wide_swiss_metro.PURPOSE.isin([1, 3]) &
                    (wide_swiss_metro.CHOICE != 0))
# Note that the .copy() ensures that any later changes are made 
# to a copy of the data and not to the original data
wide_swiss_metro = wide_swiss_metro.loc[include_criteria].copy()

In [3]:
# Look at the first 5 rows of the data
wide_swiss_metro.head().T


Out[3]:
0 1 2 3 4
GROUP 2 2 2 2 2
SURVEY 0 0 0 0 0
SP 1 1 1 1 1
ID 1 1 1 1 1
PURPOSE 1 1 1 1 1
FIRST 0 0 0 0 0
TICKET 1 1 1 1 1
WHO 1 1 1 1 1
LUGGAGE 0 0 0 0 0
AGE 3 3 3 3 3
MALE 0 0 0 0 0
INCOME 2 2 2 2 2
GA 0 0 0 0 0
ORIGIN 2 2 2 2 2
DEST 1 1 1 1 1
TRAIN_AV 1 1 1 1 1
CAR_AV 1 1 1 1 1
SM_AV 1 1 1 1 1
TRAIN_TT 112 103 130 103 130
TRAIN_CO 48 48 48 40 36
TRAIN_HE 120 30 60 30 60
SM_TT 63 60 67 63 63
SM_CO 52 49 58 52 42
SM_HE 20 10 30 20 20
SM_SEATS 0 0 0 0 0
CAR_TT 117 117 117 72 90
CAR_CO 65 84 52 52 84
CHOICE 2 2 2 2 2

Convert the Swissmetro data to "Long Format"

pyLogit only estimates models using data that is in "long" format.

Long format has 1 row per individual per available alternative, and wide format has 1 row per individual or observation. Long format is useful because it permits one to directly use matrix dot products to calculate the index, $V_{ij} = x_{ij} \beta$, for each individual $\left(i \right)$ for each alternative $\left(j \right)$. In applications where one creates one's own dataset, the dataset can usually be created in long format from the very beginning. However, in situations where a dataset is provided to you in wide format (as in the case of the Swiss Metro dataset), it will be necessary to convert the data from wide format to long format.
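
As a minimal sketch of why long format is convenient, the snippet below computes the index $V_{ij} = x_{ij} \beta$ one row at a time and then normalizes within each observation to obtain conditional logit probabilities. The rows, the coefficient values, and the helper name `mnl_probabilities` are all invented for illustration and are not part of pyLogit. Observation 2 has only two rows because one alternative is assumed to be unavailable to it.

```python
import math
from collections import defaultdict

# Toy long-format rows: (obs_id, alt_id, [travel_time_hrs, travel_cost]).
# All values are invented for illustration.
long_rows = [
    (1, 1, [1.8, 0.48]),
    (1, 2, [1.0, 0.52]),
    (1, 3, [1.9, 0.65]),
    (2, 1, [1.7, 0.48]),   # observation 2 has no row for alternative 3
    (2, 2, [1.0, 0.49]),
]
beta = [-0.7, -0.5]        # hypothetical coefficients


def mnl_probabilities(rows, beta):
    """Return {(obs_id, alt_id): probability} for long-format rows."""
    # Index V_ij = x_ij . beta, one dot product per long-format row
    index = {(obs, alt): sum(x * b for x, b in zip(xs, beta))
             for obs, alt, xs in rows}
    # Sum exp(V) over the rows (available alternatives) of each observation
    denominators = defaultdict(float)
    for (obs, _alt), v in index.items():
        denominators[obs] += math.exp(v)
    return {key: math.exp(v) / denominators[key[0]]
            for key, v in index.items()}


probs = mnl_probabilities(long_rows, beta)
for key in sorted(probs):
    print(key, round(probs[key], 3))
```

Because unavailable alternatives simply have no row, a varying choice set needs no special handling: the probabilities of each observation sum to one over whatever rows it has.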

To convert the raw swiss metro data to long format, we need to specify:

  1. the variables or columns that are specific to a given individual, regardless of what alternative is being considered (note: every row is being treated as a separate observation, even though each individual gave multiple responses in this stated preference survey)
  2. the variables that vary across some or all alternatives, for a given individual (e.g. travel time)
  3. the availability variables
  4. the unique observation id column. (Note this dataset has an id column for individuals, ID, but for the purposes of this example we don't want to consider the repeated observations of each person as being related. We therefore want an identifying column that gives an id to every response of every individual instead of to every individual).
  5. the choice column

The cells below will identify these various columns, give them names in the long-format data, and perform the necessary conversion.
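
To make the roles of these five inputs concrete, here is a small pure-Python sketch that expands a single wide-format record into long-format records. It is not pyLogit's implementation: the helper `wide_row_to_long` and the toy record are hypothetical, and filling a variable with 0 for alternatives that lack it is an assumption made for this illustration.

```python
def wide_row_to_long(row, ind_vars, alt_vars, avail_vars, obs_id_col,
                     choice_col, alt_id_name="alt_id"):
    """Expand one wide-format record (a dict) into long-format records."""
    long_records = []
    for alt_id in sorted(avail_vars):
        # Skip alternatives that were not available to this observation
        if row[avail_vars[alt_id]] != 1:
            continue
        record = {obs_id_col: row[obs_id_col],
                  alt_id_name: alt_id,
                  # The choice becomes a 0/1 indicator on each row
                  choice_col: int(row[choice_col] == alt_id)}
        # Individual-specific values are repeated on every row
        for col in ind_vars:
            record[col] = row[col]
        # Alternative-varying values come from the alternative's own column.
        # Assumption: variables not defined for an alternative are set to 0.
        for new_name, mapping in alt_vars.items():
            record[new_name] = row[mapping[alt_id]] if alt_id in mapping else 0
        long_records.append(record)
    return long_records


# Toy record: car is unavailable (CAR_AV == 0) and Swissmetro (2) was chosen
wide_row = {"custom_id": 1, "AGE": 3, "TRAIN_AV": 1, "SM_AV": 1, "CAR_AV": 0,
            "TRAIN_TT": 112, "SM_TT": 63, "CAR_TT": 0,
            "TRAIN_HE": 120, "SM_HE": 20, "CHOICE": 2}

long_records = wide_row_to_long(
    wide_row,
    ind_vars=["AGE"],
    alt_vars={"travel_time": {1: "TRAIN_TT", 2: "SM_TT", 3: "CAR_TT"},
              "headway": {1: "TRAIN_HE", 2: "SM_HE"}},
    avail_vars={1: "TRAIN_AV", 2: "SM_AV", 3: "CAR_AV"},
    obs_id_col="custom_id",
    choice_col="CHOICE",
    alt_id_name="mode_id")

for record in long_records:
    print(record)
```

The toy record yields two long-format rows rather than three, since the unavailable car alternative is dropped.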


In [4]:
# Look at the columns of the swiss metro dataset
wide_swiss_metro.columns


Out[4]:
Index([u'GROUP', u'SURVEY', u'SP', u'ID', u'PURPOSE', u'FIRST', u'TICKET',
       u'WHO', u'LUGGAGE', u'AGE', u'MALE', u'INCOME', u'GA', u'ORIGIN',
       u'DEST', u'TRAIN_AV', u'CAR_AV', u'SM_AV', u'TRAIN_TT', u'TRAIN_CO',
       u'TRAIN_HE', u'SM_TT', u'SM_CO', u'SM_HE', u'SM_SEATS', u'CAR_TT',
       u'CAR_CO', u'CHOICE'],
      dtype='object')

In [5]:
# Create the list of individual specific variables
ind_variables = wide_swiss_metro.columns.tolist()[:15]

# Specify the variables that vary across individuals and some or all alternatives
# The keys are the column names that will be used in the long format dataframe.
# The values are dictionaries whose key-value pairs are the alternative id and
# the column name of the corresponding column that encodes that variable for
# the given alternative. Examples below.
alt_varying_variables = {u'travel_time': dict([(1, 'TRAIN_TT'),
                                               (2, 'SM_TT'),
                                               (3, 'CAR_TT')]),
                          u'travel_cost': dict([(1, 'TRAIN_CO'),
                                                (2, 'SM_CO'),
                                                (3, 'CAR_CO')]),
                          u'headway': dict([(1, 'TRAIN_HE'),
                                            (2, 'SM_HE')]),
                          u'seat_configuration': dict([(2, "SM_SEATS")])}

# Specify the availability variables
# Note that the keys of the dictionary are the alternative id's.
# The values are the columns denoting the availability for the
# given mode in the dataset.
availability_variables = {1: 'TRAIN_AV',
                          2: 'SM_AV', 
                          3: 'CAR_AV'}

##########
# Determine the columns for: alternative ids, the observation ids and the choice
##########
# The 'custom_alt_id' is the name of a column to be created in the long-format data
# It will identify the alternative associated with each row.
custom_alt_id = "mode_id"

# Create a custom id column that ignores the fact that this is a 
# panel/repeated-observations dataset. Note the +1 ensures the id's start at one.
obs_id_column = "custom_id"
wide_swiss_metro[obs_id_column] = np.arange(wide_swiss_metro.shape[0],
                                            dtype=int) + 1


# Create a variable recording the choice column
choice_column = "CHOICE"

In [6]:
# Perform the conversion to long-format
long_swiss_metro = pl.convert_wide_to_long(wide_swiss_metro, 
                                           ind_variables, 
                                           alt_varying_variables, 
                                           availability_variables, 
                                           obs_id_column, 
                                           choice_column,
                                           new_alt_id_name=custom_alt_id)
# Look at the resulting long-format dataframe
long_swiss_metro.head(10).T


Out[6]:
0 1 2 3 4 5 6 7 8 9
custom_id 1 1 1 2 2 2 3 3 3 4
mode_id 1 2 3 1 2 3 1 2 3 1
CHOICE 0 1 0 0 1 0 0 1 0 0
GROUP 2 2 2 2 2 2 2 2 2 2
SURVEY 0 0 0 0 0 0 0 0 0 0
SP 1 1 1 1 1 1 1 1 1 1
ID 1 1 1 1 1 1 1 1 1 1
PURPOSE 1 1 1 1 1 1 1 1 1 1
FIRST 0 0 0 0 0 0 0 0 0 0
TICKET 1 1 1 1 1 1 1 1 1 1
WHO 1 1 1 1 1 1 1 1 1 1
LUGGAGE 0 0 0 0 0 0 0 0 0 0
AGE 3 3 3 3 3 3 3 3 3 3
MALE 0 0 0 0 0 0 0 0 0 0
INCOME 2 2 2 2 2 2 2 2 2 2
GA 0 0 0 0 0 0 0 0 0 0
ORIGIN 2 2 2 2 2 2 2 2 2 2
DEST 1 1 1 1 1 1 1 1 1 1
seat_configuration 0 0 0 0 0 0 0 0 0 0
travel_time 112 63 117 103 60 117 130 67 117 103
headway 120 20 0 30 10 0 60 30 0 30
travel_cost 48 52 65 48 49 84 48 58 52 40

Perform desired variable creations and transformations

Before estimating a model, one needs to pre-compute all of the variables that one wants to use. This is different from the functionality of other packages such as mlogit or statsmodels that use formula strings to create new variables "on-the-fly." This is also somewhat different from Python Biogeme where new variables can be defined in the script but not actually created by the user before model estimation. pyLogit does not perform variable creation. It only estimates models using variables that already exist.

Below, we pre-compute the variables needed for this example's model:

  1. Travel time in hours instead of minutes.
  2. Travel cost scaled by 0.01 (i.e. in hundreds of CHF, Swiss francs) instead of CHF, for ease of numeric optimization.
  3. Travel cost interacted with a variable that identifies individuals who own a season pass (and therefore have no marginal cost of traveling on the trip) or whose employer will pay for their commute/business trip.
  4. A dummy variable for traveling with a single piece of luggage.
  5. A dummy variable for traveling with multiple pieces of luggage.
  6. A dummy variable denoting whether an individual is not traveling first class (i.e. is traveling regular class).
  7. A dummy variable indicating whether an individual took their survey on-board a train (since it is a-priori expected that these individuals are already willing to take a train or train-like service such as Swissmetro).

In [7]:
##########
# Create scaled variables so the estimated coefficients are of similar magnitudes
##########
# Scale the travel time column by 60 to convert raw units (minutes) to hours
long_swiss_metro["travel_time_hrs"] = long_swiss_metro["travel_time"] / 60.0

# Scale the headway column by 60 to convert raw units (minutes) to hours
long_swiss_metro["headway_hrs"] = long_swiss_metro["headway"] / 60.0

# Figure out who doesn't incur a marginal cost for the ticket
# This can be because he/she owns an annual season pass (GA == 1) 
# or because his/her employer pays for the ticket (WHO == 2).
# Note that all the other complexity in figuring out ticket costs
# has been accounted for except the GA pass (the annual season
# ticket). Make sure this dummy variable is only equal to 1 for
# the rows with the Train or Swissmetro
long_swiss_metro["free_ticket"] = (((long_swiss_metro["GA"] == 1) |
                                    (long_swiss_metro["WHO"] == 2)) &
                                   long_swiss_metro[custom_alt_id].isin([1,2])).astype(int)
# Divide the travel cost by 100 so estimated coefficients are of similar magnitude
# and account for ownership of a season pass
long_swiss_metro["travel_cost_hundreth"] = (long_swiss_metro["travel_cost"] *
                                            (long_swiss_metro["free_ticket"] == 0) /
                                            100.0)

##########
# Create various dummy variables to describe the choice context of a given
# individual for each choice task.
##########
# Create a dummy variable for whether a person has a single piece of luggage
long_swiss_metro["single_luggage_piece"] = (long_swiss_metro["LUGGAGE"] == 1).astype(int)

# Create a dummy variable for whether a person has multiple pieces of luggage
long_swiss_metro["multiple_luggage_pieces"] = (long_swiss_metro["LUGGAGE"] == 3).astype(int)

# Create a dummy variable indicating that a person is NOT first class
long_swiss_metro["regular_class"] = 1 - long_swiss_metro["FIRST"]

# Create a dummy variable indicating that the survey was taken aboard a train
# Note that such passengers are a-priori imagined to be somewhat partial to train modes
long_swiss_metro["train_survey"] = 1 - long_swiss_metro["SURVEY"]

Create the model specification

The model specification being used in this example is the following:

$$
\begin{aligned}
V_{i, \textrm{Train}} &= \textrm{ASC Train} + \\
&\quad \beta _{ \textrm{tt_transit} } \textrm{Travel Time} _{ \textrm{Train}} * \frac{1}{60} + \\
&\quad \beta _{ \textrm{tc_train} } \textrm{Travel Cost}_{\textrm{Train}} * \left( \textrm{Free Ticket} == 0 \right) * 0.01 + \\
&\quad \beta _{ \textrm{headway_train} } \textrm{Headway} _{\textrm{Train}} * \frac{1}{60} + \\
&\quad \beta _{ \textrm{survey} } \left( \textrm{Train Survey} == 1 \right) \\
\\
V_{i, \textrm{Swissmetro}} &= \textrm{ASC Swissmetro} + \\
&\quad \beta _{ \textrm{tt_transit} } \textrm{Travel Time} _{ \textrm{Swissmetro}} * \frac{1}{60} + \\
&\quad \beta _{ \textrm{tc_sm} } \textrm{Travel Cost}_{\textrm{Swissmetro}} * \left( \textrm{Free Ticket} == 0 \right) * 0.01 + \\
&\quad \beta _{ \textrm{headway_sm} } \textrm{Headway} _{\textrm{Swissmetro}} * \frac{1}{60} + \\
&\quad \beta _{ \textrm{seat} } \left( \textrm{Seat Configuration} == 1 \right) + \\
&\quad \beta _{ \textrm{survey} } \left( \textrm{Train Survey} == 1 \right) + \\
&\quad \beta _{ \textrm{first_class} } \left( \textrm{First Class} == 0 \right) \\
\\
V_{i, \textrm{Car}} &= \beta _{ \textrm{tt_car} } \textrm{Travel Time} _{ \textrm{Car}} * \frac{1}{60} + \\
&\quad \beta _{ \textrm{tc_car}} \textrm{Travel Cost}_{\textrm{Car}} * 0.01 + \\
&\quad \beta _{\textrm{luggage}=1} \left( \textrm{Luggage} == 1 \right) + \\
&\quad \beta _{\textrm{luggage}>1} \left( \textrm{Luggage} > 1 \right)
\end{aligned}
$$

Here, Free Ticket equals 1 for individuals who own an annual season pass (GA == 1) or whose employer pays for the ticket (WHO == 2), and 0 otherwise.

Note that packages such as mlogit and statsmodels do not, by default, handle coefficients that vary over some alternatives but not all, such as the travel time coefficient that is specified as being the same for "Train" and "Swissmetro" but different for "Car."


In [8]:
# NOTE: - Specification and variable names must be ordered dictionaries.
#       - Keys should be variables within the long format dataframe.
#         The sole exception to this is the "intercept" key.
#       - For the specification dictionary, the values should be lists
#         of integers or lists of lists of integers. Within a list, 
#         or within the inner-most list, the integers should be the 
#         alternative ID's of the alternative whose utility specification 
#         the explanatory variable is entering. Lists of lists denote 
#         alternatives that will share a common coefficient for the variable
#         in question.

basic_specification = OrderedDict()
basic_names = OrderedDict()

basic_specification["intercept"] = [1, 2]
basic_names["intercept"] = ['ASC Train',
                            'ASC Swissmetro']

basic_specification["travel_time_hrs"] = [[1, 2], 3]
basic_names["travel_time_hrs"] = ['Travel Time, units:hrs (Train and Swissmetro)',
                                  'Travel Time, units:hrs (Car)']

basic_specification["travel_cost_hundreth"] = [1, 2, 3]
basic_names["travel_cost_hundreth"] = ['Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train)',
                                       'Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro)',
                                       'Travel Cost, units: 0.01 CHF (Car)']

basic_specification["headway_hrs"] = [1, 2]
basic_names["headway_hrs"] = ["Headway, units:hrs, (Train)",
                              "Headway, units:hrs, (Swissmetro)"]

basic_specification["seat_configuration"] = [2]
basic_names["seat_configuration"] = ['Airline Seat Configuration, base=No (Swissmetro)']

basic_specification["train_survey"] = [[1, 2]]
basic_names["train_survey"] = ["Surveyed on a Train, base=No, (Train and Swissmetro)"]

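# NOTE: alternative id 1 is Train, whereas the utility equations above and the
# label below place this term in the Swissmetro (id 2) utility. Use [2] if the
# term is meant to enter the Swissmetro utility, which would change the estimates.
basic_specification["regular_class"] = [1]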
basic_names["regular_class"] = ["First Class == False, (Swissmetro)"]

basic_specification["single_luggage_piece"] = [3]
basic_names["single_luggage_piece"] = ["Number of Luggage Pieces == 1, (Car)"]

basic_specification["multiple_luggage_pieces"] = [3]
basic_names["multiple_luggage_pieces"] = ["Number of Luggage Pieces > 1, (Car)"]
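
As a quick sanity check on the rules in the comments above, the sketch below counts how many coefficients a specification dictionary implies. It reflects only the stated rule (an integer gets its own coefficient, an inner list shares one coefficient) and is not pyLogit's internal code; the helper name `count_coefficients` is hypothetical.

```python
from collections import OrderedDict


def count_coefficients(specification):
    """Count the coefficients implied by a specification dictionary.

    Each top-level element of a variable's list is assumed to yield one
    coefficient: an integer is an alternative with its own coefficient, and
    an inner list is a group of alternatives that share one coefficient.
    """
    return sum(len(alternatives) for alternatives in specification.values())


# Same structure as the specification defined above
example_specification = OrderedDict()
example_specification["intercept"] = [1, 2]
example_specification["travel_time_hrs"] = [[1, 2], 3]
example_specification["travel_cost_hundreth"] = [1, 2, 3]
example_specification["headway_hrs"] = [1, 2]
example_specification["seat_configuration"] = [2]
example_specification["train_survey"] = [[1, 2]]
example_specification["regular_class"] = [1]
example_specification["single_luggage_piece"] = [3]
example_specification["multiple_luggage_pieces"] = [3]

num_coefficients = count_coefficients(example_specification)
print(num_coefficients)
```

The count is the length that the array of initial values passed to `fit_mle` needs to have, and it equals the number of labels supplied across all entries of `basic_names`.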

Estimate the conditional logit model


In [9]:
# Estimate the multinomial logit model (MNL)
swissmetro_mnl = pl.create_choice_model(data=long_swiss_metro,
                                        alt_id_col=custom_alt_id,
                                        obs_id_col=obs_id_column,
                                        choice_col=choice_column,
                                        specification=basic_specification,
                                        model_type="MNL",
                                        names=basic_names)

# Specify the initial values and method for the optimization.
swissmetro_mnl.fit_mle(np.zeros(14))

# Look at the estimation results
swissmetro_mnl.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -6,964.6630
Estimation Time: 0.09 seconds.
Final log-likelihood: -5,159.2583
/Users/timothyb0912/anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.py:382: RuntimeWarning: Method BFGS does not use Hessian information (hess).
  RuntimeWarning)
Out[9]:
Multinomial Logit Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Logit Model Df Residuals: 6,754
Method: MLE Df Model: 14
Date: Mon, 21 Mar 2016 Pseudo R-squ.: 0.259
Time: 22:57:13 Pseudo R-bar-squ.: 0.257
converged: True Log-Likelihood: -5,159.258
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
ASC Train -1.2929 0.146 -8.845 0.000 -1.579 -1.006
ASC Swissmetro -0.5026 0.116 -4.332 0.000 -0.730 -0.275
Travel Time, units:hrs (Train and Swissmetro) -0.6990 0.042 -16.545 0.000 -0.782 -0.616
Travel Time, units:hrs (Car) -0.7230 0.047 -15.340 0.000 -0.815 -0.631
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.5618 0.094 -6.002 0.000 -0.745 -0.378
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.2817 0.045 -6.252 0.000 -0.370 -0.193
Travel Cost, units: 0.01 CHF (Car) -0.5139 0.104 -4.953 0.000 -0.717 -0.311
Headway, units:hrs, (Train) -0.3143 0.062 -5.063 0.000 -0.436 -0.193
Headway, units:hrs, (Swissmetro) -0.3773 0.196 -1.925 0.054 -0.761 0.007
Airline Seat Configuration, base=No (Swissmetro) -0.7825 0.087 -8.970 0.000 -0.953 -0.611
Surveyed on a Train, base=No, (Train and Swissmetro) 2.5425 0.114 22.235 0.000 2.318 2.767
First Class == False, (Swissmetro) 0.5650 0.077 7.305 0.000 0.413 0.717
Number of Luggage Pieces == 1, (Car) 0.4228 0.067 6.270 0.000 0.291 0.555
Number of Luggage Pieces > 1, (Car) 1.4141 0.259 5.461 0.000 0.907 1.922
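
The "Log-likelihood at zero" reported above is the log-likelihood when every coefficient is zero, so that each available alternative is equally likely. A minimal sketch of that calculation follows; the helper name `log_likelihood_at_zero` and the example choice-set sizes are hypothetical and are not taken from the Swissmetro data.

```python
import math


def log_likelihood_at_zero(choice_set_sizes):
    """Log-likelihood of an MNL model when every coefficient is zero.

    With all indices equal to zero, each available alternative has
    probability 1 / J_i, where J_i is the size of observation i's choice set.
    """
    return sum(math.log(1.0 / size) for size in choice_set_sizes)


# Hypothetical choice-set sizes: three observations with all three modes
# available and one observation with only two modes available.
example_sizes = [3, 3, 3, 2]
example_null_ll = log_likelihood_at_zero(example_sizes)
print(example_null_ll)

# Bound for 6,768 observations if every one of them faced three alternatives
all_three_ll = log_likelihood_at_zero([3] * 6768)
print(round(all_three_ll, 1))
```

If all 6,768 observations had faced three alternatives, the value would be about -7,435.4. The reported -6,964.663 is less negative, which is consistent with some observations having smaller choice sets.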

View results without using statsmodels summary table

You can view all of the results at once by calling print_summaries(), which prints the various summary dataframes.


In [10]:
# Look at all of the results at the same time
swissmetro_mnl.print_summaries()



Number of Parameters                                         14
Number of Observations                                     6768
Null Log-Likelihood                                   -6964.663
Fitted Log-Likelihood                                 -5159.258
Rho-Squared                                           0.2592236
Rho-Bar-Squared                                       0.2572134
Estimation Message        Optimization terminated successfully.
dtype: object
==============================
                                                    parameters   std_err  \
ASC Train                                            -1.292943  0.146184   
ASC Swissmetro                                       -0.502595  0.116010   
Travel Time, units:hrs (Train and Swissmetro)        -0.699029  0.042250   
Travel Time, units:hrs (Car)                         -0.722998  0.047130   
Travel Cost * (Annual Pass == 0), units: 0.01 C...   -0.561769  0.093593   
Travel Cost * (Annual Pass == 0), units: 0.01 C...   -0.281683  0.045058   
Travel Cost, units: 0.01 CHF (Car)                   -0.513867  0.103745   
Headway, units:hrs, (Train)                          -0.314336  0.062085   
Headway, units:hrs, (Swissmetro)                     -0.377324  0.195969   
Airline Seat Configuration, base=No (Swissmetro)     -0.782455  0.087232   
Surveyed on a Train, base=No, (Train and Swissm...    2.542475  0.114347   
First Class == False, (Swissmetro)                    0.565015  0.077341   
Number of Luggage Pieces == 1, (Car)                  0.422767  0.067424   
Number of Luggage Pieces > 1, (Car)                   1.414052  0.258917   

                                                      t_stats       p_values  \
ASC Train                                           -8.844625   9.183619e-19   
ASC Swissmetro                                      -4.332359   1.475204e-05   
Travel Time, units:hrs (Train and Swissmetro)      -16.545035   1.738624e-61   
Travel Time, units:hrs (Car)                       -15.340414   4.105670e-53   
Travel Cost * (Annual Pass == 0), units: 0.01 C...  -6.002282   1.945635e-09   
Travel Cost * (Annual Pass == 0), units: 0.01 C...  -6.251572   4.063404e-10   
Travel Cost, units: 0.01 CHF (Car)                  -4.953161   7.301745e-07   
Headway, units:hrs, (Train)                         -5.063008   4.126931e-07   
Headway, units:hrs, (Swissmetro)                    -1.925430   5.417557e-02   
Airline Seat Configuration, base=No (Swissmetro)    -8.969775   2.971216e-19   
Surveyed on a Train, base=No, (Train and Swissm...  22.234822  1.581947e-109   
First Class == False, (Swissmetro)                   7.305494   2.762506e-13   
Number of Luggage Pieces == 1, (Car)                 6.270276   3.604088e-10   
Number of Luggage Pieces > 1, (Car)                  5.461402   4.723880e-08   

                                                    robust_std_err  \
ASC Train                                                 0.302543   
ASC Swissmetro                                            0.392238   
Travel Time, units:hrs (Train and Swissmetro)             0.146476   
Travel Time, units:hrs (Car)                              0.164374   
Travel Cost * (Annual Pass == 0), units: 0.01 C...        0.128883   
Travel Cost * (Annual Pass == 0), units: 0.01 C...        0.066505   
Travel Cost, units: 0.01 CHF (Car)                        0.230016   
Headway, units:hrs, (Train)                               0.061189   
Headway, units:hrs, (Swissmetro)                          0.206538   
Airline Seat Configuration, base=No (Swissmetro)          0.097108   
Surveyed on a Train, base=No, (Train and Swissm...        0.351394   
First Class == False, (Swissmetro)                        0.078165   
Number of Luggage Pieces == 1, (Car)                      0.156215   
Number of Luggage Pieces > 1, (Car)                       0.493739   

                                                    robust_t_stats  \
ASC Train                                                -4.273586   
ASC Swissmetro                                           -1.281352   
Travel Time, units:hrs (Train and Swissmetro)            -4.772303   
Travel Time, units:hrs (Car)                             -4.398497   
Travel Cost * (Annual Pass == 0), units: 0.01 C...       -4.358752   
Travel Cost * (Annual Pass == 0), units: 0.01 C...       -4.235516   
Travel Cost, units: 0.01 CHF (Car)                       -2.234049   
Headway, units:hrs, (Train)                              -5.137165   
Headway, units:hrs, (Swissmetro)                         -1.826904   
Airline Seat Configuration, base=No (Swissmetro)         -8.057613   
Surveyed on a Train, base=No, (Train and Swissm...        7.235402   
First Class == False, (Swissmetro)                        7.228484   
Number of Luggage Pieces == 1, (Car)                      2.706309   
Number of Luggage Pieces > 1, (Car)                       2.863968   

                                                    robust_p_values  
ASC Train                                              1.923545e-05  
ASC Swissmetro                                         2.000702e-01  
Travel Time, units:hrs (Train and Swissmetro)          1.821311e-06  
Travel Time, units:hrs (Car)                           1.090030e-05  
Travel Cost * (Annual Pass == 0), units: 0.01 C...     1.308061e-05  
Travel Cost * (Annual Pass == 0), units: 0.01 C...     2.280274e-05  
Travel Cost, units: 0.01 CHF (Car)                     2.547984e-02  
Headway, units:hrs, (Train)                            2.789139e-07  
Headway, units:hrs, (Swissmetro)                       6.771420e-02  
Airline Seat Configuration, base=No (Swissmetro)       7.779883e-16  
Surveyed on a Train, base=No, (Train and Swissm...     4.641533e-13  
First Class == False, (Swissmetro)                     4.884147e-13  
Number of Luggage Pieces == 1, (Car)                   6.803565e-03  
Number of Luggage Pieces > 1, (Car)                    4.183697e-03  

In [11]:
# Look at the general and goodness of fit statistics
swissmetro_mnl.fit_summary


Out[11]:
Number of Parameters                                         14
Number of Observations                                     6768
Null Log-Likelihood                                   -6964.663
Fitted Log-Likelihood                                 -5159.258
Rho-Squared                                           0.2592236
Rho-Bar-Squared                                       0.2572134
Estimation Message        Optimization terminated successfully.
dtype: object
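
The two goodness-of-fit measures can be reproduced from the log-likelihoods using the standard McFadden definitions. The sketch below assumes those definitions (it does not inspect pyLogit's source), and the function names are hypothetical; the inputs are the values printed in the estimation output.

```python
def rho_squared(fitted_ll, null_ll):
    """McFadden's rho-squared: 1 - LL(fitted) / LL(null)."""
    return 1.0 - fitted_ll / null_ll


def rho_bar_squared(fitted_ll, null_ll, num_params):
    """Adjusted rho-squared, penalizing the number of estimated parameters."""
    return 1.0 - (fitted_ll - num_params) / null_ll


# Values taken from the estimation output
fitted_ll = -5159.2583
null_ll = -6964.6630
num_params = 14

rho_sq = rho_squared(fitted_ll, null_ll)
rho_bar_sq = rho_bar_squared(fitted_ll, null_ll, num_params)
print(round(rho_sq, 7), round(rho_bar_sq, 7))
```

Both values agree with the Rho-Squared and Rho-Bar-Squared entries shown above to the printed precision.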

In [12]:
# Look at the parameter estimation results, and round the results for easy viewing
np.round(swissmetro_mnl.summary, 3)


Out[12]:
parameters std_err t_stats p_values robust_std_err robust_t_stats robust_p_values
ASC Train -1.293 0.146 -8.845 0.000 0.303 -4.274 0.000
ASC Swissmetro -0.503 0.116 -4.332 0.000 0.392 -1.281 0.200
Travel Time, units:hrs (Train and Swissmetro) -0.699 0.042 -16.545 0.000 0.146 -4.772 0.000
Travel Time, units:hrs (Car) -0.723 0.047 -15.340 0.000 0.164 -4.398 0.000
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.562 0.094 -6.002 0.000 0.129 -4.359 0.000
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.282 0.045 -6.252 0.000 0.067 -4.236 0.000
Travel Cost, units: 0.01 CHF (Car) -0.514 0.104 -4.953 0.000 0.230 -2.234 0.025
Headway, units:hrs, (Train) -0.314 0.062 -5.063 0.000 0.061 -5.137 0.000
Headway, units:hrs, (Swissmetro) -0.377 0.196 -1.925 0.054 0.207 -1.827 0.068
Airline Seat Configuration, base=No (Swissmetro) -0.782 0.087 -8.970 0.000 0.097 -8.058 0.000
Surveyed on a Train, base=No, (Train and Swissmetro) 2.542 0.114 22.235 0.000 0.351 7.235 0.000
First Class == False, (Swissmetro) 0.565 0.077 7.305 0.000 0.078 7.228 0.000
Number of Luggage Pieces == 1, (Car) 0.423 0.067 6.270 0.000 0.156 2.706 0.007
Number of Luggage Pieces > 1, (Car) 1.414 0.259 5.461 0.000 0.494 2.864 0.004
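
The test statistics, p-values, and confidence intervals in these tables follow from each estimate and its standard error. The sketch below assumes a two-sided test against the standard normal distribution and a 1.96 critical value for the 95% interval; the helper name `wald_summary` is hypothetical. It uses the Swissmetro headway coefficient as the worked example.

```python
import math


def wald_summary(estimate, std_err, critical_value=1.96):
    """Return (z statistic, two-sided normal p-value, confidence interval)."""
    z_stat = estimate / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z_stat) / math.sqrt(2.0))
    half_width = critical_value * std_err
    return z_stat, p_value, (estimate - half_width, estimate + half_width)


# Headway, units:hrs, (Swissmetro): estimate and standard errors from above
estimate = -0.377324
z_stat, p_value, conf_int = wald_summary(estimate, 0.195969)
robust_z, robust_p, _ = wald_summary(estimate, 0.206538)

print(round(z_stat, 3), round(p_value, 3), [round(b, 3) for b in conf_int])
print(round(robust_z, 3), round(robust_p, 3))
```

The interval for this coefficient includes zero, which matches its p-value being slightly above 0.05 under both the conventional and the robust standard errors.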