pyLogit Example

The purpose of this notebook is to demonstrate they key functionalities of pyLogit:

  1. Estimating conditional logit-type models such as the four built-in asymmetric models.
    1. Multinomial Asymmetric Logit Model
    2. Multinomial Uneven Logit Model
    3. Multinomial Scobit Model
    4. Multinomial Clog-log Model

For description and formulation of the aforementioned asymmetric choice models, see

Brathwaite, Timothy, and Joan Walker. "Asymmetric, Closed-Form, Finite-Parameter Models of Multinomial Choice." arXiv preprint arXiv:1606.05900 (2016). http://arxiv.org/abs/1606.05900.
The dataset being used for this example is the "Swissmetro" dataset used in the Python Biogeme examples. The data can be downloaded at http://biogeme.epfl.ch/examples_swissmetro.html, and a detailed explanation of the variables and data-collection procedure can be found at http://www.strc.ch/conferences/2001/bierlaire1.pdf.


In [1]:
from collections import OrderedDict    # For recording the model specification 

import pandas as pd                    # For file input/output
import numpy as np                     # For vectorized math operations

import pylogit as pl                   # For MNL model estimation and
                                       # conversion from wide to long format

Load the cleaned, long-format Swiss Metro data

For details of how the dataset was cleaned and processed, see the "Main PyLogit Example" notebook.


In [2]:
# Load the dataset
long_swiss_metro = pd.read_csv("../data/long_swiss_metro_data.csv")

In [3]:
# Look at the first 5 rows of the data
long_swiss_metro.head().T


Out[3]:
0 1 2 3 4
custom_id 1.000000 1.000000 1.00 2.000000 2.000000
mode_id 1.000000 2.000000 3.00 1.000000 2.000000
CHOICE 0.000000 1.000000 0.00 0.000000 1.000000
GROUP 2.000000 2.000000 2.00 2.000000 2.000000
SURVEY 0.000000 0.000000 0.00 0.000000 0.000000
SP 1.000000 1.000000 1.00 1.000000 1.000000
ID 1.000000 1.000000 1.00 1.000000 1.000000
PURPOSE 1.000000 1.000000 1.00 1.000000 1.000000
FIRST 0.000000 0.000000 0.00 0.000000 0.000000
TICKET 1.000000 1.000000 1.00 1.000000 1.000000
WHO 1.000000 1.000000 1.00 1.000000 1.000000
LUGGAGE 0.000000 0.000000 0.00 0.000000 0.000000
AGE 3.000000 3.000000 3.00 3.000000 3.000000
MALE 0.000000 0.000000 0.00 0.000000 0.000000
INCOME 2.000000 2.000000 2.00 2.000000 2.000000
GA 0.000000 0.000000 0.00 0.000000 0.000000
ORIGIN 2.000000 2.000000 2.00 2.000000 2.000000
DEST 1.000000 1.000000 1.00 1.000000 1.000000
seat_configuration 0.000000 0.000000 0.00 0.000000 0.000000
travel_time 112.000000 63.000000 117.00 103.000000 60.000000
headway 120.000000 20.000000 0.00 30.000000 10.000000
travel_cost 48.000000 52.000000 65.00 48.000000 49.000000
travel_time_hrs 1.866667 1.050000 1.95 1.716667 1.000000
headway_hrs 2.000000 0.333333 0.00 0.500000 0.166667
free_ticket 0.000000 0.000000 0.00 0.000000 0.000000
travel_cost_hundreth 0.480000 0.520000 0.65 0.480000 0.490000
single_luggage_piece 0.000000 0.000000 0.00 0.000000 0.000000
multiple_luggage_pieces 0.000000 0.000000 0.00 0.000000 0.000000
regular_class 1.000000 1.000000 1.00 1.000000 1.000000
train_survey 1.000000 1.000000 1.00 1.000000 1.000000

Create the model specification

The model specification being used in this example is the following: $$ \begin{aligned} V_{i, \textrm{Train}} &= \textrm{ASC Train} + \\ &\quad \beta _{ \textrm{tt_transit} } \textrm{Travel Time} _{ \textrm{Train}} * \frac{1}{60} + \\ &\quad \beta _{ \textrm{tc_train} } \textrm{Travel Cost}_{\textrm{Train}} * \left( GA == 0 \right) * 0.01 + \\ &\quad \beta _{ \textrm{headway_train} } \textrm{Headway} _{\textrm{Train}} * \frac{1}{60} + \\ &\quad \beta _{ \textrm{survey} } \left( \textrm{Train Survey} == 1 \right) \\ \\ V_{i, \textrm{Swissmetro}} &= \textrm{ASC Swissmetro} + \\ &\quad \beta _{ \textrm{tt_transit} } \textrm{Travel Time} _{ \textrm{Swissmetro}} * \frac{1}{60} + \\ &\quad \beta _{ \textrm{tc_sm} } \textrm{Travel Cost}_{\textrm{Swissmetro}} * \left( GA == 0 \right) * 0.01 + \\ &\quad \beta _{ \textrm{headway_sm} } \textrm{Heaway} _{\textrm{Swissmetro}} * \frac{1}{60} + \\ &\quad \beta _{ \textrm{seat} } \left( \textrm{Seat Configuration} == 1 \right) \\ &\quad \beta _{ \textrm{survey} } \left( \textrm{Train Survey} == 1 \right) \\ &\quad \beta _{ \textrm{first_class} } \left( \textrm{First Class} == 0 \right) \\ \\ V_{i, \textrm{Car}} &= \beta _{ \textrm{tt_car} } \textrm{Travel Time} _{ \textrm{Car}} * \frac{1}{60} + \\ &\quad \beta _{ \textrm{tc_car}} \textrm{Travel Cost}_{\textrm{Car}} * 0.01 + \\ &\quad \beta _{\textrm{luggage}=1} \left( \textrm{Luggage} == 1 \right) + \\ &\quad \beta _{\textrm{luggage}>1} \left( \textrm{Luggage} > 1 \right) \end{aligned} $$

Note that packages such as mlogit and statsmodels do not, by default, handle coefficients that vary over some alternatives but not all, e.g. the travel time coefficient that is specified as being the same for "Train" and "Swissmetro" but different for "Car."


In [4]:
# NOTE: - Specification and variable names must be ordered dictionaries.
#       - Keys should be variables within the long format dataframe.
#         The sole exception to this is the "intercept" key.
#       - For the specification dictionary, the values should be lists
#         of integers or or lists of lists of integers. Within a list, 
#         or within the inner-most list, the integers should be the 
#         alternative ID's of the alternative whose utility specification 
#         the explanatory variable is entering. Lists of lists denote 
#         alternatives that will share a common coefficient for the variable
#         in question.

basic_specification = OrderedDict()
basic_names = OrderedDict()

basic_specification["intercept"] = [1, 2]
basic_names["intercept"] = ['ASC Train',
                            'ASC Swissmetro']

basic_specification["travel_time_hrs"] = [[1, 2,], 3]
basic_names["travel_time_hrs"] = ['Travel Time, units:hrs (Train and Swissmetro)',
                                  'Travel Time, units:hrs (Car)']

basic_specification["travel_cost_hundreth"] = [1, 2, 3]
basic_names["travel_cost_hundreth"] = ['Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train)',
                                       'Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro)',
                                       'Travel Cost, units: 0.01 CHF (Car)']

basic_specification["headway_hrs"] = [1, 2]
basic_names["headway_hrs"] = ["Headway, units:hrs, (Train)",
                              "Headway, units:hrs, (Swissmetro)"]

basic_specification["seat_configuration"] = [2]
basic_names["seat_configuration"] = ['Airline Seat Configuration, base=No (Swissmetro)']

basic_specification["train_survey"] = [[1, 2]]
basic_names["train_survey"] = ["Surveyed on a Train, base=No, (Train and Swissmetro)"]

basic_specification["regular_class"] = [1]
basic_names["regular_class"] = ["First Class == False, (Swissmetro)"]

basic_specification["single_luggage_piece"] = [3]
basic_names["single_luggage_piece"] = ["Number of Luggage Pieces == 1, (Car)"]

basic_specification["multiple_luggage_pieces"] = [3]
basic_names["multiple_luggage_pieces"] = ["Number of Luggage Pieces > 1, (Car)"]

Estimate the conditional logit model, upon which the other models are based


In [5]:
# The 'custom_alt_id' is the name of a column to be created in the long-format data
# It will identify the alternative associated with each row.
custom_alt_id = "mode_id"

# Create a custom id column that ignores the fact that this is a 
# panel/repeated-observations dataset. Note the +1 ensures the id's start at one.
obs_id_column = "custom_id"

# Create a variable recording the choice column
choice_column = "CHOICE"

In [6]:
# Estimate the multinomial logit model (MNL)
swissmetro_mnl = pl.create_choice_model(data=long_swiss_metro,
                                        alt_id_col=custom_alt_id,
                                        obs_id_col=obs_id_column,
                                        choice_col=choice_column,
                                        specification=basic_specification,
                                        model_type="MNL",
                                        names=basic_names)

# Specify the initial values and method for the optimization.
swissmetro_mnl.fit_mle(np.zeros(14))

# Look at the estimation results
swissmetro_mnl.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -6,964.6630
Estimation Time: 0.14 seconds.
Final log-likelihood: -5,159.2583
/Users/timothyb0912/anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.py:382: RuntimeWarning: Method BFGS does not use Hessian information (hess).
  RuntimeWarning)
Out[6]:
Multinomial Logit Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Logit Model Df Residuals: 6,754
Method: MLE Df Model: 14
Date: Thu, 15 Sep 2016 Pseudo R-squ.: 0.259
Time: 14:27:59 Pseudo R-bar-squ.: 0.257
converged: False Log-Likelihood: -5,159.258
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
ASC Train -1.2929 0.146 -8.845 0.000 -1.579 -1.006
ASC Swissmetro -0.5026 0.116 -4.332 0.000 -0.730 -0.275
Travel Time, units:hrs (Train and Swissmetro) -0.6990 0.042 -16.545 0.000 -0.782 -0.616
Travel Time, units:hrs (Car) -0.7230 0.047 -15.340 0.000 -0.815 -0.631
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.5618 0.094 -6.002 0.000 -0.745 -0.378
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.2817 0.045 -6.252 0.000 -0.370 -0.193
Travel Cost, units: 0.01 CHF (Car) -0.5139 0.104 -4.953 0.000 -0.717 -0.311
Headway, units:hrs, (Train) -0.3143 0.062 -5.063 0.000 -0.436 -0.193
Headway, units:hrs, (Swissmetro) -0.3773 0.196 -1.925 0.054 -0.761 0.007
Airline Seat Configuration, base=No (Swissmetro) -0.7825 0.087 -8.970 0.000 -0.953 -0.611
Surveyed on a Train, base=No, (Train and Swissmetro) 2.5425 0.114 22.235 0.000 2.318 2.767
First Class == False, (Swissmetro) 0.5650 0.077 7.305 0.000 0.413 0.717
Number of Luggage Pieces == 1, (Car) 0.4228 0.067 6.270 0.000 0.291 0.555
Number of Luggage Pieces > 1, (Car) 1.4141 0.259 5.461 0.000 0.907 1.922

Test and demonstrate how to estimate different asymmetric, 'logit-type' models

Note that in the models estimated below, we make use of PyLogit's ability to estimate "outside" intercept values. This means that while the index, $V_{ij} = X_{ij} \beta$ (where $j$ is an alternative), will be subjected to some transformation, the intercept (i.e. the Alternative Specifc Constant) will not be subject to this transformation. This is done by removing the intercept from the specification dictionaries and by using the init_intercepts and init_coefs keyword argument in the fit_mle() function of the various choice models. See below for examples.

Multinomial Asymmetric Logit


In [7]:
# Create the various specification and name dictionaries 
# for the asymmetric logit model. It is the same as for the MNL
# except that it excludes the intercept from V_{i, alternative}.
asym_specification = OrderedDict()
asym_names = OrderedDict()

for col in basic_specification:
    if col != "intercept":
        asym_specification[col] = basic_specification[col]
        asym_names[col] = basic_names[col]

# Get the list of intercept names for the asymmetric logit model
# This is used to tell PyLogit that we will be using 'outside'
# intercept parameters.
# See equation 2 of http://arxiv.org/abs/1606.05900 for more
# details
asym_intercept_names = basic_names["intercept"]
# Specify what alternative is not having its intercept estimated
asym_intercept_ref_pos = 2

# Give names to the shape parameters of the asymmetric logit model
# Note that all of the shape parameters are not identifiable so we
# need to restrict one of them. This accounts for why we do not have
# a "shape_car" name.
asym_shape_names = ["shape_train", "shape_swiss_metro"]

# Note the index of the alternative whose shape parameter is constrained
# (i.e. the Car alternative)
asym_ref = 2

swissmetro_asym = pl.create_choice_model(data=long_swiss_metro,
                                         alt_id_col=custom_alt_id,
                                         obs_id_col=obs_id_column,
                                         choice_col=choice_column,
                                         specification=asym_specification,
                                         model_type="Asym",
                                         names=asym_names,
                                         shape_names=asym_shape_names,
                                         intercept_names=asym_intercept_names,
                                         shape_ref_pos=asym_ref,
                                         intercept_ref_pos=asym_intercept_ref_pos)

# Also, note that we use None for initial values and use kwargs to
# specify our initial values for the optimization. This is necessary
# for the use of 'outside' intercept parameters.

# Note that dividing the index coefficients by log(3) accounts for
# the fact that when each shape parameter is 1/3, the value of the
# asymmetric logit model's estimated index coefficients are equal to
# the logit model's estimates, divided by log(3)

swissmetro_asym.fit_mle(None,
                        init_shapes=np.zeros(2),
                        init_intercepts=swissmetro_mnl.params.values[:2],
                        init_coefs=swissmetro_mnl.params.values[2:] / np.log(3),
                        method="bfgs")

# Look at the model results
swissmetro_asym.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -5,159.2583
/Users/timothyb0912/anaconda/lib/python2.7/site-packages/scipy/optimize/_minimize.py:382: RuntimeWarning: Method bfgs does not use Hessian information (hess).
  RuntimeWarning)
Estimation Time: 0.58 seconds.
Final log-likelihood: -4,990.9261
Out[7]:
Multinomial Asymmetric Logit Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Asymmetric Logit Model Df Residuals: 6,752
Method: MLE Df Model: 16
Date: Thu, 15 Sep 2016 Pseudo R-squ.: 0.283
Time: 14:28:25 Pseudo R-bar-squ.: 0.281
converged: False Log-Likelihood: -4,990.926
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
shape_train 2.2006 1.207 1.824 0.068 -0.165 4.566
shape_swiss_metro 2.8647 1.166 2.456 0.014 0.579 5.150
ASC Train -5.0652 1.164 -4.350 0.000 -7.347 -2.783
ASC Swissmetro -3.3396 1.150 -2.903 0.004 -5.594 -1.085
Travel Time, units:hrs (Train and Swissmetro) -0.2787 0.025 -11.040 0.000 -0.328 -0.229
Travel Time, units:hrs (Car) -0.8638 0.081 -10.666 0.000 -1.022 -0.705
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.3497 0.081 -4.297 0.000 -0.509 -0.190
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.1501 0.029 -5.149 0.000 -0.207 -0.093
Travel Cost, units: 0.01 CHF (Car) -0.5074 0.142 -3.562 0.000 -0.787 -0.228
Headway, units:hrs, (Train) -0.3425 0.061 -5.614 0.000 -0.462 -0.223
Headway, units:hrs, (Swissmetro) -0.0759 0.136 -0.559 0.576 -0.342 0.190
Airline Seat Configuration, base=No (Swissmetro) -0.4591 0.197 -2.336 0.020 -0.844 -0.074
Surveyed on a Train, base=No, (Train and Swissmetro) 3.8486 0.210 18.320 0.000 3.437 4.260
First Class == False, (Swissmetro) 0.1400 0.076 1.833 0.067 -0.010 0.290
Number of Luggage Pieces == 1, (Car) 0.6175 0.093 6.617 0.000 0.435 0.800
Number of Luggage Pieces > 1, (Car) 1.5466 0.146 10.609 0.000 1.261 1.832

Multinomial Uneven Logit Model


In [8]:
# Create the various specification and name dictionaries 
# for the uneven logit model. It is the same as for the MNL
# except that it excludes the intercept from V_{i, alternative}.
uneven_specification = OrderedDict()
uneven_names = OrderedDict()

for col in basic_specification:
    if col != "intercept":
        uneven_specification[col] = basic_specification[col]
        uneven_names[col] = basic_names[col]

# Get the list of intercept names for the uneven logit model
# This is used to tell PyLogit that we will be using 'outside'
# intercept parameters.
# See equation 2 of http://arxiv.org/abs/1606.05900 for more
# details
uneven_intercept_names = basic_names["intercept"]
# Specify what alternative is not having its intercept estimated
uneven_intercept_ref_pos = 2

# Specify the names of the uneven logit model's shape parameters
# Note that we include "shape_car" because all of the uneven logit
# model's shape parameters are identifiable.
uneven_shape_names = ["shape_train", "shape_swiss_metro", "shape_car"]

swissmetro_uneven = pl.create_choice_model(data=long_swiss_metro,
                                           alt_id_col=custom_alt_id,
                                           obs_id_col=obs_id_column,
                                           choice_col=choice_column,
                                           specification=uneven_specification,
                                           model_type="Uneven",
                                           names=uneven_names,
                                           shape_names=uneven_shape_names,
                                           intercept_names=uneven_intercept_names,
                                           intercept_ref_pos=uneven_intercept_ref_pos)


# Also, note that we use None for initial values and use kwargs to
# specify our initial values for the optimaztion. This is necessary
# to use 'outside' intercept parameters with the model.
swissmetro_uneven.fit_mle(None,
                          init_shapes=np.zeros(3),
                          init_intercepts=swissmetro_mnl.params.values[:2],
                          init_coefs=swissmetro_mnl.params.values[2:],
                          method="bfgs")

# Look at the model results
swissmetro_uneven.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -5,159.2583
Estimation Time: 0.41 seconds.
Final log-likelihood: -4,954.6676
Out[8]:
Multinomial Uneven Logit Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Uneven Logit Model Df Residuals: 6,751
Method: MLE Df Model: 17
Date: Thu, 15 Sep 2016 Pseudo R-squ.: 0.289
Time: 14:29:17 Pseudo R-bar-squ.: 0.286
converged: True Log-Likelihood: -4,954.668
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
shape_train 0.9611 0.112 8.613 0.000 0.742 1.180
shape_swiss_metro 0.1615 0.184 0.880 0.379 -0.198 0.521
shape_car -0.2853 0.490 -0.582 0.561 -1.246 0.675
ASC Train -1.0788 0.202 -5.351 0.000 -1.474 -0.684
ASC Swissmetro -0.6369 0.201 -3.167 0.002 -1.031 -0.243
Travel Time, units:hrs (Train and Swissmetro) -0.3867 0.046 -8.424 0.000 -0.477 -0.297
Travel Time, units:hrs (Car) -0.8798 0.470 -1.871 0.061 -1.801 0.042
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.2607 0.061 -4.260 0.000 -0.381 -0.141
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.2098 0.048 -4.341 0.000 -0.304 -0.115
Travel Cost, units: 0.01 CHF (Car) -0.5817 0.332 -1.750 0.080 -1.233 0.070
Headway, units:hrs, (Train) -0.2666 0.046 -5.750 0.000 -0.358 -0.176
Headway, units:hrs, (Swissmetro) -0.3769 0.178 -2.120 0.034 -0.725 -0.029
Airline Seat Configuration, base=No (Swissmetro) -0.1950 0.090 -2.171 0.030 -0.371 -0.019
Surveyed on a Train, base=No, (Train and Swissmetro) 2.1639 0.200 10.808 0.000 1.771 2.556
First Class == False, (Swissmetro) 0.1449 0.055 2.616 0.009 0.036 0.253
Number of Luggage Pieces == 1, (Car) 0.6274 0.339 1.849 0.064 -0.038 1.292
Number of Luggage Pieces > 1, (Car) 2.0020 0.831 2.409 0.016 0.373 3.631

Multinomial Scobit Model


In [9]:
# Create the various specification and name dictionaries 
# for the scobit model. It is the same as for the MNL
# except that it excludes the intercept from V_{i, alternative}.
scobit_specification = OrderedDict()
scobit_names = OrderedDict()

for col in basic_specification:
    if col != "intercept":
        scobit_specification[col] = basic_specification[col]
        scobit_names[col] = basic_names[col]

# Get the list of intercept names for the scobit model
# This is used to tell PyLogit that we will be using 'outside'
# intercept parameters.
# See equation 2 of http://arxiv.org/abs/1606.05900 for more
# details
scobit_intercept_names = basic_names["intercept"]

# Create the names of the shape parameters that are needed for the scobit model
scobit_shape_names = ["shape_train", "shape_swiss_metro", "shape_car"]

# Specify which intercept/ASC is not being estimated (namely, the Car intercept)
scobit_intercept_ref = 2

#Estimate the model
swissmetro_scobit = pl.create_choice_model(data=long_swiss_metro,
                                           alt_id_col=custom_alt_id,
                                           obs_id_col=obs_id_column,
                                           choice_col=choice_column,
                                           specification=scobit_specification,
                                           model_type="Scobit",
                                           names=scobit_names,
                                           shape_names=scobit_shape_names,
                                           intercept_ref_pos=scobit_intercept_ref,
                                           intercept_names=scobit_intercept_names)

# Note that we are using 'outside' intercept parameters for this model
swissmetro_scobit.fit_mle(None,
                          init_shapes=np.zeros(3),
                          init_intercepts=swissmetro_mnl.params.values[:2],
                          init_coefs=swissmetro_mnl.params.values[2:],
                          method="BFGS",
                          maxiter=1200)

# Look at the model results
swissmetro_scobit.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -5,159.2583
Estimation Time: 0.60 seconds.
Final log-likelihood: -4,952.9253
Out[9]:
Multinomial Scobit Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Scobit Model Df Residuals: 6,751
Method: MLE Df Model: 17
Date: Thu, 15 Sep 2016 Pseudo R-squ.: 0.289
Time: 14:30:27 Pseudo R-bar-squ.: 0.286
converged: False Log-Likelihood: -4,952.925
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
shape_train 0.8033 0.117 6.874 0.000 0.574 1.032
shape_swiss_metro -0.9112 0.489 -1.862 0.063 -1.871 0.048
shape_car -1.4079 0.347 -4.052 0.000 -2.089 -0.727
ASC Train 1.0793 0.271 3.978 0.000 0.548 1.611
ASC Swissmetro -0.7150 0.569 -1.257 0.209 -1.830 0.400
Travel Time, units:hrs (Train and Swissmetro) -0.5046 0.059 -8.493 0.000 -0.621 -0.388
Travel Time, units:hrs (Car) -1.7342 0.510 -3.399 0.001 -2.734 -0.734
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.3617 0.078 -4.624 0.000 -0.515 -0.208
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.3141 0.069 -4.542 0.000 -0.450 -0.179
Travel Cost, units: 0.01 CHF (Car) -1.0225 0.380 -2.689 0.007 -1.768 -0.277
Headway, units:hrs, (Train) -0.2919 0.051 -5.703 0.000 -0.392 -0.192
Headway, units:hrs, (Swissmetro) -0.3824 0.237 -1.611 0.107 -0.848 0.083
Airline Seat Configuration, base=No (Swissmetro) -0.2021 0.095 -2.124 0.034 -0.389 -0.016
Surveyed on a Train, base=No, (Train and Swissmetro) 2.7869 0.206 13.526 0.000 2.383 3.191
First Class == False, (Swissmetro) 0.1346 0.064 2.117 0.034 0.010 0.259
Number of Luggage Pieces == 1, (Car) 1.1631 0.327 3.555 0.000 0.522 1.804
Number of Luggage Pieces > 1, (Car) 3.0167 0.627 4.808 0.000 1.787 4.246

Multinomial Clog-log Model


In [10]:
# Create the specification and name dictionaries 
# for the clog-log model.It is the same as for the MNL
# except that it excludes the intercept from V_{i, alternative}.
clog_specification = OrderedDict()
clog_names = OrderedDict()

# Copy the specification dictionary from the logit
# model, without the intercept parameters so we
# can place the intercept parameters outside the
# index
for col in basic_specification:
    if col != "intercept":
        clog_specification[col] = basic_specification[col]
        clog_names[col] = basic_names[col]

# Get the list of intercept names for the clog-log model
# This is used to tell PyLogit that we will be using 'outside'
# intercept parameters.
# See equation 2 of http://arxiv.org/abs/1606.05900 for more
# details
clog_intercept_names = basic_names["intercept"]

# Specify which intercept/ASC is not being estimated 
# (i.e, the Car intercept)
clog_intercept_ref = 2

# Estimate the Clog-log model based on the MNL model
# Note "intercept_ref_pos=2" means we are not estimating
# an intercept for the alternative whose alternative id
# falls at position 2 in the sorted array of alternative ids
swissmetro_clog = pl.create_choice_model(data=long_swiss_metro,
                                         alt_id_col=custom_alt_id,
                                         obs_id_col=obs_id_column,
                                         choice_col=choice_column,
                                         specification=clog_specification,
                                         model_type="Cloglog",
                                         intercept_ref_pos=clog_intercept_ref,
                                         names=clog_names,
                                         intercept_names=clog_intercept_names)

# Estimate the clog log model. Note we don't pass one single array of
# initial values but instead pass keyword arguments
swissmetro_clog.fit_mle(None,
                        init_intercepts=swissmetro_mnl.params.values[:2],
                        init_coefs=swissmetro_mnl.params.values[2:],
                        method='bfgs')

# Look at the model results
swissmetro_clog.get_statsmodels_summary()


Log-likelihood at zero: -6,964.6630
Initial Log-likelihood: -6,055.5603
Estimation Time: 0.69 seconds.
Final log-likelihood: -5,191.9526
Out[10]:
Multinomial Clog-log Model Regression Results
Dep. Variable: CHOICE No. Observations: 6,768
Model: Multinomial Clog-log Model Df Residuals: 6,754
Method: MLE Df Model: 14
Date: Thu, 15 Sep 2016 Pseudo R-squ.: 0.255
Time: 14:30:54 Pseudo R-bar-squ.: 0.253
converged: False Log-Likelihood: -5,191.953
LL-Null: -6,964.663
coef std err z P>|z| [95.0% Conf. Int.]
ASC Train -3.0252 0.137 -22.020 0.000 -3.294 -2.756
ASC Swissmetro -0.4904 0.123 -3.983 0.000 -0.732 -0.249
Travel Time, units:hrs (Train and Swissmetro) -0.1304 0.019 -6.866 0.000 -0.168 -0.093
Travel Time, units:hrs (Car) -0.4964 0.042 -11.889 0.000 -0.578 -0.415
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Train) -0.1982 0.041 -4.864 0.000 -0.278 -0.118
Travel Cost * (Annual Pass == 0), units: 0.01 CHF (Swissmetro) -0.2011 0.033 -6.181 0.000 -0.265 -0.137
Travel Cost, units: 0.01 CHF (Car) -0.2982 0.088 -3.407 0.001 -0.470 -0.127
Headway, units:hrs, (Train) 0.0198 0.017 1.140 0.254 -0.014 0.054
Headway, units:hrs, (Swissmetro) -1.0705 0.111 -9.658 0.000 -1.288 -0.853
Airline Seat Configuration, base=No (Swissmetro) -0.2546 0.039 -6.527 0.000 -0.331 -0.178
Surveyed on a Train, base=No, (Train and Swissmetro) 1.4715 0.042 35.285 0.000 1.390 1.553
First Class == False, (Swissmetro) 0.2147 0.026 8.320 0.000 0.164 0.265
Number of Luggage Pieces == 1, (Car) 0.4234 0.055 7.717 0.000 0.316 0.531
Number of Luggage Pieces > 1, (Car) 1.1684 0.149 7.816 0.000 0.875 1.461

Look at the log-likelihood values overall and how they changed for the various models


In [11]:
# Create a list of the estimated model objects
model_objs = [swissmetro_mnl,
              swissmetro_clog,
              swissmetro_asym,
              swissmetro_uneven,
              swissmetro_scobit]

log_likelihood_summary = pd.Series([x.log_likelihood for x in model_objs],
                                   index=[x.model_type for x in model_objs],
                                   name="Log-likelihood Summary")
log_likelihood_summary


Out[11]:
Multinomial Logit Model              -5159.258305
Multinomial Clog-log Model           -5191.952628
Multinomial Asymmetric Logit Model   -4990.926075
Multinomial Uneven Logit Model       -4954.667556
Multinomial Scobit Model             -4952.925303
Name: Log-likelihood Summary, dtype: float64

As shown above, for this dataset, one can improve (in terms of log-likelihood) on the estimates offered by the standard logit model by switching to an asymmetric choice model. However, you may sometimes pay a computatiaonal price in terms of ease of estimation since these choice models have non-concave log-likelihood functions. Additionally, you may pay a price in terms of variance-inflation of one's estimates beacause the "shape parameters" are not locally orthogonal to the index coefficients. See "Orthogonalizing Parametric Link Transformation Families in Binary Regression Analysis" by Claudia Czado and Thomas J. Santner, The Canadian Journal of Statistics, Vol. 20, No. 1 (Mar., 1992), pp. 51-61, http://www.jstor.org/stable/3315574.