302: Itinerary Choice using Simple Nested Logit



In [1]:

    
import larch

This example is an itinerary choice model built using the example itinerary choice dataset included with Larch. See example 300 for details.



In [2]:

    
from larch.examples import example
d = example(300, 'd')









    



converting data_co to <class 'numpy.float64'>
converting data_ce to <class 'numpy.float64'>
rescaled array of weights by a factor of 2239.980952380952

We will be building a nested logit model, but to do so we first need to rationalize the alternative numbers. As given, the raw itinerary choice data has many alternatives, but they are not ordered or numbered in a regular way: each elemental alternative has an arbitrary code number, and the code numbers for one case are not comparable to those of another case.

We need to renumber the alternatives so that the relevant features of each alternative can be extracted programmatically from its code number when we build the nested logit model. In this example we want to test a model with nests based on level of service, here the number of connections (nb_cnxs). To renumber, we first define the relevant categories and values, and establish a numbering system using a special object:



In [3]:

    
d1 = d.new_systematic_alternatives(
    groupby='nb_cnxs',
    name='alternative_code',
    padding_levels=4,
    groupby_prefixes=['Cnx'],
    overwrite=False,
    complete_features_list={'nb_cnxs':[0,1,2]},
)









    



converting data_ce to <class 'numpy.float64'>
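
To illustrate the idea behind a systematic alternative code, here is a minimal sketch. It is not Larch's actual encoding, which may differ. The helper names encode_alt and decode_alt are hypothetical, and the four digits reserved for the within-group sequence number are only a stand-in for padding_levels=4. The point is that a feature such as the number of connections can be recovered from the code by arithmetic alone.

```python
PADDING_LEVELS = 4  # assumed: digits reserved for the within-group sequence number


def encode_alt(nb_cnxs, seq, padding=PADDING_LEVELS):
    """Pack a group value and a sequence number into one integer (hypothetical scheme)."""
    if not 0 < seq < 10 ** padding:
        raise ValueError("sequence number does not fit in the reserved digits")
    # Offset the group by 1 so that group 0 still produces a leading digit.
    return (nb_cnxs + 1) * 10 ** padding + seq


def decode_alt(code, padding=PADDING_LEVELS):
    """Recover (nb_cnxs, seq) from a code produced by encode_alt."""
    group, seq = divmod(code, 10 ** padding)
    return group - 1, seq


code = encode_alt(nb_cnxs=1, seq=7)
print(code, decode_alt(code))
```

With a scheme of this kind, every alternative with the same leading digits belongs to the same group, which is what makes automatic nest construction possible.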

If we compare the new data with the old data, we'll see that we have created a few more alternatives: n_alts rises from 127 to 134.



In [4]:

    
d.info()









    



larch.DataFrames:
  n_cases: 105
  n_alts: 127
  data_ce:
    - nb_cnxs
    - elapsed_time
    - fare_hy
    - fare_ly
    - equipment
    - carrier
    - timeperiod
  data_co:
    - traveler
    - origin
    - destination
  data_av: <populated>
  data_ch: choice
  data_wt: <populated>



In [5]:

    
d1.info()









    



larch.DataFrames:
  n_cases: 105
  n_alts: 134
  data_ce:
    - id_alt
    - nb_cnxs
    - elapsed_time
    - fare_hy
    - fare_ly
    - equipment
    - carrier
    - timeperiod
  data_co:
    - traveler
    - origin
    - destination
  data_av: <populated>
  data_ch: <populated>
  data_wt: <populated>

Now let's make our model. The utility function we will use is the same as the one we used for the MNL version of the model.



In [6]:

    
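from larch.roles import PX

m = larch.Model(dataservice=d1)

# Variables and expressions that enter the utility function,
# each with its own coefficient.
v = [
    "timeperiod==2",
    "timeperiod==3",
    "timeperiod==4",
    "timeperiod==5",
    "timeperiod==6",
    "timeperiod==7",
    "timeperiod==8",
    "timeperiod==9",
    "carrier==2",
    "carrier==3",
    "carrier==4",
    "carrier==5",
    "equipment==2",
    "fare_hy",
    "fare_ly",
    "elapsed_time",
    "nb_cnxs",
]
# Linear-in-parameters utility: PX(i) pairs each data term with a
# parameter of the same name.
m.utility_ca = sum(PX(i) for i in v)

# Name of the choice indicator variable.
m.choice_ca_var = 'choice'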

If we ended our model specification here, we would have a plain MNL model. To change to a nested logit model, all we need to do is add the nests. We can do this easily with the special magic_nesting method, which uses the structure of the data that we defined above.



In [7]:

    
m.magic_nesting()
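
To make concrete what adding nests changes, the following is a minimal sketch of two-level nested logit probabilities in plain Python. It is not Larch's implementation. The function name, the alternative and nest labels, and the utility values are all hypothetical, and it assumes a single logsum parameter mu shared by every nest, with 0 < mu <= 1 and the root normalized to 1.

```python
import math


def nested_logit_probs(utilities, nests, mu):
    """Two-level nested logit choice probabilities (illustrative sketch).

    utilities: dict mapping alternative -> systematic utility
    nests:     dict mapping nest name -> list of alternatives (one nest per alternative)
    mu:        logsum parameter shared by all nests, 0 < mu <= 1
    """
    # Sum of scaled exponentiated utilities inside each nest.
    within = {
        name: sum(math.exp(utilities[a] / mu) for a in alts)
        for name, alts in nests.items()
    }
    # Nest-level utility is mu times the log of that sum (the logsum).
    logsums = {name: mu * math.log(total) for name, total in within.items()}
    denom = sum(math.exp(ls) for ls in logsums.values())

    probs = {}
    for name, alts in nests.items():
        p_nest = math.exp(logsums[name]) / denom
        for a in alts:
            # P(alt) = P(nest) * P(alt | nest)
            probs[a] = p_nest * math.exp(utilities[a] / mu) / within[name]
    return probs


# Hypothetical utilities for three itineraries, nested by number of connections.
utilities = {"nonstop_1": 1.0, "onestop_1": 0.5, "onestop_2": -0.2}
nests = {"Cnx0": ["nonstop_1"], "Cnx1": ["onestop_1", "onestop_2"]}

print(nested_logit_probs(utilities, nests, mu=1.0))  # collapses to plain MNL
print(nested_logit_probs(utilities, nests, mu=0.7))  # correlated within nests
```

When mu equals 1 the formula reduces to the MNL probabilities, which is why 1.0 serves as both the initial and the null value of the nesting parameter.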



In [8]:

    
m.load_data()









    



req_data does not request weight_co but it is set and being provided



In [9]:

    
m.maximize_loglike()









    




Iteration 009 [Converged] 






    




LL = -347.19303042325504






    







  
    
      
                   value  initvalue  nullvalue   minimum   maximum  holdfast  note       best
MU_nb_cnxs      0.691112        1.0        1.0  0.001000  1.000000         0         0.691112
carrier==2      0.079526        0.0        0.0      -inf       inf         0         0.079526
carrier==3      0.440481        0.0        0.0      -inf       inf         0         0.440481
carrier==4      0.396793        0.0        0.0      -inf       inf         0         0.396793
carrier==5     -0.439080        0.0        0.0      -inf       inf         0        -0.439080
elapsed_time   -0.004233        0.0        0.0      -inf       inf         0        -0.004233
equipment==2    0.326877        0.0        0.0      -inf       inf         0         0.326877
fare_hy        -0.000847        0.0        0.0      -inf       inf         0        -0.000847
fare_ly        -0.000856        0.0        0.0      -inf       inf         0        -0.000856
nb_cnxs        -3.155549        0.0        0.0      -inf       inf         0        -3.155549
timeperiod==2   0.065527        0.0        0.0      -inf       inf         0         0.065527
timeperiod==3   0.088094        0.0        0.0      -inf       inf         0         0.088094
timeperiod==4   0.042914        0.0        0.0      -inf       inf         0         0.042914
timeperiod==5   0.096519        0.0        0.0      -inf       inf         0         0.096519
timeperiod==6   0.164687        0.0        0.0      -inf       inf         0         0.164687
timeperiod==7   0.243887        0.0        0.0      -inf       inf         0         0.243887
timeperiod==8   0.245135        0.0        0.0      -inf       inf         0         0.245135
timeperiod==9  -0.005913        0.0        0.0      -inf       inf         0        -0.005913
    
  








    Out[9]:





┣          loglike: -347.19303042325504
┣                x: MU_nb_cnxs       0.691112
┃                   carrier==2       0.079526
┃                   carrier==3       0.440481
┃                   carrier==4       0.396793
┃                   carrier==5      -0.439080
┃                   elapsed_time    -0.004233
┃                   equipment==2     0.326877
┃                   fare_hy         -0.000847
┃                   fare_ly         -0.000856
┃                   nb_cnxs         -3.155549
┃                   timeperiod==2    0.065527
┃                   timeperiod==3    0.088094
┃                   timeperiod==4    0.042914
┃                   timeperiod==5    0.096519
┃                   timeperiod==6    0.164687
┃                   timeperiod==7    0.243887
┃                   timeperiod==8    0.245135
┃                   timeperiod==9   -0.005913
┃                   dtype: float64
┣        tolerance: 3.8832811170766046e-06
┣            steps: array([1., 1., 1., 1., 1., 1., 1., 1., 1.])
┣          message: 'Optimization terminated successfully.'
┣     elapsed_time: datetime.timedelta(microseconds=61784)
┣           method: 'bhhh'
┣          n_cases: 105
┣ iteration_number: 9
┣          logloss: 3.306600289745286
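
As a quick arithmetic check on the output above, the reported logloss matches the negative log likelihood divided by the number of cases. The sketch below reproduces that figure from the printed values. It also computes the ratio of the elapsed_time coefficient to the fare_hy coefficient, which can be read as a time versus fare trade-off on the assumption that both enter utility linearly. The units depend on how the dataset measures time and fare, which this example does not state.

```python
# Values copied from the estimation output above.
loglike = -347.19303042325504
n_cases = 105

# Average negative log likelihood per case.
logloss = -loglike / n_cases
print(logloss)

beta_elapsed_time = -0.004233
beta_fare_hy = -0.000847

# Fare units that offset one unit of elapsed time, holding utility constant.
tradeoff = beta_elapsed_time / beta_fare_hy
print(round(tradeoff, 2))
```

The ratio comes out close to 5, meaning one additional unit of elapsed time carries roughly the same disutility as five additional units of fare_hy.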


