Basic Tutorial for `pysal.spaghetti`



In [ ]:

    
import os
last_modified = None
if os.name == "posix":
    last_modified = !stat -f\
                    "# This notebook was last updated: %Sm"\
                     Network_Usage.ipynb
elif os.name == "nt":
    last_modified = !for %a in (Network_Usage.ipynb)\
                    do echo # This notebook was last updated: %~ta
    
if last_modified:
    get_ipython().set_next_input(last_modified[-1])



In [ ]:

    
# This notebook was last updated: May 13 20:21:51 2019



In [ ]:

    
# pysal submodule imports
from pysal.lib import examples
from pysal.explore import spaghetti as spgh
from pysal.explore import esda

import numpy as np
import matplotlib.pyplot as plt

import time

%matplotlib inline

__author__ = "James Gaboardi <jgaboardi@gmail.com>"

Instantiate a network



In [ ]:

    
ntw = spgh.Network(in_data=examples.get_path('streets.shp'))

Snap point patterns to the network



In [ ]:

    
# Crimes
ntw.snapobservations(examples.get_path('crimes.shp'),
                     'crimes',
                     attribute=True)
# Schools
ntw.snapobservations(examples.get_path('schools.shp'),
                     'schools',
                     attribute=False)

A network is composed of a single topological representation of roads and $n$ point patterns which are snapped to the network.



In [ ]:

    
ntw.pointpatterns

Attributes for every point pattern

dist_snapped dict keyed by pointid with the value as snapped distance from observation to network arc
dist_to_vertex dict keyed by pointid with the value being a dict in the form
```
 {node: distance to vertex, node: distance to vertex}
```
npoints point observations in set
obs_to_arc dict keyed by arc with the value being a dict in the form
```
 {pointID:(x-coord, y-coord), pointID:(x-coord, y-coord), ... }
```
obs_to_vertex list of incident network vertices to snapped observation points
points geojson like representation of the point pattern. Includes properties if read with attributes=True
snapped_coordinates dict keyed by pointid with the value being (x-coord, y-coord)

Counts per link (arc or edge) are important, but should not be precomputed since we have different representations of the network (spatial and graph currently). (Relatively) Uniform segmentation still needs to be done.



In [ ]:

    
counts = ntw.count_per_link(ntw.pointpatterns['crimes'].obs_to_arc,
                            graph=False)



In [ ]:

    
sum(list(counts.values())) / float(len(counts.keys()))

Network segmentation



In [ ]:

    
n200 = ntw.split_arcs(200.0)



In [ ]:

    
counts = n200.count_per_link(n200.pointpatterns['crimes'].obs_to_arc,
                             graph=False)
sum(counts.values()) / float(len(counts.keys()))

Create `geopandas.GeoDataFrame` objects of the vertices and arcs



In [ ]:

    
# 'full' unsegmented network
vertices_df, arcs_df = spgh.element_as_gdf(ntw,
                                           vertices=ntw.vertex_coords,
                                           arcs=ntw.arcs)

# network segmented at 200-meter increments
vertices200_df, arcs200_df = spgh.element_as_gdf(n200,
                                                 vertices=n200.vertex_coords,
                                                 arcs=n200.arcs)

Visualization of the shapefile derived, unsegmented network with vertices in a larger, blue, semi-opaque form and the distance segmented network with small, red, fully opaque vertices.



In [ ]:

    
base = arcs_df.plot(color='k', alpha=.25, figsize=(12,12))
vertices_df.plot(ax=base, color='b', markersize=300, alpha=.25)
arcs200_df.plot(ax=base, color='k', alpha=.25)
vertices200_df.plot(ax=base, color='r', markersize=25, alpha=1.)

Moran's I using the digitized network



In [ ]:

    
# Binary Adjacency
w = ntw.contiguityweights(graph=False)

# Build the y vector
arcs = w.neighbors.keys()
y = np.zeros(len(arcs))

for i, a in enumerate(arcs):
    if a in counts.keys():
        y[i] = counts[a]

# Moran's I
res = esda.moran.Moran(y,
                       w,
                       permutations=99)
print(dir(res))

Moran's I using the graph representation to generate the W

Note that we have to regenerate the counts per arc, since the graph will have less edges.



In [ ]:

    
counts = ntw.count_per_link(ntw.pointpatterns['crimes'].obs_to_arc,
                            graph=True)

# Binary Adjacency
w = ntw.contiguityweights(graph=True)

# Build the y vector
edges = w.neighbors.keys()
y = np.zeros(len(edges))

for i, e in enumerate(edges):
    if e in counts.keys():
        y[i] = counts[e]

# Moran's I
res = esda.moran.Moran(y,
                       w,
                       permutations=99)

print(dir(res))

Moran's I using the segmented network and intensities instead of counts



In [ ]:

    
# Binary Adjacency
w = n200.contiguityweights(graph=False)

# Compute the counts
counts = n200.count_per_link(n200.pointpatterns['crimes'].obs_to_arc,
                             graph=False)

# Build the y vector and convert from raw counts to intensities
arcs = w.neighbors.keys()
y = np.zeros(len(arcs))

for i, a in enumerate(edges):
    if a in counts.keys():
        length = n200.arc_lengths[a]
        y[i] = counts[a] / length

# Moran's I
res = esda.moran.Moran(y,
                       w,
                       permutations=99)

print(dir(res))

Timings for distance based methods, e.g. G-function



In [ ]:

    
t1 = time.time()
n0 = ntw.allneighbordistances(ntw.pointpatterns['crimes'])
print(time.time()-t1)



In [ ]:

    
t1 = time.time()
n1 = n200.allneighbordistances(n200.pointpatterns['crimes'])
print(time.time()-t1)

Note that the first time these methods are called, the underlying vertex-to-vertex shortest path distance matrix has to be calculated. Subsequent calls will not require this, and will be much faster:



In [ ]:

    
t1 = time.time()
n0 = ntw.allneighbordistances(ntw.pointpatterns['crimes'])
print(time.time()-t1)



In [ ]:

    
t1 = time.time()
n1 = n200.allneighbordistances(n200.pointpatterns['crimes'])
print(time.time()-t1)

Simulate a point pattern on the network

Need to supply a count of the number of points and a distirbution (default is uniform). Generally, this will not be called by the user, since the simulation will be used for Monte Carlo permutation.



In [ ]:

    
npts = ntw.pointpatterns['crimes'].npoints
sim = ntw.simulate_observations(npts)
sim

F-function



In [ ]:

    
fres = ntw.NetworkF(ntw.pointpatterns['crimes'],
                    permutations=99)



In [ ]:

    
plt.figure(figsize=(8,8))
plt.plot(fres.xaxis, fres.observed, 'b-', linewidth=1.5, label='Observed')
plt.plot(fres.xaxis, fres.upperenvelope, 'r--', label='Upper')
plt.plot(fres.xaxis, fres.lowerenvelope, 'k--', label='Lower')
plt.legend(loc='best', fontsize='x-large')
plt.title('Network F Function', fontsize='xx-large')
plt.show()

Create a nearest neighbor matrix using the crimes point pattern

[note from jlaura] Right now, both the G and K functions generate a full distance matrix. This is because, I know that the full generation is correct and I believe that the truncated generated, e.g. nearest neighbor, has a bug.

G-function



In [ ]:

    
gres = ntw.NetworkG(ntw.pointpatterns['crimes'],
                    permutations=99)



In [ ]:

    
plt.figure(figsize=(8,8))
plt.plot(gres.xaxis, gres.observed, 'b-', linewidth=1.5, label='Observed')
plt.plot(gres.xaxis, gres.upperenvelope, 'r--', label='Upper')
plt.plot(gres.xaxis, gres.lowerenvelope, 'k--', label='Lower')
plt.legend(loc='best', fontsize='x-large')
plt.title('Network G Function', fontsize='xx-large')
plt.show()

K-function



In [ ]:

    
kres = ntw.NetworkK(ntw.pointpatterns['crimes'],
                    permutations=99)



In [ ]:

    
plt.figure(figsize=(8,8))
plt.plot(kres.xaxis, kres.observed, 'b-', linewidth=1.5, label='Observed')
plt.plot(kres.xaxis, kres.upperenvelope, 'r--', label='Upper')
plt.plot(kres.xaxis, kres.lowerenvelope, 'k--', label='Lower')
plt.legend(loc='best', fontsize='x-large')
plt.title('Network K Function', fontsize='xx-large')
plt.show()

Basic Tutorial for pysal.spaghetti

Instantiate a network

Snap point patterns to the network

A network is composed of a single topological representation of roads and $n$ point patterns which are snapped to the network.

Attributes for every point pattern

Counts per link (arc or edge) are important, but should not be precomputed since we have different representations of the network (spatial and graph currently). (Relatively) Uniform segmentation still needs to be done.

Network segmentation

Create geopandas.GeoDataFrame objects of the vertices and arcs

Visualization of the shapefile derived, unsegmented network with vertices in a larger, blue, semi-opaque form and the distance segmented network with small, red, fully opaque vertices.

Moran's I using the digitized network

Moran's I using the graph representation to generate the W

Moran's I using the segmented network and intensities instead of counts

Timings for distance based methods, e.g. G-function

Simulate a point pattern on the network

F-function

Create a nearest neighbor matrix using the crimes point pattern

G-function

K-function

Basic Tutorial for `pysal.spaghetti`

Create `geopandas.GeoDataFrame` objects of the vertices and arcs