If any part of this notebook is used in your research, please cite with the reference found in README.md.

Network-constrained spatial autocorrelation

Performing and visualizing exploratory spatial data analysis

Author: James D. Gaboardi jgaboardi@gmail.com

This notebook is an advanced walk-through for:

Demonstrating spatial autocorrelation with pysal/esda
Calculating Moran's I on a segmented network
Visualizing spatial autocorrelation with pysal/splot



In [1]:

    
%load_ext watermark
%watermark









    



2020-05-02T21:43:38-04:00

CPython 3.7.3
IPython 7.10.2

compiler   : Clang 9.0.0 (tags/RELEASE_900/final)
system     : Darwin
release    : 19.4.0
machine    : x86_64
processor  : i386
CPU cores  : 4
interpreter: 64bit



In [2]:

    
import esda
import libpysal
import matplotlib
import matplotlib_scalebar
from matplotlib_scalebar.scalebar import ScaleBar
import numpy
import spaghetti
import splot

%matplotlib inline
%watermark -w
%watermark -iv









    



watermark 2.0.2
numpy               1.18.1
esda                2.2.1
libpysal            4.2.2
matplotlib          3.1.2
matplotlib_scalebar 0.6.1
spaghetti           1.5.0.rc0
splot               1.1.2



In [3]:

    
try:
    from IPython.display import set_matplotlib_formats
    set_matplotlib_formats("retina")
except ImportError:
    pass

Instantiating a `spaghetti.Network` object and a point pattern

Instantiate the network from a `.shp` file



In [4]:

    
ntw = spaghetti.Network(in_data=libpysal.examples.get_path("streets.shp"))
ntw









    Out[4]:





<spaghetti.network.Network at 0x1201fe160>

Extract network arcs as a `geopandas.GeoDataFrame`



In [5]:

    
_, arc_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
arc_df.head()









    Out[5]:







  
    
      
         id                                           geometry  comp_label
0    (0, 1)  LINESTRING (728368.048 877125.895, 728368.139 ...           0
1    (0, 2)  LINESTRING (728368.048 877125.895, 728367.458 ...           0
2  (1, 110)  LINESTRING (728368.139 877023.272, 728612.255 ...           0
3  (1, 127)  LINESTRING (728368.139 877023.272, 727708.140 ...           0
4  (1, 213)  LINESTRING (728368.139 877023.272, 728368.729 ...           0

Associate the network with a point pattern



In [6]:

    
pp_name = "crimes"
pp_shp = libpysal.examples.get_path("%s.shp" % pp_name)
ntw.snapobservations(pp_shp, pp_name, attribute=True)
ntw.pointpatterns









    Out[6]:





{'crimes': <spaghetti.network.PointPattern at 0x120c1af28>}

Extract the crimes point pattern as a `geopandas.GeoDataFrame`



In [7]:

    
pp_df = spaghetti.element_as_gdf(ntw, pp_name=pp_name)
pp_df.head()









    Out[7]:







  
    
      
   id                       geometry  comp_label
0   0  POINT (727913.000 875721.000)           0
1   1  POINT (724812.000 875763.000)           0
2   2  POINT (727391.000 875853.000)           0
3   3  POINT (728017.000 875858.000)           0
4   4  POINT (727525.000 875860.000)           0

1. ESDA — Exploratory Spatial Data Analysis with pysal/esda

The Moran's I test statistic allows for inference about how clustered (or dispersed) a dataset is while considering both attribute values and spatial relationships. A value closer to +1 indicates strong clustering, while a value closer to -1 indicates strong dispersion. Under complete spatial randomness the expected value is -1/(n-1), which approaches 0 as the number of observations n grows. See the esda documentation for in-depth descriptions and tutorials.
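As a rough illustration of the statistic described above, the following sketch computes a global Moran's I by hand with `numpy`. The four-unit chain, the binary contiguity matrix `w_chain`, and the function name `morans_i` are hypothetical choices for this example and are not part of the notebook's data. `esda` may apply a different weights transformation, so the values here are not directly comparable with its output.

```python
import numpy as np


def morans_i(y, w):
    """Global Moran's I for values y and a dense weights matrix w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    z = y - y.mean()  # deviations from the mean
    s0 = w.sum()      # sum of all weights
    return (y.size / s0) * (z @ w @ z) / (z @ z)


# Hypothetical toy data: four units on a line, binary contiguity weights.
w_chain = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

i_trend = morans_i([1, 2, 3, 4], w_chain)        # similar values are adjacent
i_alternating = morans_i([1, 0, 1, 0], w_chain)  # dissimilar values are adjacent
expected_i = -1.0 / (4 - 1)                      # expectation under randomness
```

In this toy case the smoothly increasing values yield a positive statistic and the alternating values yield a negative one, which mirrors the clustering versus dispersion reading given above.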



In [8]:

    
def calc_moran(net, pp_name, w):
    """Calculate a Moran's I statistic based on network arcs."""
    # Count the observations snapped to each arc
    pointpat = net.pointpatterns[pp_name]
    counts = net.count_per_link(pointpat.obs_to_arc, graph=False)
    # Build the y vector in the order of the weights' keys;
    # arcs without observations receive a count of 0
    y = [counts.get(arc, 0.0) for arc in w.neighbors]
    # Moran's I with a reference distribution of 99 permutations
    moran = esda.moran.Moran(y, w, permutations=99)
    return moran, y
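The alignment step inside `calc_moran` can be pictured without a network at all. The sketch below uses hypothetical arc identifiers and counts (`arc_ids` and `counts` are invented for illustration) to show how arcs with no snapped observations end up with a zero in the attribute vector.

```python
# Hypothetical arc identifiers and counts; not taken from the streets data.
arc_ids = [(0, 1), (0, 2), (1, 3), (2, 3)]
counts = {(0, 1): 3, (2, 3): 1}  # only arcs with at least one observation

# Align the counts with the arc order, filling unobserved arcs with zero.
y = [float(counts.get(arc, 0)) for arc in arc_ids]
```

Keeping `y` in the same order as the keys of the weights object matters because each value is matched to its neighbors by position.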

Moran's I using the network representation's W



In [9]:

    
moran_ntwwn, yaxis_ntwwn = calc_moran(ntw, pp_name, ntw.w_network)
moran_ntwwn.I









    Out[9]:





0.005192687496078421

Moran's I using the graph representation's W



In [10]:

    
moran_ntwwg, yaxis_ntwwg = calc_moran(ntw, pp_name, ntw.w_graph)
moran_ntwwg.I









    Out[10]:





0.05223210335368553

Interpretation:

Although both the network and graph representations (moran_ntwwn and moran_ntwwg, respectively) display minimal positive spatial autocorrelation, a slightly higher value is observed in the graph representation. This is likely due to more direct connectivity in the graph representation, a direct result of eliminating degree-2 vertices. The Moran's I values for both the network and graph representations suggest that network arcs/graph edges attributed with associated crime counts are nearly randomly distributed.
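To make the idea of degree-2 vertices concrete, the sketch below counts vertex degrees from a hypothetical arc list and picks out the vertices that only connect two arcs. It illustrates the concept only and is not how `spaghetti` derives its graph representation.

```python
from collections import Counter

# Hypothetical arcs as (vertex, vertex) pairs; vertex 1 merely joins two arcs.
arcs = [(0, 1), (1, 2), (2, 3), (2, 4)]

# Degree of a vertex = number of arcs incident to it.
degree = Counter(vertex for arc in arcs for vertex in arc)

# Vertices of degree 2 are neither intersections nor dead ends.
degree_2 = sorted(v for v, d in degree.items() if d == 2)
```

Here vertex 1 is the only degree-2 vertex, so a graph representation would treat arcs (0, 1) and (1, 2) as the single edge (0, 2), which changes which elements count as neighbors.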

2. Moran's I on a segmented network

Moran's I on a network split into 200-meter segments



In [11]:

    
n200 = ntw.split_arcs(200.0)
n200









    Out[11]:





<spaghetti.network.Network at 0x120cbb9e8>



In [12]:

    
moran_n200, yaxis_n200 = calc_moran(n200, pp_name, n200.w_network)
moran_n200.I









    Out[12]:





-0.01764461487556588

Moran's I on a network split into 50-meter segments



In [13]:

    
n50 = ntw.split_arcs(50.0)
n50









    Out[13]:





<spaghetti.network.Network at 0x120d51cf8>



In [14]:

    
moran_n50, yaxis_n50 = calc_moran(n50, pp_name, n50.w_network)
moran_n50.I









    Out[14]:





-0.012505858739644651

Interpretation:

In contrast to the results above, both the 200-meter and 50-meter segmented networks (moran_n200 and moran_n50, respectively) display minimal negative spatial autocorrelation, with a slightly lower value observed in the 200-meter representation. However, as above, the Moran's I values for both of these representations suggest that network arcs attributed with associated crime counts are nearly randomly distributed.
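One way to picture fixed-distance segmentation is to cut a single arc every `distance` units and keep whatever is left over as a shorter final piece. The helper `split_lengths` below is hypothetical and is not the implementation behind `split_arcs`; in particular, the handling of the remainder is an assumption made for this sketch.

```python
def split_lengths(arc_length, distance):
    """Lengths of the pieces when an arc is cut every `distance` units."""
    n_full = int(arc_length // distance)        # number of full-length pieces
    remainder = arc_length - n_full * distance  # leftover at the end of the arc
    pieces = [distance] * n_full
    if remainder > 1e-9:                        # keep a shorter final piece
        pieces.append(remainder)
    return pieces


# A hypothetical 450-unit arc cut at the two distances used in this section.
pieces_200 = split_lengths(450.0, 200.0)
pieces_50 = split_lengths(450.0, 50.0)
```

Shorter segments produce many more observations, most of which carry a count of zero, which is one reason the statistic can shift between segmentations of the same network.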

3. Visualizing ESDA with `splot`

Here we demonstrate spatial lag, the weighted average of an attribute over each observation's neighbors, which serves as a measure of attribute similarity in space. See the splot documentation for in-depth descriptions and tutorials.
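A spatial lag can be sketched with plain `numpy` as a row-standardized weights matrix multiplied by the attribute vector. The four-unit chain and its values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

# Hypothetical binary contiguity for four units on a line.
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-standardize so that each row of weights sums to 1.
w_row = w / w.sum(axis=1, keepdims=True)

y = np.array([1.0, 2.0, 3.0, 4.0])

# Spatial lag: the average of each unit's neighboring values.
lag = w_row @ y
```

A Moran scatterplot places each observation's own value on the horizontal axis and its lag on the vertical axis, so points near the upward diagonal resemble their neighbors.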



In [15]:

    
from splot.esda import moran_scatterplot, lisa_cluster, plot_moran

Moran scatterplot

Plotted with equal aspect



In [16]:

    
moran_scatterplot(moran_ntwwn, aspect_equal=True);

Plotted without equal aspect



In [17]:

    
moran_scatterplot(moran_ntwwn, aspect_equal=False);

This scatterplot demonstrates the attribute values and associated attribute similarities in space (spatial lag) for the network representation's W (moran_ntwwn).

Reference distribution and Moran scatterplot



In [18]:

    
plot_moran(moran_ntwwn, zstandard=True, figsize=(10,4));

This figure incorporates the reference distribution of Moran's I values into the above scatterplot of the network representation's W (moran_ntwwn).
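The reference distribution can be approximated by shuffling the attribute values across locations and recomputing the statistic each time. The sketch below is a minimal `numpy` version under stated assumptions: a hypothetical six-unit chain, binary weights, and a pseudo p-value that folds to the smaller tail. The names `morans_i` and `pseudo_p_value` are invented for this example and may differ in detail from what `esda` does.

```python
import numpy as np


def morans_i(y, w):
    """Global Moran's I for values y and a dense weights matrix w."""
    z = y - y.mean()
    return (y.size / w.sum()) * (z @ w @ z) / (z @ z)


def pseudo_p_value(y, w, permutations=99, seed=0):
    """Observed I, simulated I values, and a folded pseudo p-value."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = morans_i(y, w)
    # Recompute I after randomly reassigning values to locations.
    simulated = np.array(
        [morans_i(rng.permutation(y), w) for _ in range(permutations)]
    )
    extreme = int(np.sum(simulated >= observed))
    extreme = min(extreme, permutations - extreme)  # use the smaller tail
    return observed, simulated, (extreme + 1) / (permutations + 1)


# Hypothetical chain of six units with a steadily increasing attribute.
w_chain = np.eye(6, k=1) + np.eye(6, k=-1)
observed, simulated, p_sim = pseudo_p_value(np.arange(1.0, 7.0), w_chain)
```

With 99 permutations the smallest attainable pseudo p-value is 0.01, which is why the number of permutations bounds the resolution of the test.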

Local Moran's I

The demonstrations above considered the dataset as a whole, providing a global measure. The following demonstrates local spatial autocorrelation, which provides a measure for each observation. This is best interpreted visually, here with another scatterplot colored to indicate relationship type.
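For intuition about the per-observation measure, the sketch below computes a local Moran's I and a quadrant label for each unit of a hypothetical four-unit chain. It follows the common form of the statistic with row-standardized weights and the second moment `m2` taken over n. Scaling conventions vary between implementations, so this should be read as an assumption rather than a description of `esda.moran.Moran_Local`.

```python
import numpy as np

# Hypothetical binary contiguity for four units on a line, row-standardized.
w = np.eye(4, k=1) + np.eye(4, k=-1)
w_row = w / w.sum(axis=1, keepdims=True)

y = np.array([1.0, 2.0, 3.0, 4.0])
z = y - y.mean()       # deviations from the mean
lag_z = w_row @ z      # spatial lag of the deviations
m2 = (z @ z) / z.size  # second moment (assumed scaling)

# Local statistic: own deviation times the neighbors' average deviation.
local_i = z * lag_z / m2

# Quadrant labels: own value (High/Low) followed by neighbors (High/Low).
labels = np.where(
    z >= 0,
    np.where(lag_z >= 0, "HH", "HL"),
    np.where(lag_z >= 0, "LH", "LL"),
)
```

Under these assumptions the mean of the local values equals the global Moran's I for the same row-standardized weights, which is what makes the local statistics a decomposition of the global one.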

Plotted with equal aspect



In [19]:

    
p = 0.05
moran_loc_ntwwn = esda.moran.Moran_Local(yaxis_ntwwn, ntw.w_network)
fig, ax = moran_scatterplot(moran_loc_ntwwn, p=p, aspect_equal=True)
ax.set(xlabel="Crimes", ylabel="Spatial Lag of Crimes");

Plotted without equal aspect



In [20]:

    
fig, ax = moran_scatterplot(moran_loc_ntwwn, aspect_equal=False, p=p)
ax.set(xlabel="Crimes", ylabel="Spatial Lag of Crimes");

Interpretation:

The majority of observations (network arcs) display no significant local spatial autocorrelation (shown in gray).

Plotting Local Indicators of Spatial Autocorrelation (LISA)



In [21]:

    
f, ax = lisa_cluster(moran_loc_ntwwn, arc_df, p=p, figsize=(12,12), lw=5, zorder=0)
pp_df.plot(ax=ax, zorder=1, alpha=.25, color="g", markersize=30)
suptitle = "LISA for Crime-weighted Network Arcs"
matplotlib.pyplot.suptitle(suptitle, fontsize=20, x=.51, y=.93)
subtitle = "Crimes ($n=%s$) are represented as semi-opaque green circles"
matplotlib.pyplot.title(subtitle % pp_df.shape[0], fontsize=15);
