If any part of this notebook is used in your research, please cite with the reference found in README.md.
spaghetti
Author: James D. Gaboardi jgaboardi@gmail.com
This notebook demonstrates the following limitations and caveats:
In [1]:
%load_ext watermark
%watermark
In [2]:
import geopandas
import libpysal
from libpysal.cg import Point, Chain
import matplotlib
import matplotlib.pyplot as plt
import spaghetti
%matplotlib inline
%watermark -w
%watermark -iv
In [3]:
try:
from IPython.display import set_matplotlib_formats
set_matplotlib_formats("retina")
except ImportError:
pass
In [4]:
def arc_labels(a, b, s, offset=[-0.025, 0.025]):
"""Label each network arc."""
def _lab_loc(_x):
"""Helper for labeling network arcs."""
xy = _x.geometry.interpolate(0.5, normalized=True).coords[0]
xy = tuple([_xy+o for (_xy,o) in zip(xy,offset)])
return xy
kws = {"ha":"right", "va":"bottom","weight":"bold","color":"k","size":s}
a.apply(lambda x: b.annotate(s=x.comp_label, xy=_lab_loc(x), **kws), axis=1)
def vert_labels(v, b, s, offset=[0.025, 0.025], col="comp_label", c="r"):
"""Label each network vertex."""
def _lab_loc(_x):
"""Internal helper for labeling vertices."""
xy = _x.geometry.coords[0]
xy = tuple([_xy+o for (_xy,o) in zip(xy,offset)])
return xy
kws = {"ha":"left", "va":"bottom","weight":"bold","color":c,"size":s}
v.apply(lambda x: b.annotate(s=x[col], xy=_lab_loc(x), **kws), axis=1)
In [5]:
lines = [
Chain([Point((2,7)), Point((2,3))]),
Chain([Point((2,3)), Point((0,3))]),
Chain([Point((0,3)), Point((0,5))]),
Chain([Point((0,5)), Point((5,5))]),
Chain([Point((5,5)), Point((5,7))]),
Chain([Point((5,7)), Point((3,7))]),
Chain([Point((3,7)), Point((3,3))])
]
lines
Out[5]:
In [6]:
ntw = spaghetti.Network(in_data=lines, extractgraph=False)
In [7]:
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
vertices_df
Out[7]:
In [8]:
arcs_df
Out[8]:
In [9]:
base = arcs_df.plot(lw=3, color="k", alpha=.35, figsize=(9, 9))
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# vertex labels
vert_labels(vertices_df, base, 14)
vert_labels(vertices_df, base, 14, offset=[-0.15, 0.025], col="id", c="k")
The plot above shows network arcs in gray with vertices in red. Also, vertices are labled in black and component membership is labeled in red. Since there are no actual intersections between segments (0,1),(3,4)
or (3,4),(6,7)
, there are no vertices. Therefore, a shortest path from vertex 0
to 7
must follow the route {1,2,3,4,5,6}
for a total of 21 units of distance.
In [10]:
ntw.full_distance_matrix(n_processes=1, gen_tree=True)
In [11]:
ntw.network_trees[0][7]
Out[11]:
In [12]:
ntw.distance_matrix[0][7]
Out[12]:
In [13]:
U = [Chain([Point((2,2)), Point((2,1)), Point((4,1)), Point((4,2))])]
I = [Chain([Point((3,0)), Point((3,1))])]
arcs = U + I
arcs
Out[13]:
In [14]:
ntw = spaghetti.Network(in_data=arcs)
In [15]:
ntw.network_n_components
Out[15]:
Component 0
includes 4 vertices (the "goal") and component 1
includes 2 (the "post").
In [16]:
ntw.network_component_vertices
Out[16]:
In [17]:
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
vertices_df
Out[17]:
In [18]:
arcs_df
Out[18]:
In [19]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), **base_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# vertex labels
vert_labels(vertices_df, base, 14)
As stated previously, flow from one network segment to another is only possible when segments share a vertex. In this case, no vertex is actually shared, although superficially there does appear to be. The plot above visualizes this concept by showing the two (dis)connected components of the "goalpost" network. Network arcs are colored by component membership and vertices are labeled by component membership. This is also demonstrated below when overlaying a plot of libpysal.weights.W. This is accomplished below with arcs also shown with component membership (black nodes connected with black edges and labeled in black).
In [20]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# arc labels
arc_labels(arcs_df, base, 12)
# vertex labels
vert_labels(vertices_df, base, 14)
However, if the line segments were digitized "properly" a single connected component can be obtained (we emphasize "properly" because segments of a spatial network do not always intersect to form a single component in reality, for example highway over/underpases). In order to obtained a single connected component in this case we can simply add another point—(3,1)
—to the initial collection of lines prior to network instantiation.
In [21]:
U = [Chain([Point((2,2)), Point((2,1)), Point((3,1)), Point((4,1)), Point((4,2))])]
I = [Chain([Point((3,0)), Point((3,1))])]
arcs = U + I
ntw = spaghetti.Network(in_data=arcs)
print("Network components:\t", ntw.network_n_components)
print("Network vertices:\t", ntw.network_component_vertices)
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
In [22]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# arc labels
arc_labels(arcs_df, base, 12)
# vertex labels
vert_labels(vertices_df, base, 14)
Now both the network vertices and spatially-weighted arcs are all members of a single connected component.
In [23]:
libpysal.examples.explain("snow_maps")
In [24]:
soho = geopandas.read_file(libpysal.examples.get_path("Soho_Network.shp"))
soho = soho.to_crs(epsg=32630)
soho
Out[24]:
In [25]:
ntw = spaghetti.Network(in_data=soho)
In [26]:
ntw.network_n_components
Out[26]:
In [27]:
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(12, 12), **base_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
Out[27]:
In this dataset that are 45 components of the networks, indicating some serious issues with intended connectivity. When overlayed with arc spatial weights it becomes clear that many of the arcs that appear to (and should) intersect, actually do not.
In [28]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(12, 12), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2);
Fuzzy contiguity can be acheived through libpysal.weights.fuzzy_contiguity(), which provides us with a single connected component, but does not capture the true nature of the network stucture, as demonstrated below.
In [29]:
fuzzy = libpysal.weights.fuzzy_contiguity(arcs_df)
fuzzy.remap_ids(arcs_df.id)
fuzzy.n_components
Out[29]:
In [30]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(12, 12), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
fuzzy.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2);
The issue with the Soho_Network.shp dataset is not precision, it is that not all true intersections are demarcated with vertices, as in the "goalpost" example above. This is clearly seen in the upper two segments of the network exterior where each segment is a single object, not broken at joining segments. In order to properly use the Soho_Network.shp
dataset as a network, geometric preprocessing would need to be performed in order to extract all intersections.
Here the "I" of the goalpost has been digitized slightly short of the intersection with the "U" component.
In [31]:
U = [Chain([Point((2,2)), Point((2,1)), Point((3,1)), Point((4,1)), Point((4,2))])]
I = [Chain([Point((3,0)), Point((3,0.99999))])]
arcs = U + I
ntw = spaghetti.Network(in_data=arcs)
print("Network components:\t", ntw.network_n_components)
print("Network vertices:\t", ntw.network_component_vertices)
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
In [32]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
plt.xlim(2.9999, 3.0001)
plt.ylim(0.9999, 1.0001);
A small gap is shown in the plot above resulting in two network components.
In [33]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# arc labels
arc_labels(arcs_df, base, 12)
# vertex labels
vert_labels(vertices_df, base, 14)
In order to accomodate situations like these where network elements are digitized in generally the correct locations but the precision is lacking, the vertex_atol
parameter can be invoked when instantiating the network. This parameter allows for the rounding of decimal places (not significant digits — see vertex_sig), but should be used with caution.
In [34]:
ntw = spaghetti.Network(in_data=arcs, vertex_atol=3)
print("Network components:\t", ntw.network_n_components)
print("Network vertices:\t", ntw.network_component_vertices)
vertices_df, arcs_df = spaghetti.element_as_gdf(ntw, vertices=True, arcs=True)
In [35]:
base_kws = {"alpha":.5, "lw":5, "cmap":"Paired", "column":"comp_label"}
base = arcs_df.plot(figsize=(9, 9), zorder=0, **base_kws)
# plot keywords
node_kws, edge_kws = {"s":100, "zorder":2}, {"zorder":1}
w_kws = {"edge_kws":edge_kws, "node_kws":node_kws}
ntw.w_network.plot(arcs_df, indexed_on="id", ax=base, **w_kws)
vertices_df.plot(ax=base, fc="r", ec="k", markersize=50, zorder=2)
# arc labels
arc_labels(arcs_df, base, 12)
# vertex labels
vert_labels(vertices_df, base, 14)
In [36]:
U = [Chain([Point((2,2)), Point((2,1)), Point((4,1)), Point((4,2))])]
I = [Chain([Point((3,0)), Point((3,1))])]
arcs = U + I
ntw = spaghetti.Network(in_data=arcs)
ntw.full_distance_matrix(n_processes=1, gen_tree=True)
In [37]:
ntw.network_component_vertices
Out[37]:
In [38]:
ntw.distance_matrix
Out[38]:
From vertex 0
only vertices {1,2,3}
can be reached. This is shown in the distance matrix by inf
values in the 4th and 5th columns, and by identical origin-destination shortest paths in the network trees.
In [39]:
ntw.distance_matrix[0]
Out[39]:
In [40]:
ntw.network_trees[0]
Out[40]:
Like the example in cell 21
above, by inserting a vertex complete network connectivity can be achieved.
In [41]:
U = [Chain([Point((2,2)), Point((2,1)), Point((3,1)), Point((4,1)), Point((4,2))])]
I = [Chain([Point((3,0)), Point((3,1))])]
arcs = U + I
ntw = spaghetti.Network(in_data=arcs)
ntw.full_distance_matrix(n_processes=1, gen_tree=True)
In [42]:
ntw.network_component_vertices
Out[42]:
In [43]:
ntw.distance_matrix
Out[43]:
In [44]:
ntw.distance_matrix[0]
Out[44]:
In [45]:
ntw.network_trees[0]
Out[45]: