In [1]:
import seaborn as sns
import metapack as mp
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display 

%matplotlib inline
sns.set_context('notebook')

In [2]:
pkg = mp.jupyter.open_package()
#pkg = mp.jupyter.open_source_package()
pkg


Out[2]:

LAHSA Homeless Count

lahsa.org-homeless_count-1 Last Update: 2018-10-15T20:50:45


The source Excel file includes notes, but they are embedded as an image, so the following notes are from OCR:

Data Prepared by Los Angeles Homeless Services Authority

Last updated 08/23/2018

Components of the Homeless Count

Street Count (all census tracts): Captures a point in time estimate of the
unsheltered population.

Shelter Count (from Homeless Management Integration System): Captures the
homeless population in emergency shelters, transitional housing, and safe
havens. The shelter count in this dataset excludes CalWORKs hotel/motel
vouchers and domestic violence shelters for confidentiality reasons.

Youth Count (sample census tracts): Collaborative process with youth
stakeholders to better understand and identify homeless youth.

Demographic Survey (sample census tracts): Captures the demographic
characteristics of the unsheltered homeless population.

Notes

LAHSA does not recommend aggregating census tract-level data to calculate
numbers for other geographic levels. Due to rounding, the census tract-level
data may not add up to the total for Los Angeles City Council District,
Supervisorial District, Service Planning Area, or the Los Angeles Continuum of
Care.
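
The rounding effect described in this note can be illustrated with a small sketch. It uses the totpeople values that appear for the first five tracts in this package's tracts resource. How LAHSA actually rounds its published totals is not stated in the notes, so the rounding step here is an assumption made only to show why tract-level sums can drift from a separately computed total.

```python
# totpeople values for the first five tracts in the tracts resource
tract_estimates = [15.884, 3.618, 22.285, 14.406, 7.324]

# Assumption: a published total is rounded once, after summing
sum_then_round = round(sum(tract_estimates))

# Assumption: tract figures are rounded individually, then added up
round_then_sum = sum(round(x) for x in tract_estimates)

print(sum_then_round, round_then_sum)
```

The two results differ by one person over only five tracts, and the gap can grow as more tracts are combined.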

The Los Angeles Continuum of Care does not include the Cities of Long Beach,
Glendale, and Pasadena and will not equal the countywide Homeless Count Total.

Street Count Data include persons found outside, including persons found living
in cars, vans, campers/RVs, tents, and makeshift shelters. The conversion
factors used to estimate the number of persons found living outside are the
following: For families: Makeshift Shelter = 2.42, Car = 2.96, Van = 2.43,
Camper/RV = 3.35, Tent = 2.75; for individuals: Makeshift Shelter = 1.67, Car =
1.54, Van = 1.62, Camper/RV = 1.76, Tent = 1.52.
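
As a minimal sketch of how conversion factors like these might be applied, the snippet below multiplies a count of each dwelling type by its factor and sums the results. The factors are the ones quoted in the notes; the function name `estimate_persons`, the dictionary keys, and the example tally are hypothetical, and the exact procedure LAHSA uses is not described in the notes.

```python
# Conversion factors quoted in the LAHSA notes (persons per counted dwelling)
FAMILY_FACTORS = {
    "makeshift_shelter": 2.42, "car": 2.96, "van": 2.43,
    "camper_rv": 3.35, "tent": 2.75,
}
INDIVIDUAL_FACTORS = {
    "makeshift_shelter": 1.67, "car": 1.54, "van": 1.62,
    "camper_rv": 1.76, "tent": 1.52,
}


def estimate_persons(dwelling_counts, factors):
    """Multiply each counted dwelling type by its factor and sum the results."""
    return sum(factors[kind] * count for kind, count in dwelling_counts.items())


# Hypothetical tally for a single tract: 2 cars, 3 tents, 1 van
example_counts = {"car": 2, "tent": 3, "van": 1}

individuals = estimate_persons(example_counts, INDIVIDUAL_FACTORS)
families = estimate_persons(example_counts, FAMILY_FACTORS)
print(round(individuals, 2), round(families, 2))
```

Because the factors are fractional, estimates built this way are generally not whole numbers.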

Please visit https://www.lahsa.org/homeless-count/home to view and download
data.

Last updated 08/23/2018

Eagle Rock/Arroyo Seco Boundaries were updated

Contacts

Resources


In [3]:
df = pkg.resource('tracts').dataframe()
df.head()


Out[3]:
tract year city lacity community_name detailed_name spa sd cd ca_ssd ... totshyouthsingyouth totshyouthfamhh totshyouthfammem totshyouthunaccyouth totunsheltpeople totespeople totthpeople totshpeople totsheltpeople totpeople
0 14000US06037101110 2018 Los Angeles 1 Sunland-Tujunga Sunland-Tujunga NC 2 5 7 25 ... 0 0 0 0 15.884 0 0 0 0 15.884
1 14000US06037101122 2018 Los Angeles 1 Sunland-Tujunga Sunland-Tujunga NC 2 5 7 25 ... 0 0 0 0 3.618 0 0 0 0 3.618
2 14000US06037101210 2018 Los Angeles 1 Sunland-Tujunga Sunland-Tujunga NC 2 5 7 25 ... 0 0 0 0 22.285 0 0 0 0 22.285
3 14000US06037101220 2018 Los Angeles 1 Sunland-Tujunga Sunland-Tujunga NC 2 5 7 25 ... 0 0 0 0 14.406 0 0 0 0 14.406
4 14000US06037101300 2018 Los Angeles 1 Sunland-Tujunga Sunland-Tujunga NC 2 5 7 25 ... 0 0 0 0 7.324 0 0 0 0 7.324

5 rows × 72 columns


In [4]:
pdb_pkg = mp.open_package('http://library.metatab.org/sandiegodata.org-planning-tracts-6.zip')
pdb_pkg


Out[4]:

County Planning Database

sandiegodata.org-planning-tracts-6 Last Update: 2018-10-15T20:23:35

A collection of data for demographics and housing from the Census planning database. Files are broken into counties, for San Diego and Los Angeles.

The Planning Database is a Census product that combines a range of data from the American Community Survey and 2010 Decennial Census into a single file, with one row per census tract. This version of the file includes only tracts in San Diego County. The file is linked to ACS format geoids to identify tracts, so it can be easily linked to other tracts data. This data package includes links to two such files, one for San Diego communities, and one for tract geographies.

The planning database has about 450 columns. For full definitions of the columns, refer to the upstream documentation for the source file. In general, the column names in the documentation must be lowercased for use with the file in this data package.
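
A rough sketch of the two points above, lowercasing documented column names and linking on ACS-format geoids, might look like the following. Both frames are small synthetic stand-ins rather than the real resources: the tract identifiers and totpeople values are copied from the LAHSA tracts table in this notebook, while the `geoid` column name and the planning database values are hypothetical.

```python
import pandas as pd

# Synthetic stand-in for the LAHSA tracts resource
lahsa = pd.DataFrame({
    "tract": ["14000US06037101110", "14000US06037101122"],
    "totpeople": [15.884, 3.618],
})

# Synthetic stand-in for planning database rows; the column name uses the
# mixed case found in the upstream documentation, the values are made up
pdb = pd.DataFrame({
    "geoid": ["14000US06037101110", "14000US06037101122"],
    "pct_MLT_U10p_ACS_12_16": [12.5, 30.1],
})

# Lowercase the documented column names so they match the packaged file
pdb.columns = [c.lower() for c in pdb.columns]

# Link the two tables on the ACS-format tract geoid
merged = lahsa.merge(pdb, left_on="tract", right_on="geoid", how="left")
print(merged)
```

A left join keeps every LAHSA tract, so tracts without a planning database match show up with missing values instead of being dropped.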

In Python, use metapack to open the data package.

import metapack as mp
pkg = mp.open_package('http://library.metatab.org/sandiegodata.org-planning-tracts-1.zip')

To display a simple map, link in the tract boundaries from the communities dataset and use the Geopandas .plot() function. The column argument names the column to use for coloring regions. Note that the column name is copied from the documentation with mixed case, then lowercased to index the dataset.

tracts = pkg.reference('communities').geoframe().set_index('geoid').fillna('')
df = tracts.join(cpdb)

fig, ax = plt.subplots(1, figsize=(15,10))
df.plot(ax=ax, column='pct_MLT_U10p_ACS_12_16'.lower())

After linking in the communities, you can also use the community id columns to group by city or community:

seniors = df.groupby('city_name').pop_65plus_acs_12_16.sum()

  1. First Release
  2. Clean more currency columns

Documentation Links

Contacts

Resources

References

