ONS Register of Geographic Codes

The ONS Register of Geographic Codes is a register maintained by the Office of National Statistics (ONS) that describes a range of coding schemes for various UK geographies.

The register is provided as an Excel spreadsheet cointing multiple worksheets, with each worksheet contianing the codes for a particular geography.

This notebook parses the codes into a set of unnormalised relational database tables in a SQLite3 database.

See also: GSS blog - Why Do We Need Another Register?, which describes the development of a new set of official registers for describing UK geographies.

 Accessing the Data

The ONS Register of Geographic Codes is updated several times a year. Finf the most current version here.


In [172]:
import sqlite3
con = sqlite3.connect("onsgeocodes.sqlite")

In [173]:
import pandas as pd

In [174]:
#Create a function to grab a zip file from an online location and then grab a specified file from inside it
import requests, zipfile

#The following fudge copes with Python 2 and Python 3
try:
    from StringIO import StringIO as zreader
except ImportError:
    from io import BytesIO as zreader

def ziparchivereader(f):
    return zipfile.ZipFile(f, 'r')

def zipgrabber(url):
    ''' Grab a zip file from a URL '''
    r = requests.get(url)
    z = ziparchivereader(zreader(r.content))
    #z = zipfile.ZipFile(zreader(r.content))
    return z

def zipgrabberfile(url, f):
    ''' Grab a file by name from a zip file grabbed from a URL '''
    return zipgrabber(url).open(f)

def zipfilebyname(z,f):
    ''' Grab a file by name from an already grabbed zip file '''
    return z.open(f)

def zipfilelist(z):
    ''' Return the names of files contained in a grabbed zip file '''
    return z.namelist()

In [175]:
url="https://ons.maps.arcgis.com/sharing/rest/content/items/ef6d5454a6f44ea3a7387b3401e2de5e/data"

!mkdir -p downloads
f='downloads/tmp_register_ons_geo_codes.zip'
!rm {f}
!wget -O {f} {url}


--2017-09-06 14:25:28--  https://ons.maps.arcgis.com/sharing/rest/content/items/ef6d5454a6f44ea3a7387b3401e2de5e/data
Resolving ons.maps.arcgis.com... 23.21.231.12, 23.21.174.157, 23.21.131.9
Connecting to ons.maps.arcgis.com|23.21.231.12|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://ago-item-storage.s3-external-1.amazonaws.com/ef6d5454a6f44ea3a7387b3401e2de5e/Register_of_Geographic_Codes_%28June_2017%29_UK.zip?X-Amz-Security-Token=FQoDYXdzEF0aDIOaUBDdDs4LweZLeSKcAzNBVPyHYctLdlG2KnTw2MZG%2Fb7JgMXCsoEwHyFllg84nrHRaio6kSbcRXdvU6roG8FvGbnPFEheMv63li4PDSoeCN94YaiSLwyya3oNWvlodqwM5HnRzJeJ3BU99yGTRXxrUNaZyvX1%2FBijwW4H0CeO2qLIhZHicwTDfZbVwy1IJY5OUgnIhBYygpHTDF673EuW%2FxSNdseh09FjIdJOUjnVfHQSq%2B6E%2BH9nnUOXKzNi05Ac%2Fxl%2F6%2FOHdLem4sxVEmc00Q45IWhjVIzBsqB6lljPrM6lrGebGue8dtb7JOJRw3uxGxBXHsspYIVBTzXkp3dSkur68V6mzMoIn3GeDwJxdcrA5GZxXFrAMTV1bH2l61E5HQZU%2BwBbqTSHHZ56756ftysLF4G8LRuYMoBUOrJm2y%2BbVYSJt2l2v22Ve5wO4PRNn4R6nnHkOw86lLAO5NUHmhf1tqmCKVGYHEOSEmvLDSVFWkCejOjBawcBJhV0ynt2mooGwdHHqSoqjrF0H2%2BvXfJnf%2B3%2BOcolPkl1Np3NcgbCYW0HiJ51HyUo2Mm%2FzQU%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20170906T132529Z&X-Amz-SignedHeaders=host&X-Amz-Expires=300&X-Amz-Credential=ASIAIZGIXNYZ6NQEDLSA%2F20170906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=179bdab49e2568f73ad75cf1c76645987cfbd020f8413603c66347feefc2bdac [following]
--2017-09-06 14:25:29--  https://ago-item-storage.s3-external-1.amazonaws.com/ef6d5454a6f44ea3a7387b3401e2de5e/Register_of_Geographic_Codes_%28June_2017%29_UK.zip?X-Amz-Security-Token=FQoDYXdzEF0aDIOaUBDdDs4LweZLeSKcAzNBVPyHYctLdlG2KnTw2MZG%2Fb7JgMXCsoEwHyFllg84nrHRaio6kSbcRXdvU6roG8FvGbnPFEheMv63li4PDSoeCN94YaiSLwyya3oNWvlodqwM5HnRzJeJ3BU99yGTRXxrUNaZyvX1%2FBijwW4H0CeO2qLIhZHicwTDfZbVwy1IJY5OUgnIhBYygpHTDF673EuW%2FxSNdseh09FjIdJOUjnVfHQSq%2B6E%2BH9nnUOXKzNi05Ac%2Fxl%2F6%2FOHdLem4sxVEmc00Q45IWhjVIzBsqB6lljPrM6lrGebGue8dtb7JOJRw3uxGxBXHsspYIVBTzXkp3dSkur68V6mzMoIn3GeDwJxdcrA5GZxXFrAMTV1bH2l61E5HQZU%2BwBbqTSHHZ56756ftysLF4G8LRuYMoBUOrJm2y%2BbVYSJt2l2v22Ve5wO4PRNn4R6nnHkOw86lLAO5NUHmhf1tqmCKVGYHEOSEmvLDSVFWkCejOjBawcBJhV0ynt2mooGwdHHqSoqjrF0H2%2BvXfJnf%2B3%2BOcolPkl1Np3NcgbCYW0HiJ51HyUo2Mm%2FzQU%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20170906T132529Z&X-Amz-SignedHeaders=host&X-Amz-Expires=300&X-Amz-Credential=ASIAIZGIXNYZ6NQEDLSA%2F20170906%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Signature=179bdab49e2568f73ad75cf1c76645987cfbd020f8413603c66347feefc2bdac
Resolving ago-item-storage.s3-external-1.amazonaws.com... 54.231.49.25
Connecting to ago-item-storage.s3-external-1.amazonaws.com|54.231.49.25|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16220221 (15M) [application/zip]
Saving to: 'downloads/tmp_register_ons_geo_codes.zip'

downloads/tmp_regis 100%[=====================>]  15.47M  5.71MB/s   in 2.7s   

2017-09-06 14:25:33 (5.71 MB/s) - 'downloads/tmp_register_ons_geo_codes.zip' saved [16220221/16220221]


In [176]:
zf=ziparchivereader(f)
zipfilelist(zf)


Out[176]:
['RGC_JUN_2017_UK.csv', 'RGC_JUN_2017_UK.xlsx']

The .csv file contains metadata describing the geographies listed in separate sheets in the .xlsx file.


In [177]:
metadata=pd.read_csv(zf.open('RGC_JUN_2017_UK.csv'))
metadata.head()


Out[177]:
Entity code Entity name Entity abbreviation Entity theme Entity coverage Related entity codes Status Number of live instances Number of archived instances Number of cross-border instances Date of last instance change Current code (first in range) Current code (last in range) Reserved code (for CHD use) Entity owner Date entity introduced on RGC Entity start date
0 E92 Country CTRY Administrative England W92, S92, N92, L93, M83 Current 1 0 0 n/a E92000001 E92000001 E92999999 ONS 01/01/2009 01/01/2009
1 E00 Output Areas OA Statistical Building Block England W00, S00, N00 Current 171,372 7,582 0 31/12/2011 E00000001 E00176774 E00999999 ONS 01/01/2009 01/01/2009
2 E01 Super Output Areas, Lower Layer LSOA Statistical Building Block England W01, S01 Current 32,844 1,357 0 31/12/2011 E01000001 E01033768 E01999999 ONS 01/01/2009 01/01/2003
3 E02 Super Output Areas, Middle Layer MSOA Statistical Building Block England W02, S02 Current 6,791 141 0 31/12/2011 E02000001 E02006934 E02999999 ONS 01/01/2009 01/01/2003
4 E04 Civil Parishes PAR Administrative England n/a Current 10,459 2,329 0 01/04/2017 E04000001 E04012737 E04999999 DCLG 01/01/2009 01/01/2009

Extracting Geography Codes


In [178]:
xl=pd.read_excel(zf.open('RGC_JUN_2017_UK.xlsx'), sheetname=None)
xl.keys()


Out[178]:
odict_keys(['RGC', 'Metadata_for_geography_listings', 'E00_OA', 'E01_LSOA', 'E02_MSOA', 'E04_PAR', 'E05_WD', 'E06_UA', 'E07_NMD', 'E08_MD', 'E09_LONB', 'E10_CTY', 'E11_MCTY', 'E12_RGN', 'E13_IOL', 'E14_WPC', 'E15_EER', 'E20_CANREG', 'E22_CSP', 'E23_PFA', 'E25_PUA', 'E26_NPARK', 'E28_REGD', 'E29_REGSD', 'E30_TTWA', 'E31_FRA', 'E32_LAC', 'E33_WZ', 'E34_BUA', 'E35_BUASD', 'E36_CMWD', 'E37_LEP', 'E38_CCG', 'E39_NHSRLO', 'E40_NHSCR', 'E41_CMLAD', 'E42_CMCTY', 'E43_NCP', 'E45_PHEC', 'E46_PHEREG', 'E47_CAUTH', 'E48_LRF', 'E49_EZ', 'E50_WA', 'E51_DC', 'E52_LEPOP', 'E53_LEPNOP', 'E54_STP', 'E55_SCN', 'E56_CAL', 'E57_NCV', 'E92_CTRY', 'J01_TCITY', 'K01_TTWA', 'K02_UK', 'K03_GB', 'K04_EW', 'K05_BUA', 'K06_BUASD', 'L00_SHA', 'L93_BCD', 'M00_SHA', 'M01_PHD', 'M83_BCD', 'N00_SA', 'N06_WPC', 'N07_EER', 'N08_WD', 'N09_LGD', 'N10_DEA', 'N11_SETT2015', 'N12_TTWA', 'N19_WZ', 'N23_PFA', 'N24_PFD', 'N31_NIFRS', 'N32_NIFRSA', 'N33_NIFRSD', 'N92_CTRY', 'For_Scotland', 'W00_OA', 'W01_LSOA', 'W02_MSOA', 'W03_USOA', 'W04_COM', 'W05_WD', 'W06_UA', 'W07_WPC', 'W08_EER', 'W09_NAWC', 'W10_NAWER', 'W11_LHB', 'W12_CANREG', 'W13_CANNET', 'W14_CSP', 'W15_PFA', 'W16_DCELLS', 'W18_NPARK', 'W19_NAER', 'W20_REGD', 'W21_REGSD', 'W22_TTWA', 'W23_SPA', 'W24_SPSA', 'W25_FRA', 'W26_SRA', 'W27_SRASub', 'W28_TCA', 'W29_AgricReg', 'W30_AgricSmall', 'W31_NonNPARK', 'W32_NonSRA', 'W33_CFA', 'W34_NonCFA', 'W35_WZ', 'W36_PSCReg', 'W37_BUA', 'E38_BUASD', 'W39_CMWD', 'W40_CMLAD', 'W41_LRF', 'W92_CTRY'])

The register appears to describe only a subset of the codesets listed in the metadata file - codes relating specifically to Scotland are not provided.


In [179]:
metadata["codeAbbrv"] = metadata["Entity code"].map(str) + '_' + metadata["Entity abbreviation"].map(str)
codes = metadata["codeAbbrv"].tolist()
print(set(codes) - set(xl.keys()))


{'S07_RTP', 'W38_BUASD', 'S00_OA', 'S13_WD', 'S39_SFRLSO', 'S14_WPC', 'S16_SPC', 'S11_SDPA', 'S19_LOC', 'S23_PFA', 'S31_LAU2', 'W16_DCELL', 'K04_E&W', 'E24_LLSC', 'S21_NPARK', 'S25_CJA', 'S34_WZ', 'S36_ISLG', 'E27_NDC', 'S27_ISDT', 'S92_CTRY', 'S12_CA', 'S10_URC', 'S24_HIE', 'E16_PCT', 'S09_ER', 'S01_DZ', 'E18_SHA', 'E19_PSHA', 'S40_SFRSDA', 'S06_ROAL', 'S33_BRMA', 'S04_ROAS', 'E40_NHSER', 'S37_HIA', 'S02_IZ', 'S32_SPD', 'S20_SETT', 'S28_CDC', 'S08_HB', 'S03_CHP', 'S05_ROAC', 'S26_CHCP', 'S38_SFRS', 'S22_TTWA', 'S35_CVP', 'E21_CANNET', 'S30_LAU1', 'S17_SPR', 'S29_CLC', 'E17_CT', 'S15_EER'}

Metadata for the spreadsheet is contained in the RGC sheet.


In [180]:
xl['RGC']["codeAbbrv"] = xl['RGC']["Entity code"].map(str) + '_' + xl['RGC']["Entity abbreviation"].map(str)

xl['RGC'].to_sql(con=con, name='metadata', index=False, if_exists='replace')


/usr/local/lib/python3.6/site-packages/pandas/core/generic.py:1362: UserWarning: The spaces in these column names will not be changed. In pandas versions < 0.14, spaces were converted to underscores.
  chunksize=chunksize, dtype=dtype)

In [182]:
cols=['GEOGCD','GEOGNM','GEOGNMW','OPER_DATE','TERM_DATE','STATUS']

bigcodes=pd.DataFrame(columns=['sheet']+cols)
bigcodes.to_sql(con=con, name='codelist', index=False, if_exists='replace')

sheets= list(xl.keys())
sheets.remove('For_Scotland')
for sheet in sheets[2:]:
    xl[sheet].to_sql(con=con, name=sheet, index=False, if_exists='replace')
    xl[sheet]['sheet']=sheet
    #Reorder the columns
    xl[sheet][['sheet']+cols].to_sql(con=con, name='codelist', index=False, if_exists='append')

In [185]:
q='SELECT * FROM E02_MSOA LIMIT 5'
pd.read_sql_query(q, con)


Out[185]:
GEOGCD GEOGNM GEOGNMW SI_ID SI_TITLE OPER_DATE TERM_DATE PARENTCD ENTITYCD OWNER STATUS AREAEHECT AREACHECT AREAIHECT AREALHECT
0 E02000001 City of London 001 None None None 2004-08-01 00:00:00 None None E02 ONS live 315.14 289.78 0.0 289.78
1 E02000002 Barking and Dagenham 001 None None None 2004-08-01 00:00:00 None None E02 ONS live 216.15 216.15 0.0 216.15
2 E02000003 Barking and Dagenham 002 None None None 2004-08-01 00:00:00 None None E02 ONS live 214.15 214.15 0.0 214.15
3 E02000004 Barking and Dagenham 003 None None None 2004-08-01 00:00:00 None None E02 ONS live 249.28 249.28 0.0 249.28
4 E02000005 Barking and Dagenham 004 None None None 2004-08-01 00:00:00 None None E02 ONS live 118.81 118.81 0.0 118.81

In [183]:
q='SELECT * FROM codelist WHERE "GEOGCD"="{code}"'.format(code='W40000004')
pd.read_sql_query(q, con)


Out[183]:
sheet GEOGCD GEOGNM GEOGNMW OPER_DATE TERM_DATE STATUS
0 W40_CMLAD W40000004 Denbighshire Sir Ddinbych 2011-12-31 00:00:00 None live

In [184]:
q='''
SELECT *  FROM codelist JOIN metadata 
WHERE "GEOGNM"="{name}" AND codeAbbrv=sheet AND codelist.STATUS="live"
'''.format(name='Isle of Wight')
pd.read_sql_query(q, con)


Out[184]:
sheet GEOGCD GEOGNM GEOGNMW OPER_DATE TERM_DATE STATUS Entity code Entity name Entity abbreviation ... Number of archived instances Number of cross-border instances Date of last instance change Current code (first in range) Current code (last in range) Reserved code (for CHD use) Entity owner Date entity introduced on RGC Entity start date codeAbbrv
0 E06_UA E06000046 Isle of Wight None 2009-01-01 00:00:00 None live E06 Unitary Authorities UA ... 1 0 2013-04-01 00:00:00 E06000001 E06000057 E06999999 DCLG 2009-01-01 00:00:00 2009-01-01 00:00:00 E06_UA
1 E14_WPC E14000762 Isle of Wight None 2010-05-06 00:00:00 None live E14 Westminster Parliamentary Constituencies WPC ... 529 0 2010-05-06 00:00:00 E14000530 E14001062 E14999999 LGBC 2009-01-01 00:00:00 2009-01-01 00:00:00 E14_WPC
2 E22_CSP E22000116 Isle of Wight None 2009-01-01 00:00:00 None live E22 Community Safety Partnerships CSP ... 81 0 2016-04-01 00:00:00 E22000001 E22000374 E22999999 Home Office 2009-01-01 00:00:00 2009-01-01 00:00:00 E22_CSP
3 E28_REGD E28000146 Isle of Wight None 2009-01-01 00:00:00 None live E28 Registration Districts REGD ... 67 0 2013-01-17 00:00:00 E28000001 E28000219 E28999999 ONS 2009-01-01 00:00:00 2009-01-01 00:00:00 E28_REGD
4 E30_TTWA E30000070 Isle of Wight None 2009-01-01 00:00:00 None live E30 Travel to Work Areas TTWA ... 206 6 2015-07-31 00:00:00 E30000004 E30000294 E30999999 ONS 2009-01-01 00:00:00 2009-01-01 00:00:00 E30_TTWA
5 E31_FRA E31000021 Isle of Wight None 2009-01-01 00:00:00 None live E31 Fire and Rescue Authorities FRA ... 10 0 2016-08-01 00:00:00 E31000001 E31000047 E31999999 DCLG 2009-01-01 00:00:00 2009-01-01 00:00:00 E31_FRA
6 E41_CMLAD E41000046 Isle of Wight None 2011-12-31 00:00:00 None live E41 Census Merged Local Authority Districts CMLAD ... 0 0 n/a E41000001 E41000324 E41999999 ONS 2013-04-01 00:00:00 2011-12-31 00:00:00 E41_CMLAD

7 rows × 25 columns


In [ ]:
q='''
SELECT DISTINCT "Entity name", sheet  FROM codelist JOIN metadata 
WHERE "GEOGNM" LIKE "%{name}%" AND codeAbbrv=sheet AND codelist.STATUS="live"
'''.format(name='Isle of Wight')
pd.read_sql_query(q, con)

In [ ]:
#Lookup a code
q='''
SELECT * FROM codelist JOIN metadata 
WHERE "GEOGCD" = "{name}" AND codeAbbrv=sheet 
'''.format(name='E05008479')
pd.read_sql_query(q, con)

In [ ]: