Finding stories in data with Python and Jupyter notebooks

Journocoders London, April 13, 2017

David Blood/@davidcblood/[first] dot [last] at ft.com

Introduction

The Jupyter notebook provides an intuitive, flexible and shareable way to work with code. It's a potentially invaluable tool for journalists who need to analyse data quickly and reproducibly, particularly as part of a graphics-oriented workflow.

This aim of this tutorial is to help you become familiar with the notebook and its role in a Python data analysis toolkit. We'll start with a demographic dataset and explore and analyse it visually in the notebook to see what it can tell us about people who voted ‘leave’ in the UK's EU referendum. To finish, we'll output a production-quality graphic using Bokeh.

You'll need access to an empty Python 3 Jupyter notebook, ideally running on your local machine, although a cloud-based Jupyter environment is fine too.

You're ready to start the tutorial when you're looking at this screen:

1. Bring your data into the notebook

In Python-world, people often use the pandas module for working with data. You don't have to—there are other modules that do similar things—but it's the most well-known and comprehensive (probably).

Let's import pandas into our project and assign it to the variable pd, because that's easier to type than pandas. While we're at it, let's import all the other modules we'll need for this tutorial and also let Matplotlib know that we want it to plot charts here in the notebook rather than in a separate window. Enter the following code into the first cell in your notebook and hit shift-return to run the code block—don't copy-and-paste it. The best way to develop an understanding of the code is to type it out yourself:



In [1]:

    
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

%matplotlib inline

There shouldn't be any output from that cell, but if you get any error messages, it's most likely because you don't have one or more of these modules installed on your system. Running

pip3 install pandas matplotlib numpy seaborn bokeh

from the command line should take care of that. If not, holler and I'll try to help you.

As well as running your code, hitting shift-return in that first cell should have automatically created an empty cell below it. In that cell, we're going to use the read_csv method provided by pandas to, um, read our CSV.

When pandas reads data from a CSV file, it automagically puts it into something called a dataframe. It's not important at this point to understand what a dataframe is or how it differs from other Python data structures. All you need to know for now is that it's an object containing structured data that's stored in memory for the duration of your notebook session.

We'll also assign our new dataframe to another variable—df—so we can do things with it down the line.

We do all of this like so (remember to hit shift-return):



In [2]:

    
url = 'https://raw.githubusercontent.com/davidbjourno/finding-stories-in-data/master/data/leave-demographics.csv'

# Pass in the URL of the CSV file:
df = pd.read_csv(url)

See how easy that was? Now let's check that df is in fact a dataframe. Using the .head(n=[number]) method on any dataframe will return the first [number] rows of that dataframe. Let's take a look at the first ten:



In [3]:

    
df.head(n=10)









    Out[3]:






  
    
      
      ons_id
      name
      region_name
      electorate
      result
      leave
      turnout
      votes_cast
      var1
      var2
      var3
      var4
      var5
      var6
      var7
      var8
      var9
    
  
  
    
      0
      E06000001
      Hartlepool
      North East
      70341
      leave
      69.6
      65.6
      46134
      48.8
      23.9
      49.5
      15.9
      17.6
      495.2
      30.5
      25.6
      13.0
    
    
      1
      E06000002
      Middlesbrough
      North East
      94612
      leave
      65.5
      64.9
      61393
      48.5
      25.0
      66.1
      42.6
      18.5
      467.6
      64.9
      22.6
      13.3
    
    
      2
      E06000003
      Redcar and Cleveland
      North East
      103529
      leave
      66.2
      70.3
      72741
      16.8
      22.3
      25.0
      64.0
      18.9
      505.5
      18.6
      23.8
      13.2
    
    
      3
      E06000004
      Stockton-on-Tees
      North East
      141486
      leave
      61.7
      71.0
      100460
      24.4
      25.2
      58.3
      17.2
      22.8
      502.5
      45.3
      20.5
      15.9
    
    
      4
      E06000005
      Darlington
      North East
      77662
      leave
      56.2
      71.1
      55195
      23.8
      24.6
      55.6
      50.5
      23.7
      484.0
      29.9
      21.7
      15.2
    
    
      5
      E06000006
      Halton
      North West
      95289
      leave
      57.4
      68.3
      65047
      19.9
      25.4
      30.2
      47.7
      17.6
      484.7
      46.1
      21.0
      11.9
    
    
      6
      E06000007
      Warrington
      North West
      157042
      leave
      54.3
      73.4
      115205
      4.6
      26.0
      35.4
      27.5
      27.4
      545.6
      33.7
      15.5
      18.1
    
    
      7
      E06000008
      Blackburn with Darwen
      North West
      100117
      leave
      56.3
      65.3
      65408
      21.2
      26.8
      6.5
      31.8
      19.8
      413.0
      51.6
      18.2
      14.0
    
    
      8
      E06000009
      Blackpool
      North West
      102354
      leave
      67.5
      65.4
      66959
      43.9
      23.3
      45.9
      15.2
      15.5
      392.4
      46.6
      24.4
      10.3
    
    
      9
      E06000010
      Hull
      Yorkshire and The Humber
      180230
      leave
      67.6
      62.9
      113436
      60.9
      28.2
      7.7
      31.0
      15.2
      445.9
      31.4
      26.9
      10.5

Looks good!

(FYI: .tail(n=[number]) will give you the last [number] rows.)

By now, you may have noticed that some of the row headers in this CSV aren't particularly descriptive (var1, var2 etc.). This is the game: by the end of this tutorial, you should be able to identify the variables that correlated most strongly with the percentage of ‘leave’ votes (the leave column), i.e. which factors were the most predictive of people voting ‘leave’. At the end of the meetup, before we all go down the pub, you can tell me which variables you think correlated most strongly and I'll tell you what each of them are 😁

2. Explore the data

The main advantage of the workflow we're using here is that it enables us to inspect a dataset visually, which can often be the quickest way to identify patterns, trends or outliers in data. A common first step in this process is to use scatter plots to visualise the relationship, if any, between two variables. So let's use Matplotlib to create a first, super basic scatter plot:



In [4]:

    
# Configure Matplotlib's pyplot method (plt) to plot at a size of 8x8 inches and
# a resolution of 72 dots per inch
plt.figure(
    figsize=(8, 8),
    dpi=72
)

# Plot the data as a scatter plot
g = plt.scatter(
    x=df['var1'], # The values we want to plot along the x axis
    y=df['leave'], # The values we want to plot along the y axis
    s=50, # The size…
    c='#0571b0', # …colour…
    alpha=0.5 # …and opacity we want the data point markers to be
)

Yikes, not much of a relationship there. Let's try a different variable:



In [5]:

    
plt.figure(
    figsize=(8, 8),
    dpi=72
)

g = plt.scatter(
    x=df['var2'], # Plotting var2 along the x axis this time
    y=df['leave'],
    s=50,
    c='#0571b0',
    alpha=0.5
)

Hmm, that distribution looks better—there's a stronger, negative correlation there—but it's still a little unclear what we're looking at. Let's add some context.

We know from our provisional data-munging (that we didn't do) that many of the boroughs of London were among the strongest ‘remain’ areas in the country. We can add an additional column called is_london to our dataframe and set the values of that column to either True or False depending on whether the value in the row's region_name column is London:



In [6]:

    
df['is_london'] = np.where(df['region_name'] == 'London', True, False)

# Print all the rows in the dataframe in which is_london is equal to True
df[df['is_london'] == True]









    Out[6]:






  
    
      
      ons_id
      name
      region_name
      electorate
      result
      leave
      turnout
      votes_cast
      var1
      var2
      var3
      var4
      var5
      var6
      var7
      var8
      var9
      is_london
    
  
  
    
      293
      E09000001
      City of London
      London
      5987
      remain
      24.7
      73.6
      4405
      65.7
      37.2
      37.4
      25.2
      68.4
      864.7
      27.6
      3.2
      39.9
      True
    
    
      294
      E09000002
      Barking and Dagenham
      London
      115812
      leave
      62.4
      63.9
      73941
      38.6
      31.2
      34.5
      18.2
      20.9
      534.5
      43.2
      16.0
      13.3
      True
    
    
      295
      E09000003
      Barnet
      London
      223467
      remain
      37.8
      72.1
      161218
      20.3
      31.9
      49.7
      30.1
      40.3
      629.8
      66.4
      5.6
      24.7
      True
    
    
      296
      E09000004
      Bexley
      London
      170779
      leave
      63.0
      75.3
      128570
      5.5
      26.5
      29.8
      56.1
      21.8
      612.7
      31.6
      12.3
      14.7
      True
    
    
      297
      E09000005
      Brent
      London
      186793
      remain
      40.3
      65.1
      121671
      35.8
      34.3
      50.3
      12.9
      33.3
      551.4
      30.8
      5.1
      18.2
      True
    
    
      298
      E09000006
      Bromley
      London
      231473
      remain
      49.4
      78.9
      182570
      11.5
      27.5
      42.1
      62.2
      33.1
      681.0
      65.0
      9.6
      21.4
      True
    
    
      299
      E09000007
      Camden
      London
      145425
      remain
      25.1
      65.5
      95281
      29.0
      37.5
      44.9
      5.3
      50.5
      719.8
      31.6
      4.8
      31.9
      True
    
    
      300
      E09000008
      Croydon
      London
      245349
      remain
      45.7
      69.8
      171289
      13.8
      29.7
      37.7
      12.2
      31.8
      602.8
      12.4
      10.4
      19.6
      True
    
    
      301
      E09000009
      Ealing
      London
      212991
      remain
      39.6
      70.1
      149268
      30.8
      33.9
      66.4
      18.2
      37.0
      562.2
      34.4
      6.2
      21.0
      True
    
    
      302
      E09000010
      Enfield
      London
      198387
      remain
      44.2
      69.1
      137056
      14.5
      29.9
      47.3
      68.2
      28.7
      564.0
      10.6
      8.6
      19.0
      True
    
    
      303
      E09000011
      Greenwich
      London
      168967
      remain
      44.4
      69.5
      117470
      60.6
      35.0
      23.2
      56.6
      33.2
      601.0
      62.0
      12.1
      20.4
      True
    
    
      304
      E09000012
      Hackney
      London
      163284
      remain
      21.5
      65.2
      106422
      61.6
      43.7
      66.7
      56.8
      41.8
      603.4
      20.1
      6.8
      25.0
      True
    
    
      305
      E09000013
      Hammersmith and Fulham
      London
      114863
      remain
      30.0
      70.0
      80347
      48.9
      40.6
      59.9
      53.6
      49.6
      686.1
      23.6
      5.5
      27.0
      True
    
    
      306
      E09000014
      Haringey
      London
      150098
      remain
      24.4
      70.6
      106032
      30.1
      38.6
      18.2
      36.8
      40.8
      576.6
      17.2
      6.2
      23.9
      True
    
    
      307
      E09000015
      Harrow
      London
      162397
      remain
      45.4
      72.3
      117352
      5.5
      30.4
      41.2
      51.5
      36.8
      625.7
      32.5
      5.3
      23.2
      True
    
    
      308
      E09000016
      Havering
      London
      183082
      leave
      69.7
      76.0
      139175
      61.2
      26.1
      3.5
      21.8
      19.4
      626.3
      22.5
      12.3
      14.1
      True
    
    
      309
      E09000017
      Hillingdon
      London
      193033
      leave
      56.4
      69.0
      133170
      27.6
      30.6
      46.9
      20.7
      28.0
      605.5
      43.6
      9.5
      17.1
      True
    
    
      310
      E09000018
      Hounslow
      London
      165050
      remain
      48.9
      69.8
      115208
      40.9
      35.1
      13.7
      40.6
      34.6
      565.7
      24.9
      8.2
      18.5
      True
    
    
      311
      E09000019
      Islington
      London
      144514
      remain
      24.8
      70.4
      101723
      52.1
      41.7
      9.6
      56.8
      48.1
      670.8
      37.2
      7.1
      31.3
      True
    
    
      312
      E09000020
      Kensington and Chelsea
      London
      83042
      remain
      31.3
      66.0
      54801
      45.8
      35.5
      5.1
      43.0
      52.7
      762.8
      11.7
      3.2
      23.4
      True
    
    
      313
      E09000021
      Kingston-upon-Thames
      London
      108838
      remain
      38.4
      78.4
      85330
      32.1
      32.4
      44.8
      67.8
      41.4
      701.9
      39.6
      7.5
      25.2
      True
    
    
      314
      E09000022
      Lambeth
      London
      210800
      remain
      21.4
      67.4
      142162
      53.8
      44.2
      36.9
      19.5
      46.6
      621.5
      65.4
      7.1
      25.7
      True
    
    
      315
      E09000023
      Lewisham
      London
      197514
      remain
      30.1
      63.1
      124637
      61.1
      37.4
      12.8
      31.5
      38.0
      606.4
      39.1
      10.2
      22.6
      True
    
    
      316
      E09000024
      Merton
      London
      136352
      remain
      37.1
      73.5
      100207
      59.8
      35.5
      32.3
      42.4
      41.1
      622.4
      66.6
      7.8
      24.4
      True
    
    
      317
      E09000025
      Newham
      London
      176985
      remain
      47.2
      59.3
      104864
      49.3
      38.6
      14.1
      17.1
      30.1
      505.5
      67.9
      7.2
      14.8
      True
    
    
      318
      E09000026
      Redbridge
      London
      189843
      remain
      46.0
      67.6
      128397
      10.9
      32.2
      58.8
      53.0
      33.9
      652.5
      16.9
      6.5
      22.9
      True
    
    
      319
      E09000027
      Richmond-upon-Thames
      London
      132632
      remain
      30.7
      82.1
      108876
      4.8
      30.9
      27.6
      56.4
      53.0
      744.2
      53.6
      5.9
      29.7
      True
    
    
      320
      E09000028
      Southwark
      London
      195875
      remain
      27.2
      66.2
      129677
      34.1
      40.3
      9.3
      24.7
      43.1
      644.2
      23.0
      8.0
      25.8
      True
    
    
      321
      E09000029
      Sutton
      London
      140288
      leave
      53.7
      76.0
      106633
      23.6
      29.6
      6.3
      4.2
      30.1
      605.9
      45.9
      10.0
      19.2
      True
    
    
      322
      E09000030
      Tower Hamlets
      London
      167820
      remain
      32.5
      64.6
      108420
      44.6
      45.3
      30.3
      18.3
      41.0
      637.0
      16.9
      6.6
      25.7
      True
    
    
      323
      E09000031
      Waltham Forest
      London
      162983
      remain
      40.9
      66.7
      108689
      56.9
      35.5
      49.0
      65.6
      30.0
      546.1
      53.6
      8.7
      17.9
      True
    
    
      324
      E09000032
      Wandsworth
      London
      219521
      remain
      25.0
      72.0
      158018
      67.1
      45.1
      16.9
      34.2
      53.6
      718.5
      10.6
      6.2
      29.0
      True
    
    
      325
      E09000033
      Westminster
      London
      120524
      remain
      31.0
      65.0
      78325
      15.0
      40.0
      45.3
      64.8
      50.3
      764.4
      29.0
      3.6
      26.6
      True

Those names should look familiar. That's numpy's .where method coming in handy there to help us generate a new column of data based on the values of another column—in this case, region_name.

At this point, we're going to abandon Matplotlib like merciless narcissists and turn our attention to the younger, hotter Seaborn. Though it sounds like one of the factions from Game of Thrones, it's actually another plotting module that includes some handy analytical shortcuts and statistical methods. One of those analytical shortcuts is the FacetGrid.

If you've ever used OpenRefine, you're probably familiar with the concept of faceting. I'll fumblingly describe it here as a method whereby data is apportioned into distinct matrices according to the values of a single field. You get the idea. Right now, we're going to facet on the is_london column so that we can distinguish the London boroughs from the rest of the UK:



In [7]:

    
# Set the chart background colour (completely unnecessary, I just don't like the
# default)
sns.set_style('darkgrid', { 'axes.facecolor': '#efefef' })

# Tell Seaborn that what we want from it is a FacetGrid, and assign this to the
# variable ‘fg’
fg = sns.FacetGrid(
    data=df, # Use our dataframe as the input data
    hue='is_london', # Highlight the data points for which is_london == True
    palette=['#0571b0', '#ca0020'], # Define a tasteful blue/red colour combo
    size=7 # Make the plots size 7, whatever that means
)

# Tell Seaborn that what we want to do with our FacetGrid (fg) is visualise it
# as a scatter plot
fg.map(
    plt.scatter,
    'var2', # Values to plot along the x axis
    'leave', # Values to plot along the y axis
    alpha=0.5
)









    Out[7]:





<seaborn.axisgrid.FacetGrid at 0x1090bde10>

Now we're cooking with gas! We can see a slight negative correlation in the distribution of the data points and we can see how London compares to all the other regions of the country. Whatever var2 is, we now know that the London boroughs generally have higher levels of it than most of the rest of the UK, and that it has a (weak) negative correlation with ‘leave’ vote percentage.

So what's to stop you faceting on is_london but with a different variable plotted along the x axis? The answer is: nothing! Try doing that exact thing right now:



In [8]:

    
# Plot the chart above with a different variable along the x axis.

What's more, faceting isn't limited to just highlighting specific data points. We can also pass FacetGrid a col (column) argument with the name of a column that we'd like to use to further segment our data. So let's create another True/False (Boolean) column to flag the areas with the largest populations—the ones with electorates of 100,000 people or more—and plot a new facet grid:



In [9]:

    
df['is_largest'] = np.where(df['electorate'] >= 100000, True, False)

g = sns.FacetGrid(
    df,
    hue='is_london', 
    col='is_largest',
    palette=['#0571b0', '#ca0020'],
    size=7
)

g.map(
    plt.scatter,
    'var2',
    'leave',
    alpha=0.5
)









    Out[9]:





<seaborn.axisgrid.FacetGrid at 0x1098233c8>

Now we're able to make the following statements based solely on a visual inspection of this facet grid:

Most of the less populous areas (electorate < 100,000) voted ‘leave’
Most of the less populous areas had var2 levels below 35. Only two—both London boroughs—had levels higher than 35
There is a stronger correlation between the strength of the ‘leave’ vote and the level of var2 among the more populous areas

So you see how faceting can come in handy when you come to a dataset cold and need to start to understand it quickly.

As yet, we still don't have much of a story, just a few observations—not exactly Pulitzer material. The next and most important step is to narrow down which of the variables in the dataset were the most indicative of ‘leave’ vote percentage. The good news is that we don't have to repeat the facet grid steps above for every variable, because Seaborn provides another useful analytical shortcut called a PairGrid.

3. Optimise for efficiency

Apparently there's an equivalent to the pair grid in R called a correlogram or something (I wouldn't know). But the pair grid is super sweet because it allows us to check for correlations across a large number of variables at once. By passing the PairGrid function an array of column headers from our dataset, we can plot each of those variables against every other variable in one amazing ultra-grid:



In [10]:

    
# Just adding the first four variables, plus leave, to start with—you'll see why
columns = [
    'var1',
    'var2',
    'var3',
    'var4',
    'leave',
    'is_london'
]

g = sns.PairGrid(
    data=df[columns],
    hue='is_london',
    palette=['#0571b0', '#ca0020']
)

g.map_offdiag(plt.scatter);

Try passing the remaining variables (var5-var9) to the pair grid. You should be able to see which of the variables in the dataset correlate most strongly with ‘leave’ vote percentage and whether the correlations are positive or negative.

4. Go into detail

Seaborn also provides a heatmap method that we can use to quickly compare the correlation coefficient of each pair of variables (the value between -1 and 1 that describes the strength of the relationship between them). We can pass all the columns we're interested in to the heatmap in one go, because heatmaps are easier to read than pair grids:



In [11]:

    
plt.figure(
    figsize=(15, 15),
    dpi=72
)

columns = [ # ALL THE COLUMNS
    'var1',
    'var2',
    'var3',
    'var4',
    'var5',
    'var6',
    'var7',
    'var8',
    'var9',
    'leave'
]

# Calculate the standard correlation coefficient of each pair of columns
correlations = df[columns].corr(method='pearson')

sns.heatmap(
    data=correlations,
    square=True,
    xticklabels=correlations.columns.values,
    yticklabels=correlations.columns.values,
    # The Matplotlib colormap to use
    # (https://matplotlib.org/examples/color/colormaps_reference.html)
    cmap='plasma'
)









    Out[11]:





<matplotlib.axes._subplots.AxesSubplot at 0x10bdc8a58>

By now, you should have a pretty good idea which variables are worth reporting as being significant demographic factors in the ‘leave’ vote. If you wanted to take your analysis even further, you could also report on whether London boroughs returned higher or lower ‘leave’ vote percentages than we would expect based on the values of any correlating variable. A convenient way to do this would be to use Seaborn's built-in linear regression plotting:



In [12]:

    
columns = ['var2', 'leave']

g = sns.lmplot(
    data=df,
    x=columns[0],
    y=columns[1],
    hue='is_london',
    palette=['#0571b0', '#ca0020'],
    size=7,
    fit_reg=False,
)

sns.regplot(
    data=df,
    x=columns[0],
    y=columns[1],
    scatter=False,
    color='#0571b0',
    ax=g.axes[0, 0]
)









    Out[12]:





<matplotlib.axes._subplots.AxesSubplot at 0x10bfed550>

Reading this plot, we're able to say that, all things being equal, most of the London boroughs have lower ‘leave’ vote percentages than we would expect based on their levels of var2 alone. This suggests—rightly—that variables other than var2 are in play in determining London's lower-than-expected levels of ‘leave’ voting.

5. Make a graphic and get it out of the notebook

Everyone knows that data journalism without pretty graphics is just boring. While the Matplotlib and Seaborn scatter plots get the job done, they're not exactly 😍 For that, we need Bokeh.

You can pretty much throw a stone and hit a data visualisation library these days, but Bokeh is a good fit for Jupyter notebooks because it's made for Python and can work with dataframes and all that other good stuff we've got going on in here. So let's fire it up by telling it that, like Matplotlib, we want it to plot in the notebook:



In [13]:

    
output_notebook()









    





    
        
        Loading BokehJS ...

Because we want this to be our output graphic, we're going to be much fussier about how it looks, so there's quite a bit of configuration involved here:



In [14]:

    
color_map = {False: '#0571b0', True: '#ca0020'}

# Instantiate our plot
p = figure(
    plot_width=600,
    plot_height=422,
    background_fill_color='#d3d3d3',
    title='Leave demographics'
)

# Add a circle renderer to the plot
p.circle(
    x=df['var2'],
    y=df['leave'],
    # Size the markers according to the size of the electorate (scaled down)
    size=df['electorate'] / 20000,
    fill_color=df['is_london'].map(color_map),
    line_color=df['is_london'].map(color_map),
    line_width=1,
    alpha=0.5
)

# Configure the plot's x axis
p.xaxis.axis_label = 'var5'
p.xgrid.grid_line_color = None

# Configure the plot's y axis
p.yaxis.axis_label = 'Percentage voting leave'
p.ygrid.grid_line_color = '#999999'
p.ygrid.grid_line_alpha = 1
p.ygrid.grid_line_dash = [6, 4]

# Show the plot
show(p)

Now that's starting to look like something we could publish. Refer to the Bokeh docs for more customisation options, and when you're happy with how your plot looks, click the ‘Save’ button on the toolbar at the right of the plot to save it to disk as a PNG image. If you want to, paste it into the hackpad with your name—coolest-looking one wins a drink!

4. Baller-level challenges

Using only pandas, identify the top five ‘leave’-voting areas in the country, both by vote percentage and per capita of the electorate
Using Jupyter interactors, add a checkbox that toggles highlighting of London boroughs in your Bokeh plot
Recreate your Bokeh plot in D3 (in the notebook). Hint

	ons_id	name	region_name	electorate	result	leave	turnout	votes_cast	var1	var2	var3	var4	var5	var6	var7	var8	var9
0	E06000001	Hartlepool	North East	70341	leave	69.6	65.6	46134	48.8	23.9	49.5	15.9	17.6	495.2	30.5	25.6	13.0
1	E06000002	Middlesbrough	North East	94612	leave	65.5	64.9	61393	48.5	25.0	66.1	42.6	18.5	467.6	64.9	22.6	13.3
2	E06000003	Redcar and Cleveland	North East	103529	leave	66.2	70.3	72741	16.8	22.3	25.0	64.0	18.9	505.5	18.6	23.8	13.2
3	E06000004	Stockton-on-Tees	North East	141486	leave	61.7	71.0	100460	24.4	25.2	58.3	17.2	22.8	502.5	45.3	20.5	15.9
4	E06000005	Darlington	North East	77662	leave	56.2	71.1	55195	23.8	24.6	55.6	50.5	23.7	484.0	29.9	21.7	15.2
5	E06000006	Halton	North West	95289	leave	57.4	68.3	65047	19.9	25.4	30.2	47.7	17.6	484.7	46.1	21.0	11.9
6	E06000007	Warrington	North West	157042	leave	54.3	73.4	115205	4.6	26.0	35.4	27.5	27.4	545.6	33.7	15.5	18.1
7	E06000008	Blackburn with Darwen	North West	100117	leave	56.3	65.3	65408	21.2	26.8	6.5	31.8	19.8	413.0	51.6	18.2	14.0
8	E06000009	Blackpool	North West	102354	leave	67.5	65.4	66959	43.9	23.3	45.9	15.2	15.5	392.4	46.6	24.4	10.3
9	E06000010	Hull	Yorkshire and The Humber	180230	leave	67.6	62.9	113436	60.9	28.2	7.7	31.0	15.2	445.9	31.4	26.9	10.5

	ons_id	name	region_name	electorate	result	leave	turnout	votes_cast	var1	var2	var3	var4	var5	var6	var7	var8	var9	is_london
293	E09000001	City of London	London	5987	remain	24.7	73.6	4405	65.7	37.2	37.4	25.2	68.4	864.7	27.6	3.2	39.9	True
294	E09000002	Barking and Dagenham	London	115812	leave	62.4	63.9	73941	38.6	31.2	34.5	18.2	20.9	534.5	43.2	16.0	13.3	True
295	E09000003	Barnet	London	223467	remain	37.8	72.1	161218	20.3	31.9	49.7	30.1	40.3	629.8	66.4	5.6	24.7	True
296	E09000004	Bexley	London	170779	leave	63.0	75.3	128570	5.5	26.5	29.8	56.1	21.8	612.7	31.6	12.3	14.7	True
297	E09000005	Brent	London	186793	remain	40.3	65.1	121671	35.8	34.3	50.3	12.9	33.3	551.4	30.8	5.1	18.2	True
298	E09000006	Bromley	London	231473	remain	49.4	78.9	182570	11.5	27.5	42.1	62.2	33.1	681.0	65.0	9.6	21.4	True
299	E09000007	Camden	London	145425	remain	25.1	65.5	95281	29.0	37.5	44.9	5.3	50.5	719.8	31.6	4.8	31.9	True
300	E09000008	Croydon	London	245349	remain	45.7	69.8	171289	13.8	29.7	37.7	12.2	31.8	602.8	12.4	10.4	19.6	True
301	E09000009	Ealing	London	212991	remain	39.6	70.1	149268	30.8	33.9	66.4	18.2	37.0	562.2	34.4	6.2	21.0	True
302	E09000010	Enfield	London	198387	remain	44.2	69.1	137056	14.5	29.9	47.3	68.2	28.7	564.0	10.6	8.6	19.0	True
303	E09000011	Greenwich	London	168967	remain	44.4	69.5	117470	60.6	35.0	23.2	56.6	33.2	601.0	62.0	12.1	20.4	True
304	E09000012	Hackney	London	163284	remain	21.5	65.2	106422	61.6	43.7	66.7	56.8	41.8	603.4	20.1	6.8	25.0	True
305	E09000013	Hammersmith and Fulham	London	114863	remain	30.0	70.0	80347	48.9	40.6	59.9	53.6	49.6	686.1	23.6	5.5	27.0	True
306	E09000014	Haringey	London	150098	remain	24.4	70.6	106032	30.1	38.6	18.2	36.8	40.8	576.6	17.2	6.2	23.9	True
307	E09000015	Harrow	London	162397	remain	45.4	72.3	117352	5.5	30.4	41.2	51.5	36.8	625.7	32.5	5.3	23.2	True
308	E09000016	Havering	London	183082	leave	69.7	76.0	139175	61.2	26.1	3.5	21.8	19.4	626.3	22.5	12.3	14.1	True
309	E09000017	Hillingdon	London	193033	leave	56.4	69.0	133170	27.6	30.6	46.9	20.7	28.0	605.5	43.6	9.5	17.1	True
310	E09000018	Hounslow	London	165050	remain	48.9	69.8	115208	40.9	35.1	13.7	40.6	34.6	565.7	24.9	8.2	18.5	True
311	E09000019	Islington	London	144514	remain	24.8	70.4	101723	52.1	41.7	9.6	56.8	48.1	670.8	37.2	7.1	31.3	True
312	E09000020	Kensington and Chelsea	London	83042	remain	31.3	66.0	54801	45.8	35.5	5.1	43.0	52.7	762.8	11.7	3.2	23.4	True
313	E09000021	Kingston-upon-Thames	London	108838	remain	38.4	78.4	85330	32.1	32.4	44.8	67.8	41.4	701.9	39.6	7.5	25.2	True
314	E09000022	Lambeth	London	210800	remain	21.4	67.4	142162	53.8	44.2	36.9	19.5	46.6	621.5	65.4	7.1	25.7	True
315	E09000023	Lewisham	London	197514	remain	30.1	63.1	124637	61.1	37.4	12.8	31.5	38.0	606.4	39.1	10.2	22.6	True
316	E09000024	Merton	London	136352	remain	37.1	73.5	100207	59.8	35.5	32.3	42.4	41.1	622.4	66.6	7.8	24.4	True
317	E09000025	Newham	London	176985	remain	47.2	59.3	104864	49.3	38.6	14.1	17.1	30.1	505.5	67.9	7.2	14.8	True
318	E09000026	Redbridge	London	189843	remain	46.0	67.6	128397	10.9	32.2	58.8	53.0	33.9	652.5	16.9	6.5	22.9	True
319	E09000027	Richmond-upon-Thames	London	132632	remain	30.7	82.1	108876	4.8	30.9	27.6	56.4	53.0	744.2	53.6	5.9	29.7	True
320	E09000028	Southwark	London	195875	remain	27.2	66.2	129677	34.1	40.3	9.3	24.7	43.1	644.2	23.0	8.0	25.8	True
321	E09000029	Sutton	London	140288	leave	53.7	76.0	106633	23.6	29.6	6.3	4.2	30.1	605.9	45.9	10.0	19.2	True
322	E09000030	Tower Hamlets	London	167820	remain	32.5	64.6	108420	44.6	45.3	30.3	18.3	41.0	637.0	16.9	6.6	25.7	True
323	E09000031	Waltham Forest	London	162983	remain	40.9	66.7	108689	56.9	35.5	49.0	65.6	30.0	546.1	53.6	8.7	17.9	True
324	E09000032	Wandsworth	London	219521	remain	25.0	72.0	158018	67.1	45.1	16.9	34.2	53.6	718.5	10.6	6.2	29.0	True
325	E09000033	Westminster	London	120524	remain	31.0	65.0	78325	15.0	40.0	45.3	64.8	50.3	764.4	29.0	3.6	26.6	True