Title: Split Combined Lat/Long Coordinate Variables Into Separate Variables In Pandas
Slug: pandas_split_lat_and_long_into_variables
Summary: Split Combined Lat/Long Coordinate Variables Into Separate Variables In Pandas
Date: 2016-05-01 12:00
Category: Python
Tags: Data Wrangling
Authors: Chris Albon

Preliminaries


In [1]:
import pandas as pd
import numpy as np

Create an example dataframe


In [2]:
raw_data = {'geo': ['40.0024, -105.4102', '40.0068, -105.266', '39.9318, -105.2813', np.nan]}
df = pd.DataFrame(raw_data, columns = ['geo'])
df


Out[2]:
geo
0 40.0024, -105.4102
1 40.0068, -105.266
2 39.9318, -105.2813
3 NaN

Split the geo variable into separate lat and lon variables


In [3]:
# Create two lists to hold the loop results
lat = []
lon = []

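# For each row in a variable,
for row in df['geo']:
    # Try to,
    try:
        # Split the row by comma into the part before
        # the comma and the part after it
        lat_value, lon_value = row.split(',')
        # Append everything before the comma to lat
        lat.append(lat_value.strip())
        # Append everything after the comma to lon,
        # stripping the space that follows the comma
        lon.append(lon_value.strip())
    # But if the row is missing (NaN is a float and has no
    # .split method) or does not split into exactly two parts
    except (AttributeError, ValueError):
        # append a missing value to lat
        lat.append(np.nan)
        # append a missing value to lon
        lon.append(np.nan)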

# Create two new columns from lat and lon
df['latitude'] = lat
df['longitude'] = lon
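
The same split can also be sketched without an explicit loop, using pandas' vectorized string methods. This is an alternative to the loop above, not part of the original recipe; the names `df_alt` and `parts` are illustrative, and the example rebuilds the same data so it runs on its own.

```python
import numpy as np
import pandas as pd

# Same example data as above, in a separate (hypothetical) dataframe
df_alt = pd.DataFrame({'geo': ['40.0024, -105.4102', '40.0068, -105.266',
                               '39.9318, -105.2813', np.nan]})

# Split each string at the first comma into two columns;
# rows that are missing stay missing in both columns
parts = df_alt['geo'].str.split(',', n=1, expand=True)

# Strip surrounding whitespace and assign the two new columns
df_alt['latitude'] = parts[0].str.strip()
df_alt['longitude'] = parts[1].str.strip()
```

Because the `.str` methods skip missing values, no try/except is needed here.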

View the dataframe


In [4]:
df


Out[4]:
geo latitude longitude
0 40.0024, -105.4102 40.0024 -105.4102
1 40.0068, -105.266 40.0068 -105.266
2 39.9318, -105.2813 39.9318 -105.2813
3 NaN NaN NaN
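
Note that splitting a string produces strings, so the new columns hold text rather than numbers. If numeric coordinates are needed, one possible follow-up step (an addition to the original recipe) is `pd.to_numeric`. The sketch below uses a small stand-in dataframe named `coords`, which is hypothetical, so that it runs on its own.

```python
import numpy as np
import pandas as pd

# Stand-in for the string columns produced by the split
coords = pd.DataFrame({'latitude': ['40.0024', '40.0068', '39.9318', np.nan],
                       'longitude': ['-105.4102', '-105.266', '-105.2813', np.nan]})

# Convert the text to floats; values that cannot be parsed become NaN
coords['latitude'] = pd.to_numeric(coords['latitude'], errors='coerce')
coords['longitude'] = pd.to_numeric(coords['longitude'], errors='coerce')
```

After the conversion both columns have a float dtype, so they can be used directly in arithmetic or distance calculations.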