Location Data Services

Introduction

CARTOframes provides the functionality to use the CARTO Data Services API. This API consists of a set of location-based functions that can be applied to your data to perform geospatial analyses without leaving the context of your notebook.

For instance, you can geocode a pandas DataFrame with addresses on the fly, and then perform a trade areas analysis by computing isodistances or isochrones programmatically.

Given a set of ten simulated Starbucks store addresses, this guide walks through the use case of finding good location candidates to open an additional store.

Depending on your account plan, some of these location data services are subject to different quota limitations.

Data

This guide uses the same dataset of simulated Starbucks locations used in the other guides. It is read directly from its public URL in the Geocoding section.

Authentication

Using Location Data Services requires authentication. For more information about how to authenticate, please read the Login to CARTO Platform guide.



In [1]:

    
from cartoframes.auth import Credentials, set_default_credentials

set_default_credentials('creds.json')

Geocoding

To get started, let's read in and explore the Starbucks location data we have. With the Starbucks store data in a DataFrame, we can see that there are two columns that can be used in the geocoding service: name and address. There's also a third column that reflects the annual revenue of the store.



In [2]:

    
import pandas as pd

df = pd.read_csv('http://libs.cartocdn.com/cartoframes/files/starbucks_brooklyn.csv')
df









    Out[2]:







  
    
      
	name	address	revenue
0	Franklin Ave & Eastern Pkwy	341 Eastern Pkwy,Brooklyn, NY 11238	1.321041e+06
1	607 Brighton Beach Ave	607 Brighton Beach Avenue,Brooklyn, NY 11235	1.268080e+06
2	65th St & 18th Ave	6423 18th Avenue,Brooklyn, NY 11204	1.248134e+06
3	Bay Ridge Pkwy & 3rd Ave	7419 3rd Avenue,Brooklyn, NY 11209	1.185703e+06
4	Caesar's Bay Shopping Center	8973 Bay Parkway,Brooklyn, NY 11214	1.148427e+06
5	Court St & Dean St	167 Court Street,Brooklyn, NY 11201	1.144067e+06
6	Target Gateway T-1401	519 Gateway Dr,Brooklyn, NY 11239	1.021083e+06
7	3rd Ave & 92nd St	9202 Third Avenue,Brooklyn, NY 11209	9.257073e+05
8	Lam Group @ Sheraton Brooklyn	228 Duffield st,Brooklyn, NY 11201	7.657935e+05
9	33-42 Hillel Place	33-42 Hillel Place,Brooklyn, NY 11210	7.492163e+05

Quota consumption

Each time you run a Location Data Service, you consume quota. For this reason, you can check in advance how many credits an operation will consume by passing the dry_run parameter to the service function.

It is also possible to check the available quota by running the available_quota function.



In [3]:

    
from cartoframes.data.services import Geocoding

geo_service = Geocoding()

_, geo_dry_metadata = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    dry_run=True
)



In [4]:

    
geo_dry_metadata









    Out[4]:





{'total_rows': 10,
 'required_quota': 10,
 'previously_geocoded': 0,
 'previously_failed': 0,
 'records_with_geometry': 0}



In [5]:

    
geo_service.available_quota()









    Out[5]:





4977588
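
The two checks above can be combined into a small guard before spending credits. The following is a minimal sketch and not part of CARTOframes: has_enough_quota is a hypothetical helper, and the example values are simply copied from Out[4] and Out[5].

```python
def has_enough_quota(dry_run_metadata, available_quota):
    """Return True when the dry-run estimate fits in the available quota.

    `dry_run_metadata` is the dictionary returned alongside a dry run;
    only its 'required_quota' key is read here.
    """
    required = dry_run_metadata.get('required_quota', 0)
    return required <= available_quota


# Values copied from Out[4] and Out[5] above.
geo_dry_metadata_example = {
    'total_rows': 10,
    'required_quota': 10,
    'previously_geocoded': 0,
    'previously_failed': 0,
    'records_with_geometry': 0,
}
available_example = 4977588

if has_enough_quota(geo_dry_metadata_example, available_example):
    print('Enough quota: safe to run the geocoding for real')
else:
    print('Not enough quota: reduce the input or review the plan')
```

In the notebook itself you would pass geo_dry_metadata and the result of geo_service.available_quota() instead of the copied values.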



In [6]:

    
geo_gdf, geo_metadata = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'}
)









    



Success! Data geocoded correctly

If the input data file should ever change, cached results will only be applied to unmodified records, and new geocoding will be performed only on new or changed records.

In order to use cached results, we have to save the results to a CARTO table using the table_name and cached=True parameters.



In [7]:

    
geo_gdf_cached, geo_metadata_cached = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    table_name='starbucks_cache',
    cached=True
)









    



Success! Data geocoded correctly

Let's compare geo_dry_metadata and geo_metadata to see how the information returned differs with and without the dry_run option. As we can see, the metadata from the real run shows that all ten locations were geocoded successfully and that the operation consumed 10 credits of quota.



In [8]:

    
geo_metadata









    Out[8]:





{'total_rows': 10,
 'required_quota': 10,
 'previously_geocoded': 0,
 'previously_failed': 0,
 'records_with_geometry': 0,
 'final_records_with_geometry': 10,
 'geocoded_increment': 10,
 'successfully_geocoded': 10,
 'failed_geocodings': 0}
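
A quick way to see exactly what the real run adds is to compare the two dictionaries key by key. The sketch below copies the values from Out[4] and Out[8]; the variable names are chosen for this example only.

```python
# Metadata copied from Out[4] (dry run) and Out[8] (real run).
dry_run_metadata = {
    'total_rows': 10,
    'required_quota': 10,
    'previously_geocoded': 0,
    'previously_failed': 0,
    'records_with_geometry': 0,
}
real_metadata = {
    'total_rows': 10,
    'required_quota': 10,
    'previously_geocoded': 0,
    'previously_failed': 0,
    'records_with_geometry': 0,
    'final_records_with_geometry': 10,
    'geocoded_increment': 10,
    'successfully_geocoded': 10,
    'failed_geocodings': 0,
}

# Keys that are reported only after the service has actually run.
only_in_real_run = sorted(set(real_metadata) - set(dry_run_metadata))

# Keys present in both dictionaries whose values differ.
changed = {
    key: (dry_run_metadata[key], real_metadata[key])
    for key in dry_run_metadata
    if dry_run_metadata[key] != real_metadata[key]
}

print(only_in_real_run)
print(changed)  # {} : the dry-run estimate matched the real run
```

Here the shared keys are identical, so the dry run predicted the cost correctly, and the four extra keys describe the outcome of the geocoding.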

The resulting data is a GeoDataFrame that contains three new columns:

the_geom: The resulting geometry
gc_status_rel: The accuracy of each geocoded location, as a value between 0 and 1
carto_geocode_hash: Geocode information



In [9]:

    
geo_gdf.head()









    Out[9]:







  
    
      
	the_geom	name	address	revenue	gc_status_rel	carto_geocode_hash
0	POINT (-73.95746 40.67102)	Franklin Ave & Eastern Pkwy	341 Eastern Pkwy,Brooklyn, NY 11238	1321040.772	0.97	c834a8e289e5bce280775a9bf1f833f1
1	POINT (-73.96122 40.57796)	607 Brighton Beach Ave	607 Brighton Beach Avenue,Brooklyn, NY 11235	1268080.418	0.99	7d39a3fff93efd9034da88aa9ad2da79
2	POINT (-73.98978 40.61944)	65th St & 18th Ave	6423 18th Avenue,Brooklyn, NY 11204	1248133.699	0.98	1a2312049ddea753ba42bf77f5ccf718
3	POINT (-74.02750 40.63202)	Bay Ridge Pkwy & 3rd Ave	7419 3rd Avenue,Brooklyn, NY 11209	1185702.676	0.98	827ab4dcc2d49d5fd830749597976d4a
4	POINT (-74.00098 40.59321)	Caesar's Bay Shopping Center	8973 Bay Parkway,Brooklyn, NY 11214	1148427.411	0.98	119a38c7b51195cd4153fc81605a8495

In addition, to avoid geocoding records that have already been geocoded, and thus spending quota unnecessarily, you should always preserve the the_geom and carto_geocode_hash columns generated by the geocoding process.

This will happen automatically in these cases:

Your input is a table from CARTO processed in place (without a table_name parameter).
You save your results to a CARTO table using the table_name parameter, and only use the resulting table for any further geocoding.

If you now try to geocode this DataFrame, which contains both the the_geom and carto_geocode_hash columns, you will see that the required quota is 0 because it has already been geocoded.



In [10]:

    
_, repeat_geo_metadata = geo_service.geocode(
    geo_gdf,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    dry_run=True
)



In [11]:

    
repeat_geo_metadata.get('required_quota')









    Out[11]:





0
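
The idea behind the hash column can be illustrated with a small, self-contained sketch. This is not how CARTOframes computes carto_geocode_hash: the function names, the choice of SHA-256, and the way the fields are joined are all assumptions made for illustration. The point is only that a fingerprint of the geocoding inputs lets you detect which rows have changed.

```python
import hashlib


def address_fingerprint(street, city, country):
    """Hash the geocoding inputs of one record (illustrative scheme only)."""
    joined = '|'.join([street, city, country])
    return hashlib.sha256(joined.encode('utf-8')).hexdigest()


def rows_needing_geocoding(rows, city='New York', country='USA'):
    """Return the rows whose stored hash is missing or out of date."""
    pending = []
    for row in rows:
        current = address_fingerprint(row['address'], city, country)
        if row.get('stored_hash') != current:
            pending.append(row)
    return pending


# Two addresses taken from the dataset, each stored with its fingerprint.
rows = [
    {'address': '341 Eastern Pkwy,Brooklyn, NY 11238'},
    {'address': '167 Court Street,Brooklyn, NY 11201'},
]
for row in rows:
    row['stored_hash'] = address_fingerprint(row['address'], 'New York', 'USA')

# Hypothetical edit: the second address changes after it was geocoded.
rows[1]['address'] = '169 Court Street,Brooklyn, NY 11201'

pending = rows_needing_geocoding(rows)
print(len(pending), 'row(s) would consume quota')
```

Only the edited row is flagged, which mirrors the behavior described above: unchanged records keep their geometry and cost nothing.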

Precision

The address column is more complete than the name column, so the coordinates calculated by the service from it will be more accurate. The cells below confirm this: the accuracy values obtained with the name column (0.93, 0.96, 0.85, 0.83, 0.74, 0.87) are lower than the ones we get by using the address column for geocoding (0.97, 0.99, 0.98).



In [12]:

    
geo_name_gdf, geo_name_metadata = geo_service.geocode(
    df,
    street='name',
    city={'value': 'New York'},
    country={'value': 'USA'}
)









    



Success! Data geocoded correctly



In [13]:

    
geo_name_gdf.head()









    Out[13]:







  
    
      
	the_geom	name	address	revenue	gc_status_rel	carto_geocode_hash
0	POINT (-73.95795 40.67071)	Franklin Ave & Eastern Pkwy	341 Eastern Pkwy,Brooklyn, NY 11238	1321040.772	0.93	0be7693fc688eca36e1077656dcb00a5
1	POINT (-73.96122 40.57796)	607 Brighton Beach Ave	607 Brighton Beach Avenue,Brooklyn, NY 11235	1268080.418	0.96	084a5c4d42ccf3c3c8e69426619f270e
2	POINT (-73.99018 40.61914)	65th St & 18th Ave	6423 18th Avenue,Brooklyn, NY 11204	1248133.699	0.93	1d9a17c20c11d0454aff10548a328c47
3	POINT (-74.02778 40.63146)	Bay Ridge Pkwy & 3rd Ave	7419 3rd Avenue,Brooklyn, NY 11209	1185702.676	0.96	d531df27fc02336dc722cb4e7028b244
4	POINT (-75.29322 43.07849)	Caesar's Bay Shopping Center	8973 Bay Parkway,Brooklyn, NY 11214	1148427.411	0.85	9d8c13b5b4a93591f427d3ce0b5b4ead



In [14]:

    
geo_name_gdf.gc_status_rel.unique()









    Out[14]:





array([0.93, 0.96, 0.85, 0.83, 0.74, 0.87])



In [15]:

    
geo_gdf.head()









    Out[15]:







  
    
      
	the_geom	name	address	revenue	gc_status_rel	carto_geocode_hash
0	POINT (-73.95746 40.67102)	Franklin Ave & Eastern Pkwy	341 Eastern Pkwy,Brooklyn, NY 11238	1321040.772	0.97	c834a8e289e5bce280775a9bf1f833f1
1	POINT (-73.96122 40.57796)	607 Brighton Beach Ave	607 Brighton Beach Avenue,Brooklyn, NY 11235	1268080.418	0.99	7d39a3fff93efd9034da88aa9ad2da79
2	POINT (-73.98978 40.61944)	65th St & 18th Ave	6423 18th Avenue,Brooklyn, NY 11204	1248133.699	0.98	1a2312049ddea753ba42bf77f5ccf718
3	POINT (-74.02750 40.63202)	Bay Ridge Pkwy & 3rd Ave	7419 3rd Avenue,Brooklyn, NY 11209	1185702.676	0.98	827ab4dcc2d49d5fd830749597976d4a
4	POINT (-74.00098 40.59321)	Caesar's Bay Shopping Center	8973 Bay Parkway,Brooklyn, NY 11214	1148427.411	0.98	119a38c7b51195cd4153fc81605a8495
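
The gap in precision is easier to appreciate as a distance. For row 4, the two geocoding runs return very different points. The sketch below measures their separation with the haversine formula, assuming a spherical Earth with a mean radius of 6371 km; haversine_km is a helper defined here, not a CARTOframes function.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two lon/lat points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


# Row 4 (Caesar's Bay Shopping Center): coordinates copied from the
# two outputs above, as (longitude, latitude).
by_address = (-74.00098, 40.59321)
by_name = (-75.29322, 43.07849)

offset_km = haversine_km(*by_address, *by_name)
print('Row 4 moves {:.0f} km when geocoded by name'.format(offset_km))
```

An offset of hundreds of kilometers, together with the lower gc_status_rel score of 0.85, suggests that the store name alone was not enough to place this record in Brooklyn.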

Visualize the results

Finally, we can visualize the precision of the geocoded results using a CARTOframes visualization layer.



In [16]:

    
from cartoframes.viz import Layer, color_bins_style, popup_element

Layer(
    geo_gdf,
    color_bins_style(
        'gc_status_rel',
        method='equal',
        bins=geo_gdf.gc_status_rel.unique().size,
    ),
    popup_hover=[
        popup_element('address', 'Address'),
        popup_element('gc_status_rel', 'Precision')
    ],
    title='Geocoding Precision'
)
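
For context, an equal-interval classification splits the value range into bins of identical width. The sketch below computes such breaks by hand for the three address-based accuracy values quoted in the Precision section. It illustrates the general technique only and is not a description of how color_bins_style computes its breaks internally; equal_interval_breaks is a hypothetical helper.

```python
def equal_interval_breaks(values, bins):
    """Return the upper bound of each of `bins` equal-width classes."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / bins
    return [lowest + width * (i + 1) for i in range(bins)]


# Address-based accuracy values quoted in the Precision section.
accuracy = [0.97, 0.99, 0.98]

# Same choice as in the cell above: one bin per distinct value.
breaks = equal_interval_breaks(accuracy, bins=len(set(accuracy)))
print([round(b, 4) for b in breaks])
```

Using one bin per distinct value, as the cell above does, keeps the legend small when the scores cluster tightly.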

Isolines

There are two Isoline functions: isochrones and isodistances. In this guide we will use the isochrones function to calculate walking areas by time for each Starbucks store and the isodistances function to calculate the walking area by distance.

Isolines are concentric polygons drawn around an origin point. Each polygon encloses the area that can be reached from the origin within a given limit, measured by:

Time in the case of isochrones
Distance in the case of isodistances
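
Because the polygons are concentric, a location reachable within one limit is also inside every larger one. The sketch below captures that nesting in an idealized way: given a travel cost and a list of limits, it returns the smallest limit that covers it. The helper name is hypothetical, and the limits are the walking times used later in this guide.

```python
from bisect import bisect_left


def smallest_enclosing_range(value, ranges):
    """Return the smallest range limit that covers `value`, or None.

    `ranges` are the isoline limits (seconds for isochrones, meters for
    isodistances). A value beyond the largest limit falls outside all bands.
    """
    ordered = sorted(ranges)
    index = bisect_left(ordered, value)
    return ordered[index] if index < len(ordered) else None


walking_ranges = [300, 900, 1800]  # 5, 15 and 30 minutes, in seconds

print(smallest_enclosing_range(240, walking_ranges))   # 300
print(smallest_enclosing_range(900, walking_ranges))   # 900
print(smallest_enclosing_range(2000, walking_ranges))  # None
```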

Isochrones

For isochrones, let's calculate the time ranges of: 5, 15 and 30 min. These ranges are input in seconds, so they will be 300, 900, and 1800 respectively.
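
The conversion is simple enough to do by hand, but a tiny helper avoids unit mistakes when the ranges change. minutes_to_seconds is a hypothetical name used only in this sketch.

```python
def minutes_to_seconds(minutes):
    """Convert a list of durations in minutes to whole seconds."""
    return [m * 60 for m in minutes]


ranges_in_seconds = minutes_to_seconds([5, 15, 30])
print(ranges_in_seconds)  # [300, 900, 1800]
```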



In [17]:

    
from cartoframes.data.services import Isolines

iso_service = Isolines()

_, isochrones_dry_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk', dry_run=True)

Remember to always check the quota using the dry_run parameter and the available_quota method before running the service!



In [18]:

    
print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isochrones_dry_metadata.get('required_quota'))
)









    



available 112699, required 30
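
The required quota of 30 is consistent with one credit per origin point per range (10 stores times 3 ranges). That rule is an inference from the numbers in this guide, not a documented guarantee, so treat the helper below as a rough estimate and rely on dry_run for the real figure. estimate_isoline_quota is a hypothetical name.

```python
def estimate_isoline_quota(n_points, ranges):
    """Rough estimate: one credit per origin point and per range (assumed)."""
    return n_points * len(ranges)


estimated = estimate_isoline_quota(10, [300, 900, 1800])
available = 112699  # value printed above

print('estimated {0}, remaining after run {1}'.format(
    estimated, available - estimated))
# estimated 30, remaining after run 112669
```

The remaining figure matches the available quota reported before the isodistances run later in this guide.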



In [19]:

    
isochrones_gdf, isochrones_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk')









    



Success! Isolines created correctly



In [20]:

    
isochrones_gdf.head()









    Out[20]:







  
    
      
	source_id	data_range	the_geom
0	9	1800	MULTIPOLYGON (((-73.96485 40.63379, -73.96460 ...
1	2	1800	MULTIPOLYGON (((-74.00605 40.62899, -74.00579 ...
2	6	1800	MULTIPOLYGON (((-73.88520 40.65371, -73.88494 ...
3	5	1800	MULTIPOLYGON (((-74.00674 40.68598, -74.00648 ...
4	4	1800	MULTIPOLYGON (((-74.01412 40.60341, -74.01369 ...



In [21]:

    
from cartoframes.viz import basic_style

Layer(isochrones_gdf, basic_style(opacity=0.5))

Isodistances

For isodistances, let's calculate the distance ranges of 100, 500 and 1000 meters. These ranges are input in meters, so no conversion is needed.



In [22]:

    
_, isodistances_dry_metadata = iso_service.isodistances(
    geo_gdf,
    [100, 500, 1000],
    mode='walk',
    dry_run=True
)



In [23]:

    
print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isodistances_dry_metadata.get('required_quota'))
)









    



available 112669, required 30



In [24]:

    
isodistances_gdf, isodistances_metadata = iso_service.isodistances(
    geo_gdf,
    [100, 500, 1000],
    mode='walk'
)









    



Success! Isolines created correctly



In [25]:

    
isodistances_gdf.head()









    Out[25]:







  
    
      
	source_id	data_range	the_geom
0	9	1000	MULTIPOLYGON (((-73.95867 40.63311, -73.95842 ...
1	2	1000	MULTIPOLYGON (((-73.99850 40.62281, -73.99841 ...
2	6	1000	MULTIPOLYGON (((-73.87696 40.65371, -73.87671 ...
3	5	1000	MULTIPOLYGON (((-74.00245 40.69061, -74.00185 ...
4	4	1000	MULTIPOLYGON (((-74.00451 40.59860, -74.00391 ...



In [26]:

    
Layer(isodistances_gdf, basic_style(opacity=0.5))

All together

Let's visualize the data in one map to see what insights we can find.



In [27]:

    
from cartoframes.viz import Map, Layer, size_continuous_style

Map([
    Layer(
        isochrones_gdf,
        basic_style(opacity=0.5),
        title='Walking Time'
    ),
    Layer(
        geo_gdf,
        size_continuous_style(
            'revenue',
            color='white',
            opacity=0.2,
            stroke_color='blue',
            size_range=[20, 80],
        ),
        popup_hover=[
            popup_element('address', 'Address'),
            popup_element('gc_status_rel', 'Precision'),
            popup_element('revenue', 'Revenue')
        ],
        title='Revenue $',
    )
])

Looking at the map above, we can see the store at 228 Duffield St, Brooklyn, NY 11201 is really close to another store with higher revenue, which means we could even think about closing that one in favor of another one with a better location.

We could also look for a spot for a new store between existing stores that are far apart from each other and earn less revenue than the rest.

Now, let's calculate the centroid of three stores that fit this description and use it as a candidate location for the new store:



In [28]:

    
from shapely import geometry

new_store_location = [
    geo_gdf.iloc[6].the_geom,
    geo_gdf.iloc[9].the_geom,
    geo_gdf.iloc[1].the_geom
]

# Create a polygon using three points from the geo_gdf
polygon = geometry.Polygon([[p.x, p.y] for p in new_store_location])
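
For three points, the polygon is a triangle, and its area centroid coincides with the plain average of its vertices. The sketch below checks this with a hand-written shoelace centroid, independent of shapely. The triangle coordinates are made up for the example (they are not store locations), and both function names are hypothetical. Working directly in longitude and latitude is a planar approximation, which is reasonable at the scale of a city.

```python
def polygon_centroid(vertices):
    """Area centroid of a simple polygon using the shoelace formula."""
    area_twice = 0.0
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area_twice += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area_twice == 0:
        raise ValueError('degenerate polygon: the points are collinear')
    # Centroid = sum / (6 * area), and area = area_twice / 2.
    return cx / (3 * area_twice), cy / (3 * area_twice)


def vertex_mean(vertices):
    """Plain average of the vertex coordinates."""
    n = len(vertices)
    return (sum(x for x, _ in vertices) / n,
            sum(y for _, y in vertices) / n)


# Made-up triangle, used only to compare the two computations.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]

centroid = polygon_centroid(triangle)
mean = vertex_mean(triangle)
print(centroid, mean)
```

With more than three stores the two results generally differ, so the polygon centroid used in the cell above is the safer choice if the list of candidate stores grows.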



In [29]:

    
from geopandas import GeoDataFrame, points_from_xy

new_store_gdf = GeoDataFrame({
    'name': ['New Store'],
    'geometry': points_from_xy([polygon.centroid.x], [polygon.centroid.y])
})
    
isochrones_new_gdf, isochrones_new_metadata = iso_service.isochrones(new_store_gdf, [300, 900, 1800], mode='walk')









    



Success! Isolines created correctly



In [30]:

    
from cartoframes.viz import Map, Layer, size_continuous_style

Map([
    Layer(
        isochrones_gdf,
        basic_style(opacity=0.2),
        title='Walking Time - Current'
    ),
    Layer(
        isochrones_new_gdf,
        basic_style(opacity=0.7),
        title='Walking Time - New'
    ),
    Layer(
        geo_gdf,
        size_continuous_style(
            'revenue',
            color='white',
            opacity=0.2,
            stroke_color='blue',
            size_range=[20, 80]
        ),
        popup_hover=[
            popup_element('address', 'Address'),
            popup_element('gc_status_rel', 'Precision'),
            popup_element('revenue', 'Revenue')
        ],
        title='Revenue $',
    ),
    Layer(new_store_gdf)
])

Conclusion

In this example you've seen how to use Location Data Services to perform a trade area analysis using CARTOframes built-in functionality without leaving the notebook.

Using the results, we've calculated a possible new location for a store, and used the isoline areas to help in the decision making process.

Take into account that finding optimal spots for new stores is not an easy task and requires more analysis, but this is a great first step!
