Searching datasets

erddapy exposes ERDDAP's form-like search through the search_for keyword of get_search_url.



In [1]:

    
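def show_iframe(src):
    """Embed the page at `src` in the notebook as an iframe."""
    from IPython.display import HTML
    iframe = '<iframe src="{src}" width="100%" height="950"></iframe>'.format
    return HTML(iframe(src=src))


def to_df(url):
    """Load the CSV response at `url` into a pandas DataFrame."""
    import pandas as pd
    return pd.read_csv(url)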



In [2]:

    
from erddapy import ERDDAP


e = ERDDAP(
    server="https://upwell.pfeg.noaa.gov/erddap",
    protocol="tabledap"
)

Single word search.



In [3]:

    
search_for = "fukushima"

url = e.get_search_url(search_for=search_for, response="csv")

to_df(url)["Dataset ID"]

    Out[3]:

0    northerngulfinstitute_edac_dap3_0a94_4f88_8950
1    northerngulfinstitute_edac_dap3_0bc3_0230_8add
2    northerngulfinstitute_edac_dap3_2689_8c24_7dcb
3                               whoi_7a97_cb6f_a9db
4                               whoi_4a75_e5e1_6640
5              northerngulfinstitute_1412_d11d_1e9b
6              northerngulfinstitute_a8f3_c2d4_2227
Name: Dataset ID, dtype: object
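
To make the request less opaque, here is a minimal sketch of what an ERDDAP full-text search URL can look like when built by hand. The helper build_search_url is hypothetical and is not part of erddapy. The URL that e.get_search_url actually returns may use a different endpoint and carry more parameters, so treat this only as an illustration of how search_for ends up in the query string.

```python
from urllib.parse import urlencode


def build_search_url(server, search_for, response="csv", items_per_page=1000):
    """Hypothetical helper: build an ERDDAP full-text search URL by hand."""
    # urlencode escapes spaces as "+" and double quotes as "%22".
    query = urlencode(
        {"page": 1, "itemsPerPage": items_per_page, "searchFor": search_for}
    )
    return f"{server}/search/index.{response}?{query}"


server = "https://upwell.pfeg.noaa.gov/erddap"
single_word_url = build_search_url(server, "fukushima")
phrase_url = build_search_url(server, '"wind speed"')

print(single_word_url)
print(phrase_url)
```

Any of these URLs can be passed to to_df in the same way as the URL returned by erddapy.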

Filtering the search with extra words.



In [4]:

    
search_for = "fukushima velocity"

url = e.get_search_url(search_for=search_for, response="csv")

to_df(url)["Dataset ID"]

    Out[4]:

0    northerngulfinstitute_edac_dap3_0a94_4f88_8950
1                               whoi_7a97_cb6f_a9db
2              northerngulfinstitute_a8f3_c2d4_2227
Name: Dataset ID, dtype: object

Filtering the search by excluding words: prefix a word with - to drop the datasets that match it.



In [5]:

    
search_for = "fukushima -velocity"

url = e.get_search_url(search_for=search_for, response="csv")

to_df(url)["Dataset ID"]

    Out[5]:

0    northerngulfinstitute_edac_dap3_0bc3_0230_8add
1    northerngulfinstitute_edac_dap3_2689_8c24_7dcb
2                               whoi_4a75_e5e1_6640
3              northerngulfinstitute_1412_d11d_1e9b
Name: Dataset ID, dtype: object
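
As a quick sanity check, the three result lists above can be compared with plain Python sets. The sketch below copies the dataset IDs from Out[3], Out[4], and Out[5] by hand, so it runs without network access. It shows that the "fukushima velocity" and "fukushima -velocity" results split the "fukushima" results into two non-overlapping groups. The variable names are only for illustration.

```python
# Dataset IDs copied from the outputs above.
fukushima = {
    "northerngulfinstitute_edac_dap3_0a94_4f88_8950",
    "northerngulfinstitute_edac_dap3_0bc3_0230_8add",
    "northerngulfinstitute_edac_dap3_2689_8c24_7dcb",
    "whoi_7a97_cb6f_a9db",
    "whoi_4a75_e5e1_6640",
    "northerngulfinstitute_1412_d11d_1e9b",
    "northerngulfinstitute_a8f3_c2d4_2227",
}
with_velocity = {
    "northerngulfinstitute_edac_dap3_0a94_4f88_8950",
    "whoi_7a97_cb6f_a9db",
    "northerngulfinstitute_a8f3_c2d4_2227",
}
without_velocity = {
    "northerngulfinstitute_edac_dap3_0bc3_0230_8add",
    "northerngulfinstitute_edac_dap3_2689_8c24_7dcb",
    "whoi_4a75_e5e1_6640",
    "northerngulfinstitute_1412_d11d_1e9b",
}

# The two filtered searches never return the same dataset ...
no_overlap = with_velocity.isdisjoint(without_velocity)
# ... and together they cover every "fukushima" hit.
covers_all = (with_velocity | without_velocity) == fukushima

print(no_overlap, covers_all)
```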

Quoted search, or "phrase search". First, let us try the unquoted search.



In [6]:

    
search_for = "wind speed"

url = e.get_search_url(search_for=search_for, response="csv")

len(to_df(url)["Dataset ID"])

    Out[6]:

600

That returns too many datasets because the words wind and speed are matched separately, not only as the phrase wind speed. Now let's use a quoted search to keep only the datasets that match the exact phrase wind speed.



In [7]:

    
search_for = '"wind speed"'

url = e.get_search_url(search_for=search_for, response="csv")

len(to_df(url)["Dataset ID"])

    Out[7]:

569
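
The three search styles shown above (plain words, excluded words, and quoted phrases) can be mimicked locally with a toy matcher. This is only a rough model for building intuition, not how ERDDAP implements its search. The functions parse_search and matches and the sample records are made up for this sketch, and matching is reduced to a case-insensitive substring test.

```python
import shlex


def parse_search(search_for):
    """Split a search string into required and excluded terms.

    shlex.split keeps a double-quoted phrase together as one term.
    """
    required, excluded = [], []
    for token in shlex.split(search_for):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            required.append(token.lower())
    return required, excluded


def matches(text, search_for):
    """Return True if text has every required term and no excluded term."""
    required, excluded = parse_search(search_for)
    text = text.lower()
    has_required = all(term in text for term in required)
    has_excluded = any(term in text for term in excluded)
    return has_required and not has_excluded


# Made-up metadata strings, only for illustration.
records = [
    "Wind speed at buoy A",
    "Speed of sound; wind direction",
    "Sea surface temperature",
]

unquoted = [r for r in records if matches(r, "wind speed")]
quoted = [r for r in records if matches(r, '"wind speed"')]
excluding = [r for r in records if matches(r, "wind -sound")]

print(len(unquoted), len(quoted), len(excluding))
```

In this toy data the unquoted search keeps both records that mention wind and speed anywhere, while the quoted search keeps only the record that contains the exact phrase.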

This example is written as a Jupyter Notebook, which can be downloaded and run locally or run as a live instance.