Using cURL with Elasticsearch

The introductory documents and tutorials all use cURL (hereafter referred to by its command-line name, curl) to interact with Elasticsearch and to demonstrate what is possible and what is returned. Below is a short collection of these exercises with some explanations.

Hello World!

The first example for Elasticsearch is almost always a simple GET with no parameters. It is a quick way to check that the environment and server are set up and functioning properly, hence the title.

The examples use an AWS instance; readers will need to change the server to either "localhost" for their personal machine or the URL of the Elasticsearch server they are using.


In [36]:
%%bash
curl -XGET "http://search-01.ec2.internal:9200/"


{
  "status" : 200,
  "name" : "search-01",
  "cluster_name" : "protoglobe",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   334  100   334    0     0  93662      0 --:--:-- --:--:-- --:--:--  163k
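
The same check can be scripted. The following is a minimal Python sketch, not part of the original notebook, that parses a root response like the one above and pulls out the fields worth checking. The helper name describe_server is hypothetical, and the sample string is abbreviated from the output shown above.

```python
import json

# Abbreviated copy of the root response shown above (Elasticsearch 1.4.4).
SAMPLE_ROOT_RESPONSE = """
{
  "status" : 200,
  "name" : "search-01",
  "cluster_name" : "protoglobe",
  "version" : {
    "number" : "1.4.4",
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}
"""


def describe_server(raw_response):
    """Return (is_ok, summary) for a root response shaped like the sample."""
    info = json.loads(raw_response)
    # The "status" field is present in the 1.x response above; other
    # versions may not include it, in which case is_ok is False.
    is_ok = info.get("status") == 200
    summary = "{} in cluster {} runs Elasticsearch {}".format(
        info["name"], info["cluster_name"], info["version"]["number"])
    return is_ok, summary


ok, summary = describe_server(SAMPLE_ROOT_RESPONSE)
print(ok, summary)
```

In practice the string would come from the curl call rather than from a constant.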

Count

Counting is faster than searching and should be used when the actual results are not needed. From "ElasticSearch Cookbook - Second Edition":

It is often required to return only the count of the matched results and not the results themselves. The advantages of using a count request is the performance it offers and reduced resource usage, as a standard search call also returns hits count.

The simplest count is a count of all the documents in Elasticsearch.


In [42]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/_count'


{"count":415755520,"_shards":{"total":214,"successful":214,"failed":0}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    71  100    71    0     0     83      0 --:--:-- --:--:-- --:--:--    83

The second simple count is a count of the documents in a single index. For the index gdelt1979:

Example 1

In [48]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gdelt1979/_count'


{"count":430941,"_shards":{"total":5,"successful":5,"failed":0}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    64  100    64    0     0   9683      0 --:--:-- --:--:-- --:--:-- 12800

Or, for the index holding the Global Summary of the Day data, gsod:

Example 2

In [47]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_count'


{"count":125411455,"_shards":{"total":5,"successful":5,"failed":0}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    67  100    67    0     0    108      0 --:--:-- --:--:-- --:--:--   108
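
The three count requests so far differ only in the path. As a small illustration, the Python sketch below builds those URLs from a base address and an optional index name. The function name count_url is an assumption made for this example; the host is the one used throughout this notebook.

```python
def count_url(base_url, index=None):
    """Build a _count URL for the whole cluster or for a single index."""
    parts = [base_url.rstrip("/")]
    if index:
        # Restrict the count to one index, e.g. gsod.
        parts.append(index)
    parts.append("_count")
    return "/".join(parts)


BASE_URL = "http://search-01.ec2.internal:9200"  # server used in this notebook

for index in (None, "gdelt1979", "gsod"):
    print(count_url(BASE_URL, index))
```

Passing no index gives the cluster-wide count; passing a name gives the per-index form.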

For easier-to-read output, append the pretty parameter to the request.


In [49]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_count?pretty'


{
  "count" : 125411455,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   103  100   103    0     0    270      0 --:--:-- --:--:-- --:--:--   271

Count Summary

Keep in mind that counts can be as complicated as searches: a count accepts the same kind of query as a search. Changing _count to _search in the URL, or vice versa, changes how Elasticsearch handles the request.
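
To make the point concrete, here is a hedged Python sketch in which one query is paired with either endpoint. The helper name request_parts is hypothetical, and match_all is used only because it is the simplest query; nothing is sent to a server.

```python
import json


def request_parts(base_url, index, endpoint, query):
    """Return the URL and JSON body for a _count or _search request."""
    if endpoint not in ("_count", "_search"):
        raise ValueError("endpoint must be _count or _search")
    url = "{}/{}/{}".format(base_url.rstrip("/"), index, endpoint)
    body = json.dumps({"query": query})
    return url, body


BASE_URL = "http://search-01.ec2.internal:9200"
match_all = {"match_all": {}}  # query that matches every document

count_url, count_body = request_parts(BASE_URL, "gsod", "_count", match_all)
search_url, search_body = request_parts(BASE_URL, "gsod", "_search", match_all)

print(count_url)
print(search_url)
print(count_body == search_body)  # the body is identical; only the URL differs
```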

With that said, it is time to develop some search examples.

Search is the main use for Elasticsearch, hence the name, and it is where the bulk of the examples will be. This notebook attempts to take the user through examples that introduce only one new feature at a time. This will hopefully make clear the order of the commands, which is unfortunately important to Elasticsearch.

As with count above, it starts with a simple example.


In [51]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search'


{"took":464,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":125411455,"max_score":1.0,"hits":[{"_index":"gsod","_type":"observation","_id":"AUvBKEuvDzV4m8XpNo3S","_score":1.0,"_source":{"Wind Speed": "10.4", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "16.9", "FRSHTT": "000000", "SLP": "1010.4", "Mean Temp": "45.3", "Dew Point": "40.9", "Max Temp": "55.4", "STP": "993.4", "Visibility": "14.8", "WBAN": "14714", "Date": "1948-04-28", "Station Id": "725038", "Num of Obs": "24", "Min Temp": "38.3"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEulDzV4m8XpNo3K","_score":1.0,"_source":{"Wind Speed": "8.3", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "13.0", "FRSHTT": "000000", "SLP": "1011.1", "Mean Temp": "63.6", "Dew Point": "48.7", "Max Temp": "80.1", "STP": "1007.5", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-04-24", "Station Id": "945680", "Num of Obs": "7", "Min Temp": "56.7"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEvfDzV4m8XpNo3r","_score":1.0,"_source":{"Wind Speed": "7.1", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "19.0", "Max Wind Speed": "14.0", "FRSHTT": "000000", "SLP": "1008.2", "Mean Temp": "87.6", "Dew Point": "57.8", "Max Temp": "100.4", "STP": "9999.9", "Visibility": "10.0", "WBAN": "99999", "Date": "2006-08-04", "Station Id": "722868", "Num of Obs": "24", "Min Temp": "75.2"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEwXDzV4m8XpNo4m","_score":1.0,"_source":{"Wind Speed": "4.0", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "7.0", "FRSHTT": "000000", "SLP": "1020.8", "Mean Temp": "52.2", "Dew Point": "33.1", "Max Temp": "71.6", "STP": "1017.2", "Visibility": "18.6", "WBAN": "99999", "Date": "2008-04-29", "Station Id": "945680", "Num of Obs": "8", "Min Temp": 
"33.8"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEwfDzV4m8XpNo4x","_score":1.0,"_source":{"Wind Speed": "3.5", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "8.9", "FRSHTT": "000000", "SLP": "1022.0", "Mean Temp": "56.7", "Dew Point": "43.7", "Max Temp": "75.0", "STP": "1018.3", "Visibility": "16.3", "WBAN": "99999", "Date": "2008-04-30", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "35.8"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEwsDzV4m8XpNo5F","_score":1.0,"_source":{"Wind Speed": "5.2", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "15.0", "FRSHTT": "000000", "SLP": "1012.2", "Mean Temp": "56.3", "Dew Point": "49.1", "Max Temp": "72.3", "STP": "995.3", "Visibility": "10.5", "WBAN": "14714", "Date": "1948-05-04", "Station Id": "725038", "Num of Obs": "24", "Min Temp": "41.4"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEwzDzV4m8XpNo5T","_score":1.0,"_source":{"Wind Speed": "3.2", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "11.1", "FRSHTT": "000000", "SLP": "1017.0", "Mean Temp": "61.0", "Dew Point": "49.1", "Max Temp": "81.1", "STP": "1013.3", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-05-03", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "43.5"}},{"_index":"gsod","_type":"observation","_id":"AUvBKEw6DzV4m8XpNo5c","_score":1.0,"_source":{"Wind Speed": "11.4", "Precipitation": "99.99", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "21.0", "FRSHTT": "110000", "SLP": "1007.4", "Mean Temp": "49.7", "Dew Point": "46.3", "Max Temp": "61.3", "STP": "990.5", "Visibility": "5.5", "WBAN": "14714", "Date": "1948-05-07", "Station Id": "725038", "Num of Obs": "23", "Min Temp": "46.4"}},{"_index":"gsod","_type":"observation","_id":"AUvBKExJDzV4m8XpNo51","_score":1.0,"_source":{"Wind Speed": "2.4", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "5.1", 
"FRSHTT": "000000", "SLP": "9999.9", "Mean Temp": "57.3", "Dew Point": "30.1", "Max Temp": "69.8", "STP": "9999.9", "Visibility": "18.2", "WBAN": "99999", "Date": "2005-12-29", "Station Id": "722749", "Num of Obs": "14", "Min Temp": "41.0"}},{"_index":"gsod","_type":"observation","_id":"AUvBKExNDzV4m8XpNo59","_score":1.0,"_source":{"Wind Speed": "2.2", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "6.0", "FRSHTT": "000000", "SLP": "1019.6", "Mean Temp": "58.6", "Dew Point": "46.2", "Max Temp": "76.8", "STP": "1015.9", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-05-09", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "39.0"}}]}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  4592  100  4592    0     0   9791      0 --:--:-- --:--:-- --:--:--  9832

By default, Elasticsearch returns 10 documents for every search. As is evident, the pretty option used for count above is needed here.


In [52]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty'


{
  "took" : 432,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 125411455,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEuvDzV4m8XpNo3S",
      "_score" : 1.0,
      "_source":{"Wind Speed": "10.4", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "16.9", "FRSHTT": "000000", "SLP": "1010.4", "Mean Temp": "45.3", "Dew Point": "40.9", "Max Temp": "55.4", "STP": "993.4", "Visibility": "14.8", "WBAN": "14714", "Date": "1948-04-28", "Station Id": "725038", "Num of Obs": "24", "Min Temp": "38.3"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEulDzV4m8XpNo3K",
      "_score" : 1.0,
      "_source":{"Wind Speed": "8.3", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "13.0", "FRSHTT": "000000", "SLP": "1011.1", "Mean Temp": "63.6", "Dew Point": "48.7", "Max Temp": "80.1", "STP": "1007.5", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-04-24", "Station Id": "945680", "Num of Obs": "7", "Min Temp": "56.7"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEvfDzV4m8XpNo3r",
      "_score" : 1.0,
      "_source":{"Wind Speed": "7.1", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "19.0", "Max Wind Speed": "14.0", "FRSHTT": "000000", "SLP": "1008.2", "Mean Temp": "87.6", "Dew Point": "57.8", "Max Temp": "100.4", "STP": "9999.9", "Visibility": "10.0", "WBAN": "99999", "Date": "2006-08-04", "Station Id": "722868", "Num of Obs": "24", "Min Temp": "75.2"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEwXDzV4m8XpNo4m",
      "_score" : 1.0,
      "_source":{"Wind Speed": "4.0", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "7.0", "FRSHTT": "000000", "SLP": "1020.8", "Mean Temp": "52.2", "Dew Point": "33.1", "Max Temp": "71.6", "STP": "1017.2", "Visibility": "18.6", "WBAN": "99999", "Date": "2008-04-29", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "33.8"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEwfDzV4m8XpNo4x",
      "_score" : 1.0,
      "_source":{"Wind Speed": "3.5", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "8.9", "FRSHTT": "000000", "SLP": "1022.0", "Mean Temp": "56.7", "Dew Point": "43.7", "Max Temp": "75.0", "STP": "1018.3", "Visibility": "16.3", "WBAN": "99999", "Date": "2008-04-30", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "35.8"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEwsDzV4m8XpNo5F",
      "_score" : 1.0,
      "_source":{"Wind Speed": "5.2", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "15.0", "FRSHTT": "000000", "SLP": "1012.2", "Mean Temp": "56.3", "Dew Point": "49.1", "Max Temp": "72.3", "STP": "995.3", "Visibility": "10.5", "WBAN": "14714", "Date": "1948-05-04", "Station Id": "725038", "Num of Obs": "24", "Min Temp": "41.4"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEwzDzV4m8XpNo5T",
      "_score" : 1.0,
      "_source":{"Wind Speed": "3.2", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "11.1", "FRSHTT": "000000", "SLP": "1017.0", "Mean Temp": "61.0", "Dew Point": "49.1", "Max Temp": "81.1", "STP": "1013.3", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-05-03", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "43.5"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEw6DzV4m8XpNo5c",
      "_score" : 1.0,
      "_source":{"Wind Speed": "11.4", "Precipitation": "99.99", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "21.0", "FRSHTT": "110000", "SLP": "1007.4", "Mean Temp": "49.7", "Dew Point": "46.3", "Max Temp": "61.3", "STP": "990.5", "Visibility": "5.5", "WBAN": "14714", "Date": "1948-05-07", "Station Id": "725038", "Num of Obs": "23", "Min Temp": "46.4"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKExJDzV4m8XpNo51",
      "_score" : 1.0,
      "_source":{"Wind Speed": "2.4", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "5.1", "FRSHTT": "000000", "SLP": "9999.9", "Mean Temp": "57.3", "Dew Point": "30.1", "Max Temp": "69.8", "STP": "9999.9", "Visibility": "18.2", "WBAN": "99999", "Date": "2005-12-29", "Station Id": "722749", "Num of Obs": "14", "Min Temp": "41.0"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKExNDzV4m8XpNo59",
      "_score" : 1.0,
      "_source":{"Wind Speed": "2.2", "Precipitation": "0.00G", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "6.0", "FRSHTT": "000000", "SLP": "1019.6", "Mean Temp": "58.6", "Dew Point": "46.2", "Max Temp": "76.8", "STP": "1015.9", "Visibility": "999.9", "WBAN": "99999", "Date": "2008-05-09", "Station Id": "945680", "Num of Obs": "8", "Min Temp": "39.0"}
    } ]
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  5153  100  5153    0     0  11820      0 --:--:-- --:--:-- --:--:-- 11873

Much better, but it is easy to see that if this notebook continues with the Elasticsearch default for the number of documents, it will become very unwieldy very quickly. So, let's use the size option.


In [54]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty' -d '
{
    "size": "1"
}'


{
  "took" : 429,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 125411455,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEuvDzV4m8XpNo3S",
      "_score" : 1.0,
      "_source":{"Wind Speed": "10.4", "Precipitation": "0.00I", "Snow Depth": "999.9", "Gust": "999.9", "Max Wind Speed": "16.9", "FRSHTT": "000000", "SLP": "1010.4", "Mean Temp": "45.3", "Dew Point": "40.9", "Max Temp": "55.4", "STP": "993.4", "Visibility": "14.8", "WBAN": "14714", "Date": "1948-04-28", "Station Id": "725038", "Num of Obs": "24", "Min Temp": "38.3"}
    } ]
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   717  100   697  100    20   1607     46 --:--:-- --:--:-- --:--:--  1609

In [64]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty' -d '
{
    "_source": ["Max Temp"],
    "size": "2"
}'


{
  "took" : 459,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 125411455,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEuvDzV4m8XpNo3S",
      "_score" : 1.0,
      "_source":{"Max Temp":"55.4"}
    }, {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBKEulDzV4m8XpNo3K",
      "_score" : 1.0,
      "_source":{"Max Temp":"80.1"}
    } ]
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   572  100   523  100    49   1116    104 --:--:-- --:--:-- --:--:--  1122
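
The request above combines size with _source, which limits each hit to the listed fields (here only "Max Temp"). Once the response is that small, pulling the values out is straightforward. The sketch below is an illustration only: the sample is abbreviated from the response above, and the helper name field_values is an assumption.

```python
import json

# Abbreviated copy of the response above: two hits, each with a trimmed _source.
SAMPLE_RESPONSE = json.loads("""
{
  "hits" : {
    "total" : 125411455,
    "hits" : [
      {"_id" : "AUvBKEuvDzV4m8XpNo3S", "_source" : {"Max Temp" : "55.4"}},
      {"_id" : "AUvBKEulDzV4m8XpNo3K", "_source" : {"Max Temp" : "80.1"}}
    ]
  }
}
""")


def field_values(response, field):
    """Collect one _source field from every hit, converted to float."""
    # The index stores the readings as strings, so convert them here.
    return [float(hit["_source"][field]) for hit in response["hits"]["hits"]]


print(field_values(SAMPLE_RESPONSE, "Max Temp"))
```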

In [2]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty' -d '
{
    "query": {
        "filtered": {
            "filter": {
                "range": {
                    "Date": {
                        "gte": "2007-01-01",
                        "lte": "2007-01-01" 
                    }
                }
            }
        }
    },
    "_source": ["Max Temp"],
    "size": "1"
}'


{
  "took" : 13,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 17598,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "gsod",
      "_type" : "observation",
      "_id" : "AUvBIX6YDzV4m8XpLVsY",
      "_score" : 1.0,
      "_source":{"Max Temp":"75.2"}
    } ]
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   685  100   357  100   328  21669  19908 --:--:-- --:--:-- --:--:-- 23800
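
The request above wraps a range filter on Date in a filtered query, so only observations from 2007-01-01 are returned. Because the later cells repeat this body with different dates, it can help to generate it. The following Python sketch does that; the function name date_range_search and its parameters are assumptions for illustration, and the filtered query shape is the 1.x form used in this notebook, which later major versions of Elasticsearch do not accept.

```python
import json


def date_range_search(start, end, fields=None, size=10):
    """Build a 1.x-style filtered query limited to start <= Date <= end."""
    body = {
        "query": {
            "filtered": {
                "filter": {
                    "range": {"Date": {"gte": start, "lte": end}}
                }
            }
        },
        "size": size,
    }
    if fields:
        # Only return the listed fields from each document.
        body["_source"] = list(fields)
    return body


body = date_range_search("2007-01-01", "2007-01-01", fields=["Max Temp"], size=1)
print(json.dumps(body, indent=4))
```

The printed JSON can be passed to curl with -d, as in the cell above.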

In [ ]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty' -d '
{
    "query": {
        "filtered": {
            "query": { "match_all": {} },
            "filter": {
                "range": {
                    "Date": {
                        "gte": "2007-01-01",
                        "lte": "2007-12-31" 
                    }
                }
            }
        }
    },
    "size": "1"
}'

In [1]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_count' -d '
{
    "query": {
        "filtered": {
            "filter": {
                "range": {
                    "Date": {
                        "gte": "2007-01-01",
                        "lte": "2007-01-31" 
                    }
                }
            }
        }
    }
}'


{"count":563280,"_shards":{"total":5,"successful":5,"failed":0}}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   346  100    64  100   282     13     58  0:00:04  0:00:04 --:--:--     0

In [2]:
%%bash
curl -XGET 'http://search-01.ec2.internal:9200/gsod/_search?pretty' -d '
{
    "query": {
        "filtered": {
            "query": { "match_all": {} },
            "filter": {
                "range": {
                    "Date": {
                        "gte": "2007-01-01",
                        "lte": "2007-01-31" 
                    }
                }
            }
        }
    },
    "_source": ["Mean Temp", "Min Temp", "Max Temp"],
    "size": "563280"
}' > temps_200701.txt


Process is terminated.
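
The cell above asked for all 563280 matching documents in a single response and was terminated before it finished. One way to make such a pull more manageable, offered here as a sketch rather than as what this notebook did, is to split it into smaller requests using the search API's from and size parameters. The helper name page_windows and the page size of 10000 are assumptions chosen for illustration.

```python
def page_windows(total, page_size):
    """Yield (from, size) pairs that together cover `total` documents."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, total, page_size):
        # The last window may be shorter than page_size.
        yield start, min(page_size, total - start)


# 563280 is the count returned for January 2007 above.
windows = list(page_windows(563280, 10000))
print(len(windows))
print(windows[0], windows[-1])
```

Each pair would go into the request body as "from" and "size". Deep from/size paging gets more expensive the further in it goes, so for exports this large the scan and scroll API may be a better fit.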

In [26]:
import json

# Expects pretty-printed search output, where each hit's _source is on its own line.
mean_temps = []
max_temps = []
min_temps = []

with open("temps_2007.txt", "r") as f:
    for line in f:
        if "_source" in line:
            # Drop the 16-character prefix (6 spaces plus '"_source":')
            # and the trailing newline, leaving only the JSON object.
            source = json.loads(line[16:-1])

            # Values such as 999.9 and 9999.9 appear to be missing-data
            # placeholders; the range checks below skip them.
            min_tmp = float(source['Min Temp'])
            if -300 < min_tmp < 300:
                min_temps.append(min_tmp)

            mean_tmp = float(source['Mean Temp'])
            if -300 < mean_tmp < 300:
                mean_temps.append(mean_tmp)

            max_tmp = float(source['Max Temp'])
            if -300 < max_tmp < 300:
                max_temps.append(max_tmp)

print("From {} observations the temperatures for 2007 are:"
      .format(len(mean_temps)))
print("Min Temp:  {:.1f}".format(min(min_temps)))
print("Mean Temp: {:.1f}".format(sum(mean_temps)/len(mean_temps)))
print("Max Temp:  {:.1f}".format(max(max_temps)))


Min Temp:  -114.3
Mean Temp: 54.6
Max Temp:  129.2
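
Slicing each line at a fixed offset depends on the exact layout of the pretty-printed output. A less fragile alternative, sketched here under assumptions, is to parse the whole response as JSON and walk the hits. The helper name valid_values is hypothetical; the first two sample hits reuse values shown earlier in this notebook, and the third is an invented placeholder row included only to show the filtering.

```python
import json

# Tiny stand-in for a saved search response. The first two hits reuse values
# shown earlier; the third is an invented row of missing-data placeholders.
SAMPLE = json.loads("""
{"hits": {"hits": [
  {"_source": {"Mean Temp": "45.3", "Min Temp": "38.3", "Max Temp": "55.4"}},
  {"_source": {"Mean Temp": "63.6", "Min Temp": "56.7", "Max Temp": "80.1"}},
  {"_source": {"Mean Temp": "9999.9", "Min Temp": "9999.9", "Max Temp": "9999.9"}}
]}}
""")


def valid_values(response, field, low=-300.0, high=300.0):
    """Return the float values of `field` that fall strictly between low and high."""
    values = (float(hit["_source"][field]) for hit in response["hits"]["hits"])
    return [v for v in values if low < v < high]


min_temps = valid_values(SAMPLE, "Min Temp")
mean_temps = valid_values(SAMPLE, "Mean Temp")
max_temps = valid_values(SAMPLE, "Max Temp")

print("Min Temp:  {:.1f}".format(min(min_temps)))
print("Mean Temp: {:.1f}".format(sum(mean_temps) / len(mean_temps)))
print("Max Temp:  {:.1f}".format(max(max_temps)))
```

With a real file, json.load on the open file handle would take the place of the inline sample, provided the file holds one complete response.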
