A Tutorial from Elasticsearch: The Definitive Guide

Installation

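  • pip install elasticsearch (installs the Python client only; the Elasticsearch server is a separate download)
  • ./bin/elasticsearch, run from the Elasticsearch install directory, to start the server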

Testing the installation

To test from a separate terminal window:

curl 'http://localhost:9200/?pretty'

Or test from within the notebook:


In [7]:
%%bash
curl 'http://localhost:9200/?pretty'


{
  "status" : 200,
  "name" : "White Tiger",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   339  100   339    0     0  26756      0 --:--:-- --:--:-- --:--:-- 28250
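
The same check can be scripted. The sketch below is a minimal illustration, not part of the original notebook: it parses a root-endpoint response and pulls out the status and version number. The helper name server_version and the constant SAMPLE_ROOT_RESPONSE are hypothetical; the sample body is copied from the curl output above so the code runs without a live server.

```python
import json

# Sample body copied from the curl output above. Against a live server the
# same text would come from http://localhost:9200/ (assumed default address).
SAMPLE_ROOT_RESPONSE = """
{
  "status" : 200,
  "name" : "White Tiger",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.4.4",
    "build_hash" : "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp" : "2015-02-19T13:05:36Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.3"
  },
  "tagline" : "You Know, for Search"
}
"""


def server_version(body):
    """Return (status, version number) from the root endpoint's JSON body."""
    info = json.loads(body)
    # "status" appears in this 1.x response; .get() tolerates its absence.
    return info.get("status"), info["version"]["number"]


print(server_version(SAMPLE_ROOT_RESPONSE))
```

With the Python client used later in this notebook, es.info() should return the same information as a dict.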

In [9]:
%%bash
curl -XGET 'localhost:9200/_count?pretty' -d '
{
    "query": {
        "match_all": {}
    }
}'


{
  "count" : 9,
  "_shards" : {
    "total" : 21,
    "successful" : 21,
    "failed" : 0
  }
}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   146  100    97  100    49   2727   1377 --:--:-- --:--:-- --:--:--  2771
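
The _count response reports both the number of matching documents and how many shards answered. As a rough sketch (the helper name summarize_count is hypothetical, and the sample dict is copied from the output above), the two figures can be read out like this:

```python
def summarize_count(response):
    """Return (document count, number of shards that did not succeed)."""
    shards = response["_shards"]
    return response["count"], shards["total"] - shards["successful"]


# Sample response copied from the _count output above.
sample_count = {"count": 9, "_shards": {"total": 21, "successful": 21, "failed": 0}}

count, incomplete = summarize_count(sample_count)
print("documents:", count, "shards not successful:", incomplete)
```

A nonzero second value would suggest the count is based on only part of the data.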

Tutorial


In [1]:
from datetime import datetime

In [2]:
from elasticsearch import Elasticsearch

In [3]:
# by default we connect to localhost:9200
es = Elasticsearch()

In [4]:
# datetimes will be serialized
es.index(index="my-index", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})


Out[4]:
{'_index': 'my-index',
 'created': False,
 '_id': '42',
 '_version': 6,
 '_type': 'test-type'}

In [5]:
# but not deserialized
es.get(index="my-index", doc_type="test-type", id=42)['_source']


Out[5]:
{'timestamp': '2015-03-10T11:02:10.519324', 'any': 'data'}
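
Because the client returns the timestamp as a plain string, turning it back into a datetime is left to the caller. One possible approach, assuming Python 3.7+ for datetime.fromisoformat and using a hypothetical helper name revive_timestamp, is sketched here on the _source dict shown above:

```python
from datetime import datetime


def revive_timestamp(source, field="timestamp"):
    """Return a copy of a _source dict with one ISO 8601 string parsed to datetime."""
    revived = dict(source)  # shallow copy, so the original dict is untouched
    revived[field] = datetime.fromisoformat(source[field])
    return revived


# _source copied from Out[5] above.
source = {'timestamp': '2015-03-10T11:02:10.519324', 'any': 'data'}

revived = revive_timestamp(source)
print(type(revived["timestamp"]).__name__, revived["timestamp"].year)
```

The same call also accepts timestamps without a fractional part, such as '2010-10-10T10:10:10'.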

Example from Elasticsearch Docs


In [6]:
doc = {
    'author': 'kimchy',
    'text': 'Elasticsearch: cool. bonsai cool.',
    'timestamp': datetime(2010, 10, 10, 10, 10, 10)
}

In [7]:
res = es.index(index="test-index", doc_type='tweet', id=1, body=doc)
print(res['created'])


False

In [8]:
res = es.get(index="test-index", doc_type='tweet', id=1)
print(res['_source'])


{'text': 'Elasticsearch: cool. bonsai cool.', 'timestamp': '2010-10-10T10:10:10', 'author': 'kimchy'}

In [9]:
es.indices.refresh(index="test-index")


Out[9]:
{'_shards': {'successful': 5, 'total': 10, 'failed': 0}}

In [10]:
res = es.search(index="test-index", body={"query": {"match_all": {}}})
print("Got %d Hits:" % res['hits']['total'])


Got 1 Hits:

In [11]:
for hit in res['hits']['hits']:
    print("%(timestamp)s %(author)s: %(text)s" % hit["_source"])


2010-10-10T10:10:10 kimchy: Elasticsearch: cool. bonsai cool.
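
The search response used above is a nested dict: res['hits']['total'] holds the hit count and res['hits']['hits'] holds the documents. The sketch below walks that structure on a sample dict so it runs without a server. The helper names total_hits and format_hits are hypothetical. It also assumes that newer Elasticsearch releases (7.x onward) report the total as an object with a "value" key rather than a bare integer, so it accepts both shapes.

```python
def total_hits(res):
    """Return the hit count whether 'total' is an int or a {'value': n} dict."""
    total = res["hits"]["total"]
    if isinstance(total, dict):
        return total["value"]
    return total


def format_hits(res):
    """Return one formatted line per hit, mirroring the loop above."""
    return ["%(timestamp)s %(author)s: %(text)s" % hit["_source"]
            for hit in res["hits"]["hits"]]


# Sample response shaped like the 1.x output in this notebook.
sample_res = {
    "hits": {
        "total": 1,
        "hits": [{"_source": {"author": "kimchy",
                              "text": "Elasticsearch: cool. bonsai cool.",
                              "timestamp": "2010-10-10T10:10:10"}}],
    }
}

print("Got %d Hits:" % total_hits(sample_res))
for line in format_hits(sample_res):
    print(line)
```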

In [23]:
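# Re-indexing an existing id increments _version; the cells here were run out of order, so this output shows 5 while Out[4] above shows 6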
es.index(index="my-index", doc_type="test-type", id=42, body={"any": "data", "timestamp": datetime.now()})


Out[23]:
{'created': False,
 '_type': 'test-type',
 '_version': 5,
 '_id': '42',
 '_index': 'my-index'}
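
The 'created' and '_version' fields together tell whether a call to es.index() wrote a new document or overwrote an existing one. A small sketch of reading them, using a hypothetical helper name describe_index_result and the response shape shown in this notebook (later Elasticsearch versions may report this differently):

```python
def describe_index_result(res):
    """Summarize an index response of the shape shown in this notebook."""
    if res["created"]:
        return "created %s (version %d)" % (res["_id"], res["_version"])
    return "updated %s to version %d" % (res["_id"], res["_version"])


# Response copied from Out[23] above.
sample_index_res = {'created': False, '_type': 'test-type', '_version': 5,
                    '_id': '42', '_index': 'my-index'}

print(describe_index_result(sample_index_res))
```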

In [12]:
es.get(index="my-index", doc_type="test-type", id=42)['_source']


Out[12]:
{'timestamp': '2015-03-10T11:02:10.519324', 'any': 'data'}

