
Elasticsearch Autocomplete Example

This post will help you create an autocomplete feature in Elasticsearch.

For example, if you have a select box and you type "data", you would want to get all results that start with "data".


This can be done easily in Elasticsearch. The important thing here is the analyzer and tokenizer that you use for autocomplete.

Step 1: Create a mapping with the following tokenizer and analyzer

PUT auto-test
{
  "settings": {
    "analysis": {
      "analyzer": {
        "autocomplete": {
          "tokenizer": "autocomplete",
          "filter": [
            "lowercase"
          ]
        },
        "autocomplete_search": {
          "tokenizer": "lowercase"
        }
      },
      "tokenizer": {
        "autocomplete": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": [
            "letter"
          ]
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "technology": {
          "type": "text",
          "analyzer": "autocomplete",
          "search_analyzer": "autocomplete_search"
        }
      }
    }
  }
}

The autocomplete_search analyzer is applied at search time; its lowercase tokenizer keeps the query as a single token and lowercases it, which makes searches case-insensitive.
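You can verify this by running the query-time analyzer directly against the auto-test index created above:

POST auto-test/_analyze
{
  "analyzer": "autocomplete_search",
  "text": "Data"
}

This returns a single token, data, so a mixed-case query still matches the lowercased edge n-grams in the index.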

Step 2: Test Analyzers

POST auto-test/_analyze
{
  "field": "technology",
  "text": "database"
}

As you can see in the results, ES creates the following tokens. You can change min_gram and max_gram based on your needs.

{
  "tokens": [
    {
      "token": "da",
      "start_offset": 0,
      "end_offset": 2,
      "type": "word",
      "position": 0
    },
    {
      "token": "dat",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 1
    },
    {
      "token": "data",
      "start_offset": 0,
      "end_offset": 4,
      "type": "word",
      "position": 2
    },
    {
      "token": "datab",
      "start_offset": 0,
      "end_offset": 5,
      "type": "word",
      "position": 3
    },
    {
      "token": "databa",
      "start_offset": 0,
      "end_offset": 6,
      "type": "word",
      "position": 4
    },
    {
      "token": "databas",
      "start_offset": 0,
      "end_offset": 7,
      "type": "word",
      "position": 5
    },
    {
      "token": "database",
      "start_offset": 0,
      "end_offset": 8,
      "type": "word",
      "position": 6
    }
  ]
}
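One thing to watch: max_gram (10 here) caps the longest indexed token. For example, analyzing a longer word such as elasticsearch (13 letters):

POST auto-test/_analyze
{
  "field": "technology",
  "text": "elasticsearch"
}

The longest token returned is elasticsea, so a match query for the full word elasticsearch would find nothing, because the query-time analyzer emits the whole word as a single token. Raise max_gram if you need to match longer prefixes.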

Step 3: Index data and search

POST auto-test/test
{
  "technology": "data"
}

POST auto-test/test
{
  "technology": "database"
}
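Note: newly indexed documents become searchable only after an index refresh, which happens every second by default. If the search below returns no hits at first, you can trigger a refresh manually:

POST auto-test/_refresh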
GET auto-test/_search
{
  "query": {
    "match": {
      "technology": "dat"
    }
  }
}

The results should be:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "auto-test",
        "_type": "test",
        "_id": "3woWB2YBsF1yfhimkgKz",
        "_score": 0.2876821,
        "_source": {
          "technology": "data"
        }
      },
      {
        "_index": "auto-test",
        "_type": "test",
        "_id": "xGsWB2YBVxVaheWYfgCc",
        "_score": 0.2876821,
        "_source": {
          "technology": "database"
        }
      }
    ]
  }
}
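Since autocomplete_search lowercases the query, searching with different casing returns the same two hits. For example:

GET auto-test/_search
{
  "query": {
    "match": {
      "technology": "DAT"
    }
  }
}

The query is tokenized to dat, which matches the indexed edge n-grams of both data and database.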