I am new to Elasticsearch and I am trying to build a demo of the completion suggester with a whitespace analyzer.

According to the documentation, the whitespace analyzer breaks text into terms whenever it encounters a whitespace character. So my question is: does it work with the completion suggester too?

For the completion suggester prefix "ela", I expect the output "Hello elastic search."

I know an easy workaround is to supply each word as a separate input:

"suggest": {
    "input": ["Hello", "elastic", "search"]
}
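
As a rough sketch of that workaround, the input array could be generated from the phrase on the client side before indexing. The helper name `build_suggest_field` and the choice to strip surrounding punctuation are assumptions made for illustration, not anything Elasticsearch does for you:

```python
import string


def build_suggest_field(phrase):
    """Split a phrase into words and return a completion-style field body.

    Hypothetical client-side helper: surrounding punctuation is stripped
    and duplicate words (case-insensitive) are dropped, keeping order.
    """
    seen = set()
    inputs = []
    for word in phrase.split():
        cleaned = word.strip(string.punctuation)
        if cleaned and cleaned.lower() not in seen:
            seen.add(cleaned.lower())
            inputs.append(cleaned)
    return {"input": inputs}


suggest_field = build_suggest_field("Hello elastic search.")
```

With this, every word becomes its own entry point for prefix matching, at the cost of the suggester returning the single word rather than the whole phrase.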

However, if this is the solution, then what is the point of using an analyzer? Does an analyzer make sense in the completion suggester at all?

My mapping:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "completion_analyzer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ],
          "tokenizer": "whitespace"
        }
      }
    }
  },
  "mappings": {
    "my-type": {
      "properties": {
        "mytext": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "suggest": {
          "type": "completion",
          "analyzer": "completion_analyzer",
          "search_analyzer": "completion_analyzer",
          "max_input_length": 50
        }
      }
    }
  }
}

My document:

{
    "_index": "my-index",
    "_type": "my-type",
    "_id": "KTWJBGEBQk_Zl_sQdo9N",
    "_score": 1,
    "_source": {
        "mytext": "dummy text",
        "suggest": {
            "input": "Hello elastic search."
        }
    }
}

Search request:

{
    "suggest": {
        "test-suggest": {
            "prefix": "ela",
            "completion": {
                "field": "suggest",
                "skip_duplicates": true
            }
        }
    }
}

This search does not return the expected output, but if I use the prefix "hel" I get the correct output: "Hello elastic search."

In brief, I would like to know whether the whitespace analyzer works with the completion suggester, and if there is a way to make it work, could you please suggest it?

PS: I have already looked at these links, but I didn't find a useful answer:

ElasticSearch completion suggester Standard Analyzer not working

What Elasticsearch Analyzer to use for this completion suggester?

I found this link useful: Word-oriented completion suggester (ElasticSearch 5.x). However, it does not use the completion suggester.

Thanks in advance.

Jimmy

  • I'm having exactly the same issue with a custom analyzer, it seems completion doesn't take into account the output of the analyzer somehow – pcambra Mar 02 '18 at 03:08
  • Yes, something is not right with the analyzer in the completion suggester. In the end I used the alternative of adding multiple tags to the input array instead of the approach I expected to work. – Jimmy Mar 03 '18 at 05:27
  • What I'm exploring for my use case are nGrams instead. – pcambra Mar 04 '18 at 05:55
  • @pcambra Oh, okay. I tried the above approach with edge n-grams and it worked correctly. You can try this link: https://stackoverflow.com/questions/41744712/word-oriented-completion-suggester-elasticsearch-5-x I implemented the procedure given in the accepted answer and it worked well for my requirement. The only drawback is that we can't skip duplicates with n-grams. – Jimmy Mar 05 '18 at 06:09

2 Answers

The completion suggester cannot perform full-text queries, which means that it cannot return suggestions based on words in the middle of a multi-word field.

From ElasticSearch itself:

The reason is that an FST query is not the same as a full text query. We can't find words anywhere within a phrase. Instead, we have to start at the left of the graph and move towards the right.
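
To make the left-to-right behaviour concrete, here is a deliberately simplified model in Python. It is not how Lucene's FST is implemented; it only assumes that the analyzed input is stored as one lowercased sequence and that a suggestion matches when the analyzed prefix is a prefix of that whole sequence. The function names are made up for this sketch:

```python
def analyze(text):
    """Mimic the question's completion_analyzer: whitespace tokenizer + lowercase."""
    return text.lower().split()


def completion_key(text):
    # Assumption: the tokens are kept in order as ONE entry, so only the
    # start of the first token is reachable by a prefix lookup.
    return " ".join(analyze(text))


def suggest(prefix, inputs):
    """Return the inputs whose analyzed form starts with the analyzed prefix."""
    key = completion_key(prefix)
    return [item for item in inputs if completion_key(item).startswith(key)]


single_input = ["Hello elastic search."]
from_hel = suggest("hel", single_input)   # matches: the entry starts with "hel"
from_ela = suggest("ela", single_input)   # no match: "ela" is in the middle

# With one input per word (the workaround from the question), "ela" matches,
# but only the single word comes back.
split_inputs = ["Hello", "elastic", "search"]
from_ela_split = suggest("ela", split_inputs)
```

In this model the analyzer still matters (it is why "HEL" and "hel" behave the same), but it does not create extra entry points for the later words.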

As you discovered, the best alternative to the completion suggester that can match the middle of fields is an edge n-gram filter.
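
A minimal sketch of what an edge n-gram filter does at index time, again as a plain Python simulation rather than actual Elasticsearch code. The `min_gram` and `max_gram` names mirror the filter's settings, but the default values used here (1 and 20) and the helper names are assumptions chosen for the example:

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the leading substrings of a token, from min_gram to max_gram chars."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_terms(text, min_gram=1, max_gram=20):
    """Index side: whitespace tokenize, lowercase, then expand every token."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token, min_gram, max_gram))
    return terms


def matches(query, text):
    """Search side: tokenize and lowercase only (no n-grams), then require
    every query token to be one of the indexed terms."""
    terms = index_terms(text)
    return all(token in terms for token in query.lower().split())


grams = edge_ngrams("elastic", 1, 3)
hit_middle_word = matches("ela", "Hello elastic search.")
hit_first_word = matches("hel", "Hello elastic search.")
miss_inner_chars = matches("las", "Hello elastic search.")
```

Because each word contributes its own prefixes, "ela" now finds the document, while a fragment from inside a word such as "las" still does not. The trade-off mentioned in the comments applies: this is an ordinary search rather than a suggester, so features like `skip_duplicates` are not available.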

Simon Tower

I know this question is ages old, but have you tried using multiple suggestions, one based on a prefix and the other based on a regex?

Something like:

{
    "suggest": {
        "test-suggest-exact" : {
            "prefix" :"ela", 
            "completion" : { 
                "field" : "suggest",
                "skip_duplicates": true
            }
        },
        "test-suggest-regex" : {
            "regex" :".*ela.*", 
            "completion" : { 
                "field" : "suggest",
                "skip_duplicates": true
            }
        }
    }
}

Use the results from the second suggestion when the first one is empty. The advantage is that the Elasticsearch suggester returns whole, meaningful phrases.
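
On the client side, that fallback could look roughly like the sketch below. It assumes the response has the usual suggest shape (a list of entries per suggestion name, each with an `options` list whose items carry a `text` field); the function name `pick_suggestions` and the sample response are hand-written for illustration, not real output:

```python
def pick_suggestions(response,
                     primary="test-suggest-exact",
                     fallback="test-suggest-regex"):
    """Return option texts from the primary suggestion, or from the
    fallback suggestion when the primary one has no options."""
    suggest = response.get("suggest", {})
    for name in (primary, fallback):
        texts = [
            option["text"]
            for entry in suggest.get(name, [])
            for option in entry.get("options", [])
        ]
        if texts:
            return texts
    return []


# Hand-written sample response (assumed shape, trimmed to the relevant keys).
sample_response = {
    "suggest": {
        "test-suggest-exact": [{"text": "ela", "options": []}],
        "test-suggest-regex": [
            {"text": ".*ela.*", "options": [{"text": "Hello elastic search."}]}
        ],
    }
}

chosen = pick_suggestions(sample_response)
```

Here the prefix suggestion is empty, so the texts from the regex suggestion are used instead.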

A shingle-based approach, which runs a full query search and then aggregates on the search terms, sometimes gives broken phrases that are contextually wrong. I can write more if you are interested.

karthikcru