
I am currently trying to migrate a Solr-based application to Elasticsearch.

I have this Lucene query:

(( 
    name:(+foo +bar) 
    OR info:(+foo +bar) 
)) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)

As far as I understand this is a combination of MUST clauses combined with boolean OR:

"Get all documents containing (foo AND bar in name) OR (foo AND bar in info). After that filter results by condition state=1 and boost documents that have an image."

I have been trying to use a bool query with MUST but I am failing to get boolean OR into must clauses. Here is what I have:

GET /test/object/_search
{
  "from": 0,
  "size": 20,
  "sort": {
    "_score": "desc"
  },
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "foo"
          }
        },
        {
          "match": {
            "name": "bar"
          }
        }
      ],
      "must_not": [],
      "should": [
        {
          "match": {
            "has_image": {
              "query": 1,
              "boost": 100
            }
          }
        }
      ]
    }
  }
}

As you can see, MUST conditions for "info" are missing.

Does anyone have a solution?

Thank you so much.

** UPDATE **

I have updated my elasticsearch query and got rid of that function score. My base problem still exists.

Jesse
  • 2
    There is a good documentation on combining ElasticSearch queries here: https://www.elastic.co/guide/en/elasticsearch/guide/current/combining-filters.html – Mr.Coffee Feb 26 '18 at 12:13
  • As of v7.10, here's the new documentation on boolean queries: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html I do believe that documentation could be improved with examples to make it much more clear about simple cases like this OR question... – maganap Nov 26 '20 at 11:44

6 Answers

560
  • OR is spelled should
  • AND is spelled must
  • NOR is spelled must_not

Example:

You want to see all the items that are (round AND (red OR blue)):

{
    "query": {
        "bool": {
            "must": [
                {
                    "term": {"shape": "round"}
                },
                {
                    "bool": {
                        "should": [
                            {"term": {"color": "red"}},
                            {"term": {"color": "blue"}}
                        ]
                    }
                }
            ]
        }
    }
}

You can also do more complex versions of OR. For example, if you want to match at least 3 out of 5, you can specify 5 options under "should" and set "minimum_should_match" to 3.
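As a rough sketch of the "at least 3 out of 5" case, the query body can be built as a plain Python dict and dumped to JSON. The parameter is spelled `minimum_should_match` in the bool query; the helper `at_least_n_of`, the field name `tags`, and its five values are made up for illustration.

```python
import json


def at_least_n_of(field, values, n):
    """Build a bool query matching documents that satisfy at least n of the term clauses."""
    return {
        "query": {
            "bool": {
                # one optional clause per candidate value
                "should": [{"term": {field: value}} for value in values],
                # require at least n of the clauses above to match
                "minimum_should_match": n,
            }
        }
    }


# Hypothetical field and values, for illustration only.
body = at_least_n_of("tags", ["a", "b", "c", "d", "e"], 3)
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the request body of a `_search` call.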

Thanks to Glen Thompson and Sebastialonso for finding where my nesting wasn't quite right before.

Thanks also to Fatmajk for pointing out that "term" becomes "match" in ElasticSearch 6.

Louis
Daniel Fackrell
  • 1
    I believe this should be the accepted answer. This seems to be the simplest solution to the issue. Thank you. – Leland Cope Dec 08 '16 at 01:58
  • 2
    Would pulling the `should` into the upper-level `bool`, and including a `minimum_should_match: 1` work? – Sid Jan 30 '17 at 19:10
  • @Sid It would not. In that case, we would only require {"shape": "round"}, and the "should" would apply ordering to the results that had a shape of round, because at that level it's optional. We can only make it required by nesting under the "must". – Daniel Fackrell Feb 27 '17 at 22:33
  • 2
    @DanielFackrell please, fix missing curly braces for `should` values – arhak Apr 17 '17 at 03:18
  • Thanks for the catch, @arhak! – Daniel Fackrell Apr 18 '17 at 15:49
  • 18
    When I try this example I get back `[term] malformed query, expected [END_OBJECT] but found [FIELD_NAME]`. Is this somehow version dependent? – DanneJ Apr 20 '17 at 13:14
  • 32
    Why don't they add such a simple example and explanation in the docs! The example from the documentation is very confusing. – Nikhil Owalekar Nov 08 '17 at 18:39
  • 35
    After 6 months, reading all Elastic documentation, this is the first time I completely understand how to implement boolean logic. Official documentation lacks clarity in my opinion. – Sebastialonso May 17 '18 at 20:31
  • 2
    @DanneJ Apparently, you need to include the "shape":"round" element in a single object, and the inner "bool" part in a sibling object. That works, but it seems it's not the same query. – Sebastialonso May 17 '18 at 22:54
  • 3
    @Amir What inaccuracies can I clean up for you? In the context shown above, the default `minimum_should` is 1, and wrapping that in `bool` results in that group being true if at least one item matches, false if none match. My motivation for creating this answer was that I was solving exactly this kind of problem, and the available documentation and even the answers I could find on sites like this was unhelpful at best, so I kept researching until I felt I had a pretty solid grasp of what was going on. I gladly welcome any constructive pointers on how I can improve the answer further. – Daniel Fackrell Oct 14 '18 at 23:26
  • Excellent. I finally got it ;-) – JasonGenX Mar 22 '19 at 18:26
  • 1
    thanks @DanielFackrell, you save my day! Embed should query in must query is the way to go!!! – Eric Tan Sep 25 '20 at 05:30
  • Why this strange name for elasticsearch... – ch271828n Dec 22 '20 at 11:24
78

I finally managed to create a query that does exactly what I wanted:

A filtered nested boolean query. I am not sure why this is not documented. Maybe someone here can tell me?

Here is the query:

GET /test/object/_search
{
  "from": 0,
  "size": 20,
  "sort": {
    "_score": "desc"
  },
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "state": 1
              }
            }
          ]
        }
      },
      "query": {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {
                    "match": {
                      "name": "foo"
                    }
                  },
                  {
                    "match": {
                      "name": "bar"
                    }
                  }
                ],
                "should": [
                  {
                    "match": {
                      "has_image": {
                        "query": 1,
                        "boost": 100
                      }
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "must": [
                  {
                    "match": {
                      "info": "foo"
                    }
                  },
                  {
                    "match": {
                      "info": "bar"
                    }
                  }
                ],
                "should": [
                  {
                    "match": {
                      "has_image": {
                        "query": 1,
                        "boost": 100
                      }
                    }
                  }
                ]
              }
            }
          ],
          "minimum_should_match": 1
        }
      }    
    }
  }
}

In pseudo-SQL:

SELECT * FROM /test/object
WHERE 
    ((name=foo AND name=bar) OR (info=foo AND info=bar))
AND state=1

Please keep in mind that how name=foo is handled internally depends on your field analysis and mappings. The behavior can range from fuzzy to strict.

"minimum_should_match": 1 says that at least one of the should clauses must match.

The following clause means that any document in the result set containing has_image:1 is boosted by a factor of 100. This changes the result ordering.

"should": [
  {
    "match": {
      "has_image": {
        "query": 1,
        "boost": 100
      }
    }
   }
 ]
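The `filtered` query was deprecated in Elasticsearch 2.0 and removed in 5.0. A minimal sketch of the same logic for later versions, written as a Python dict so it can be dumped to JSON, might look like this: the `state` condition moves to a `filter` clause, the name/info alternatives become a nested `bool` under `must`, and the `has_image` boost becomes a single top-level `should`. Field names come from the question; the helper `both_terms` is hypothetical, and scores may differ slightly from the query above because the boost clause is no longer repeated per branch.

```python
import json


def both_terms(field, terms):
    """Nested bool requiring every term to match in the given field (AND)."""
    return {"bool": {"must": [{"match": {field: term}} for term in terms]}}


terms = ["foo", "bar"]
body = {
    "from": 0,
    "size": 20,
    "query": {
        "bool": {
            # (name has foo AND bar) OR (info has foo AND bar)
            "must": [
                {
                    "bool": {
                        "should": [both_terms("name", terms), both_terms("info", terms)],
                        "minimum_should_match": 1,
                    }
                }
            ],
            # state = 1, applied without contributing to the score
            "filter": [{"term": {"state": 1}}],
            # optional clause: only boosts documents that have an image
            "should": [{"match": {"has_image": {"query": 1, "boost": 100}}}],
        }
    },
}

print(json.dumps(body, indent=2))
```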

Have fun guys :)

Jesse
  • 33
    Holy crap. Does anyone have a better solution? Thanks for posting this, but that is absolutely way too much complexity to achieve a Logical OR in a query. – nackjicholson Sep 07 '16 at 07:07
  • thnx, you saved my day ) – cubbiu Oct 28 '16 at 16:48
  • 3
    Not only is this query unneccesarily long, its using deprecated syntax. @daniel-fackrell answer should be the accepted one. – Eric Alford Feb 22 '17 at 00:09
  • 5
    @EricAlford This answer from 2015 is based on a earlier version of ES. Feel free to provide a better solution. – Jesse Feb 23 '17 at 08:07
  • 1
    Idea: Take over / fork ElasticSearch, rewrite it in a user-friendly way, add simple query language to it, WIN! We just need funding. I'm in! Who else ? – Sliq Sep 30 '19 at 22:36
  • @Sliq what about this - https://www.elastic.co/what-is/elasticsearch-sql – Damien Roche Oct 18 '20 at 20:09
32

This is how you can nest multiple bool queries inside one outer bool query, using Kibana:

  • bool indicates we are using a boolean query
  • must is for AND
  • should is for OR

GET my_index/my_type/_search
{
  "query": {
    "bool": {                  // bool indicates we are using a boolean query
      "must": [                // must is for AND
        {
          "match": {
            "description": "some text"
          }
        },
        {
          "match": {
            "type": "some Type"
          }
        },
        {
          "bool": {            // a nested boolean query
            "should": [        // should is for OR
              {
                "match": {
                  // your first query
                }
              },
              {
                "match": {
                  // your second query
                }
              }
            ]
          }
        }
      ]
    }
  }
}

This is how you can nest a query in ES.

There are more clause types in "bool", such as:

  1. filter
  2. must_not
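A short sketch of those two clause types, again as a Python dict dumped to JSON. The `description` and `type` fields are reused from the query above; the `status` field and its value are assumptions made up for this example.

```python
import json

body = {
    "query": {
        "bool": {
            "must": [{"match": {"description": "some text"}}],
            # filter: the clause must match, but it does not contribute to the score
            "filter": [{"match": {"type": "some Type"}}],
            # must_not: excludes matching documents (NOT); "status" is a made-up field
            "must_not": [{"term": {"status": "archived"}}],
        }
    }
}

print(json.dumps(body, indent=2))
```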
sushanth
niranjan harpale
  • 1
    Your answer is exactly right, But it's bit clumsy, it's a small suggestion for you if you like - you have to edit it properly. Probably it gives you more like on this answer :) Have a nice day. – Dhwanil Patel Apr 30 '20 at 05:58
7

I recently had to solve this problem too, and after a LOT of trial and error I came up with this (in PHP, but maps directly to the DSL):

'query' => [
    'bool' => [
        'should' => [
            ['prefix' => ['name_first' => $query]],
            ['prefix' => ['name_last' => $query]],
            ['prefix' => ['phone' => $query]],
            ['prefix' => ['email' => $query]],
            [
                'multi_match' => [
                    'query' => $query,
                    'type' => 'cross_fields',
                    'operator' => 'and',
                    'fields' => ['name_first', 'name_last']
                ]
            ]
        ],
        'minimum_should_match' => 1,
        'filter' => [
            ['term' => ['state' => 'active']],
            ['term' => ['company_id' => $companyId]]
        ]
    ]
]

Which maps to something like this in SQL:

SELECT * from <index> 
WHERE (
    name_first LIKE '<query>%' OR
    name_last LIKE '<query>%' OR
    phone LIKE  '<query>%' OR
    email LIKE '<query>%'
)
AND state = 'active'
AND company_id = <companyId>

The key in all this is the minimum_should_match setting. Without it, the should clauses become optional as soon as a filter is present, so the filter alone decides which documents match.
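The default behind this can be modelled in a few lines. The sketch below is a toy evaluator, not Elasticsearch code: it treats each clause as a Python predicate and applies the documented default, where `minimum_should_match` is 1 when a bool query has `should` clauses but no `must` or `filter`, and 0 otherwise. All function names here are hypothetical.

```python
def bool_matches(doc, must=(), filters=(), should=(), must_not=(), minimum_should_match=None):
    """Toy model of bool query matching; each clause is a predicate taking the document."""
    if minimum_should_match is None:
        # Documented default: 1 only if there are should clauses and no must/filter.
        minimum_should_match = 1 if should and not (must or filters) else 0
    if not all(clause(doc) for clause in list(must) + list(filters)):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    return sum(1 for clause in should if clause(doc)) >= minimum_should_match


def prefix(field, value):
    """Predicate: the field's value starts with the given prefix."""
    return lambda doc: str(doc.get(field, "")).startswith(value)


def term(field, value):
    """Predicate: the field's value equals the given value exactly."""
    return lambda doc: doc.get(field) == value


doc = {"name_first": "Alice", "state": "active"}
should = [prefix("name_first", "Bob"), prefix("name_last", "Bob")]
filters = [term("state", "active")]

# No should clause matches, but the filter does: the document still matches by default.
print(bool_matches(doc, filters=filters, should=should))  # True
# With minimum_should_match=1 the document is rejected.
print(bool_matches(doc, filters=filters, should=should, minimum_should_match=1))  # False
```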

Hope this helps someone!

Benjamin Dowson
5

If you were using Solr's default query parser or the Lucene query parser, you can pretty much always put your query into a query_string query:

POST test/_search
{
  "query": {
    "query_string": {
      "query": "(( name:(+foo +bar) OR info:(+foo +bar)  )) AND state:(1) AND (has_image:(0) OR has_image:(1)^100)"
    }
  }
}

That said, you may want to use a boolean query, like the one you already posted, or even a combination of the two.
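One way to read "a combination of the two" is to keep the free-text part in `query_string` and move the exact conditions into a bool query. A sketch under that assumption, built as a Python dict and dumped to JSON; the split between clauses is an illustration, not necessarily what was meant.

```python
import json

body = {
    "query": {
        "bool": {
            # free-text part stays in Lucene syntax
            "must": [
                {"query_string": {"query": "name:(+foo +bar) OR info:(+foo +bar)"}}
            ],
            # exact condition, applied without affecting the score
            "filter": [{"term": {"state": 1}}],
            # optional clause that boosts documents with an image
            "should": [{"term": {"has_image": {"value": 1, "boost": 100}}}],
        }
    }
}

print(json.dumps(body, indent=2))
```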

Radu Gheorghe
0
$filterQuery = $this->queryFactory->create(QueryInterface::TYPE_BOOL, ['must' => $queries,'should'=>$queriesGeo]);

In must, add the array of query conditions that should be combined with AND; in should, add the query conditions that should be combined with OR.

You can check this: https://github.com/Smile-SA/elasticsuite/issues/972

Sebastian D'Agostino