10

How can I use Django pagination with elasticsearch-dsl? My code:

query = MultiMatch(query=q, fields=['title', 'body'], fuzziness='AUTO')

s = Search(using=elastic_client, index='post').query(query).sort('-created_at')
response = s.execute()

# this always reports a page count of 1
paginator = Paginator(response, 100)
page = request.GET.get('page')
try:
    posts = paginator.page(page)
except PageNotAnInteger:
    posts = paginator.page(1)
except EmptyPage:
    posts = paginator.page(paginator.num_pages)

Any solution for this?

Mirza Delic

4 Answers

14

I found this paginator on this link:

from django.core.paginator import Paginator, Page

class DSEPaginator(Paginator):
    """
    Override Django's built-in Paginator class to take in a count/total number of items;
    Elasticsearch provides the total as a part of the query results, so we can minimize hits.
    """
    def __init__(self, *args, **kwargs):
        super(DSEPaginator, self).__init__(*args, **kwargs)
        self._count = self.object_list.hits.total

    def page(self, number):
        # this is overridden to prevent any slicing of the object_list - Elasticsearch has
        # returned the sliced data already.
        number = self.validate_number(number)
        return Page(self.object_list, number, self)

and then in the view I use:

    q = request.GET.get('q', None)
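    page = int(request.GET.get('page', '1'))
    # Slice with the same page size the paginator uses, so both stay in sync.
    start = (page - 1) * settings.POSTS_PER_PAGE
    end = start + settings.POSTS_PER_PAGE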

    query = MultiMatch(query=q, fields=['title', 'body'], fuzziness='AUTO')
    s = Search(using=elastic_client, index='post').query(query)[start:end]
    response = s.execute()

    paginator = DSEPaginator(response, settings.POSTS_PER_PAGE)
    try:
        posts = paginator.page(page)
    except PageNotAnInteger:
        posts = paginator.page(1)
    except EmptyPage:
        posts = paginator.page(paginator.num_pages)

This way it works perfectly.
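The slice sent to Elasticsearch has to match the page size given to the paginator, otherwise the page contents and the page count drift apart. A minimal sketch of that arithmetic; `page_bounds` is a hypothetical helper name, not part of Django or elasticsearch-dsl:

```python
def page_bounds(page, per_page):
    """Return the (start, end) slice bounds for a 1-based page number."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * per_page
    return start, start + per_page


# Bounds for page 3 with 20 items per page.
start, end = page_bounds(3, 20)
print(start, end)
```

In the view above, the result would be applied as `s[start:end]`, with `per_page` set to `settings.POSTS_PER_PAGE`.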

castis
Mirza Delic
    The `count` property in this example just shows the number of items on the page, not the total. You can override the `count` cached_property of the paginator to return `_count` as the total count. – Nasir Feb 09 '17 at 16:07
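One way to read the comment above: the paginator's count should come from the total that Elasticsearch reports, not from the length of the current page. A rough sketch of pulling out that total follows. `total_hits` is a hypothetical helper; it assumes `hits.total` is either a plain integer (older Elasticsearch) or an object with a `value` attribute (Elasticsearch 7 and later). The `SimpleNamespace` objects only stand in for a real response.

```python
from types import SimpleNamespace


def total_hits(response):
    """Return the total number of matching documents as an int."""
    total = response.hits.total
    # Elasticsearch 7+ wraps the total in an object; older versions give an int.
    return int(getattr(total, "value", total))


# Stand-ins for real responses, for illustration only.
old_style = SimpleNamespace(hits=SimpleNamespace(total=250))
new_style = SimpleNamespace(
    hits=SimpleNamespace(total=SimpleNamespace(value=250, relation="eq"))
)
print(total_hits(old_style), total_hits(new_style))
```

A `count` override on the paginator subclass could then return `total_hits(self.object_list)`.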
1

Following the advice from Danielle Madeley, I also created a proxy to search results which works well with the latest version of django-elasticsearch-dsl==0.4.4.

from django.utils.functional import LazyObject

class SearchResults(LazyObject):
    def __init__(self, search_object):
        self._wrapped = search_object

    def __len__(self):
        return self._wrapped.count()

    def __getitem__(self, index):
        search_results = self._wrapped[index]
        if isinstance(index, slice):
            search_results = list(search_results)
        return search_results

Then you can use it in your search view like this:

paginate_by = 20
search = MyModelDocument.search()
# ... do some filtering ...
search_results = SearchResults(search)

paginator = Paginator(search_results, paginate_by)
page_number = request.GET.get("page")
try:
    page = paginator.page(page_number)
except PageNotAnInteger:
    # If page parameter is not an integer, show first page.
    page = paginator.page(1)
except EmptyPage:
    # If page parameter is out of range, show last existing page.
    page = paginator.page(paginator.num_pages)

Django's LazyObject proxies all attributes and methods from the object assigned to the _wrapped attribute. I am overriding a couple of methods that are required by Django's paginator, but don't work out of the box with the Search() instances.
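The delegation idea can be shown without Django. The sketch below is a plain-Python stand-in, not `LazyObject` itself: unknown attributes fall through to the wrapped object, while `__len__` and `__getitem__` are overridden. `FakeSearch` and `SearchProxy` are hypothetical names used only for illustration.

```python
class FakeSearch:
    """Hypothetical stand-in for a Search object (illustration only)."""

    def __init__(self, docs):
        self.docs = docs
        self.index_name = "post"

    def count(self):
        return len(self.docs)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A real Search returns another lazy Search here, not a list.
            return FakeSearch(self.docs[index])
        return self.docs[index]

    def __iter__(self):
        return iter(self.docs)


class SearchProxy:
    """Forward everything to the wrapped object except len() and slicing."""

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, name):
        # Only called when normal lookup fails, so the overrides below win.
        if name == "_wrapped":
            raise AttributeError(name)
        return getattr(self._wrapped, name)

    def __len__(self):
        return self._wrapped.count()

    def __getitem__(self, index):
        results = self._wrapped[index]
        if isinstance(index, slice):
            # Materialize the slice so the paginator gets a concrete list.
            results = list(results)
        return results


proxy = SearchProxy(FakeSearch(["a", "b", "c", "d", "e"]))
print(len(proxy), proxy[0:2], proxy.index_name)
```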

Aidas Bendoraitis
1

A very simple solution is to use MultipleObjectMixin and extract your Elastic results in get_queryset() by overriding it. In this case Django will take care of the pagination itself if you add the paginate_by attribute.

It should look like this:

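from django.views.generic import ListView

# ListView already inherits MultipleObjectMixin; listing the mixin again
# before ListView raises a TypeError (inconsistent method resolution order).
class MyView(ListView):
    paginate_by = 10

    def get_queryset(self):
        # Query Elastic here and return the response data in `object_list`.
        # If you wish to add filters when querying Elastic,
        # you can use self.request.GET params here.
        object_list = []
        return object_list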

Note: The code above is generic and differs from my own case, so I cannot guarantee it works. I used a similar solution by inheriting other mixins, overriding get_queryset() and taking advantage of Django's built-in pagination, and it worked great for me. As it was an easy fix, I decided to post it here with a similar example.

Nikolay Shindarov
  • It also works great with DRF without any additional code. Just return the `Search` object from `get_queryset`, that's all. – stefanitsky Feb 01 '21 at 15:22
0

Another way forward is to create a proxy between the Paginator and the Elasticsearch query. Paginator requires two things, __len__ (or count) and __getitem__ (that takes a slice). A rough version of the proxy works like this:

class ResultsProxy(object):
    """
    A proxy object for returning Elasticsearch results that is able to be
    passed to a Paginator.
    """

    def __init__(self, es, index=None, body=None):
        self.es = es
        self.index = index
        self.body = body

    def __len__(self):
        result = self.es.count(index=self.index,
                               body=self.body)
        return result['count']

    def __getitem__(self, item):
        assert isinstance(item, slice)

        results = self.es.search(
            index=self.index,
            body=self.body,
            from_=item.start,
            size=item.stop - item.start,
        )

        return results['hits']['hits']

A proxy instance can be passed to Paginator and will make requests to ES as needed.
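To see which requests such a proxy sends, a stand-in client that records its calls is enough. Everything below is illustrative: `RecordingClient` is hypothetical and only mimics the `count` and `search` keyword arguments used above, and `fetch_page` is a simplified imitation of the slice a paginator asks for, not Django's implementation.

```python
class RecordingClient:
    """Hypothetical stand-in for an Elasticsearch client; records each call."""

    def __init__(self, docs):
        self.docs = docs
        self.calls = []

    def count(self, index=None, body=None):
        self.calls.append(("count", {}))
        return {"count": len(self.docs)}

    def search(self, index=None, body=None, from_=0, size=10):
        self.calls.append(("search", {"from_": from_, "size": size}))
        hits = [{"_source": d} for d in self.docs[from_:from_ + size]]
        return {"hits": {"hits": hits}}


class ResultsProxy:
    """Compact copy of the proxy above so this example runs on its own."""

    def __init__(self, es, index=None, body=None):
        self.es, self.index, self.body = es, index, body

    def __len__(self):
        return self.es.count(index=self.index, body=self.body)["count"]

    def __getitem__(self, item):
        assert isinstance(item, slice)
        results = self.es.search(index=self.index, body=self.body,
                                 from_=item.start, size=item.stop - item.start)
        return results["hits"]["hits"]


def fetch_page(object_list, number, per_page):
    """Slice out one 1-based page, clamping the end to the total length."""
    bottom = (number - 1) * per_page
    top = min(bottom + per_page, len(object_list))
    return object_list[bottom:top]


client = RecordingClient([{"title": f"post {i}"} for i in range(25)])
page = fetch_page(ResultsProxy(client, index="post"), number=3, per_page=10)
print(len(page), client.calls)
```

Each page costs one count request and one search request in this sketch; caching the count on the proxy would be one way to cut that down.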

Danielle Madeley