
We are using Elassandra (Elasticsearch and Cassandra) and querying an Elasticsearch index through Presto. The _count API returns the correct count for the index every time, but when we count the same index with the following Presto query, the result varies from run to run:

select count(*) from elasticsearch.my_schema.idx
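To put numbers on the variation, a small harness along these lines could sample the Presto count repeatedly and compare it with the _count API. This is only a sketch under assumptions: the HTTP endpoint (port 9200 rather than the transport port 9300 used in table.json) is a guess, and `run_presto_count` in the usage note is a hypothetical callable that would run the query above through whatever Presto client is available.

```python
import json
import urllib.request
from collections import Counter
from typing import Callable, Dict, List


def es_count(base_url: str, index: str) -> int:
    """Return the document count reported by the Elasticsearch _count API.

    base_url is assumed to be the HTTP endpoint, e.g. http://host:9200.
    """
    with urllib.request.urlopen(f"{base_url}/{index}/_count") as resp:
        return json.loads(resp.read().decode("utf-8"))["count"]


def sample_counts(fetch: Callable[[], int], runs: int) -> List[int]:
    """Call fetch() `runs` times and collect every returned count."""
    return [fetch() for _ in range(runs)]


def summarize(expected: int, observed: List[int]) -> Dict[str, object]:
    """Summarize how often the observed counts differ from the expected one."""
    histogram = Counter(observed)
    return {
        "expected": expected,
        # Distinct count values seen across all runs, in ascending order.
        "distinct": sorted(histogram),
        # Number of runs whose count did not match the expected value.
        "mismatches": sum(n for value, n in histogram.items() if value != expected),
    }
```

A possible usage would be `summarize(es_count("http://10.XXX.XXX.XXX:9200", "idx"), sample_counts(run_presto_count, 20))`. If the distinct values cluster around fractions of the expected count, that would be worth mentioning alongside the shard layout.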

The index mapping is shown below:

{
  "idx": {
    "mappings": {
      "my_table": {
        "properties": {
          "col1": {
            "type": "keyword",
            "cql_collection": "singleton",
            "cql_partition_key": true,
            "cql_primary_key_order": 0
          },
          "col2": {
            "type": "keyword",
            "cql_collection": "singleton"
          }
        }
      }
    }
  }
}

Presto configuration:

1) elasticsearch.properties

connector.name=elasticsearch
elasticsearch.table-description-directory=etc/elasticsearch/
elasticsearch.scroll-size=1000
elasticsearch.scroll-timeout=30s
#elasticsearch.request-timeout=2s
elasticsearch.max-request-retries=10
elasticsearch.max-request-retry-time=90s
elasticsearch.max-hits=200000000

2) etc/elasticsearch/table.json

{
  "tableName": "my_table",
  "schemaName": "my_schema",
  "host": "10.XXX.XXX.XXX",
  "port": "9300",
  "clusterName": "my cluster",
  "index": "idx",
  "type": "my_table",
  "columns": [
      {
          "name": "col1",
          "type": "varchar",
          "jsonPath": "col1",
          "jsonType": "keyword"
      },
      {
          "name": "col2",
          "type": "varchar",
          "jsonPath": "col2",
          "jsonType": "keyword"
      }
  ]
}

We have a 3-node Elassandra cluster (that is, a 3-node Elasticsearch cluster). This index has 3 shards and a replication factor of 1. Versions:

Presto - 0.218
Elassandra - 6.2.3.21
Kibana - 6.2.3

Kindly help.

  • Can you try with Presto 330? There are many improvements in the Elasticsearch connector from the version you are running. – Martin Traverso Feb 23 '20 at 16:39
  • We tried Presto 330 and 328, but we faced the same issue with those versions as well. – Mananpreet Singh Feb 23 '20 at 16:41
  • Are you updating your indexes as you run those queries? Do the counts eventually reflect the correct number? – Martin Traverso Feb 23 '20 at 17:31
  • I have not done any write operations for many days, and the index is refreshed. The Elasticsearch count API gives a consistent and correct result for the index, but the Presto query on the same index varies. Is there a property I am missing in my table JSON or elasticsearch.properties file? – Mananpreet Singh Feb 24 '20 at 02:36
  • That shouldn’t happen, but it will be easier to troubleshoot this issue on the Presto Slack: https://prestosql.io/slack.html – Martin Traverso Feb 25 '20 at 03:48
