
I have an index on AWS Elasticsearch with shards that became unassigned due to NODE_LEFT. Here is the output of _cat/shards:

rawindex-2017.07.04                     1 p STARTED    
rawindex-2017.07.04                     3 p UNASSIGNED NODE_LEFT
rawindex-2017.07.04                     2 p STARTED    
rawindex-2017.07.04                     4 p STARTED    
rawindex-2017.07.04                     0 p STARTED    

Under normal circumstances, it would be easy to reassign these shards using the _cluster or _settings APIs. However, these are exactly the APIs that AWS does not allow. I get the following message:

{
    Message: "Your request: '/_settings' is not allowed."
}

According to an answer to a very similar question, I can change the setting of an index using _index API, which is allowed by AWS. However, it seems like index.routing.allocation.disable_allocation is not valid for Elasticsearch 5.x, which I am running. I get the following error:

{
    "error": {
        "root_cause": [
            {
                "type": "remote_transport_exception",
                "reason": "[enweggf][x.x.x.x:9300][indices:admin/settings/update]"
            }
        ],
        "type": "illegal_argument_exception",
        "reason": "unknown setting [index.routing.allocation.disable_allocation] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
    },
    "status": 400
}

I tried prioritizing index recovery with a high index.priority, as well as setting index.unassigned.node_left.delayed_timeout to 1 minute, but I still cannot get the shards reassigned.

Is there any way (dirty or elegant) to achieve this on AWS managed ES?

Thanks!

Souradeep
  • With AWS ES and its limited flexibility, one way I would fix this, if there is already a backup of this index, is to just delete the index and restore it from backup. All shards will be allocated. – ben5556 Nov 15 '18 at 19:03

2 Answers


I had a similar issue with AWS Elasticsearch version 6.3, namely 2 shards failed to be assigned, and the cluster had status RED. Running GET _cluster/allocation/explain showed that the reason was that they had exceeded the default maximum allocation retries of 5.

Running GET <my-index-name>/_settings revealed the few settings that can be changed per index. Note that all queries are in Kibana console format, which is available out of the box if you are using the AWS Elasticsearch service. The following solved my problem:

PUT <my-index-name>/_settings
{
  "index.allocation.max_retries": 6
}

Running GET _cluster/allocation/explain immediately afterwards returned an error with the following: "reason": "unable to find any unassigned shards to explain...", and after some time the problem was resolved.
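For readers without Kibana access, the same settings update can be sketched with curl. This is a minimal sketch, not part of the original answer: the function names max_retries_body and raise_max_retries are hypothetical, and it assumes the domain's access policy accepts unsigned requests (for example an IP-based policy); domains that require signed requests would need a signing client instead.

```shell
# Hypothetical helper: build the JSON body that raises the retry limit.
# $1 = new value for index.allocation.max_retries
max_retries_body() {
  printf '{"index.allocation.max_retries": %d}' "$1"
}

# Hypothetical helper: send the settings update for one index.
# $1 = endpoint URL, $2 = index name, $3 = new retry limit
# Assumes the endpoint accepts unsigned requests.
raise_max_retries() {
  curl -s -XPUT "$1/$2/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(max_retries_body "$3")"
}
```

It would be called as raise_max_retries "https://<es-endpoint>" "<my-index-name>" 6, mirroring the PUT request above.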

Matija Han
  • Outstanding. Fixed my problem. One note: you can use the same `GET _cluster/allocation/explain` call to determine which index/indices contain the unassigned shards in question. – sigpwned Mar 27 '20 at 15:12
  • Is there a way to set this globally? – villasv Jul 22 '20 at 19:58
  • Haven't tried this one specifically, but you can use the pseudo-index `_all` to change a setting for all indexes, i.e. `PUT _all/_settings` – Peter Halverson Oct 22 '20 at 15:35

There might be an alternative when the other solutions fail. If you have a managed Elasticsearch instance on AWS, chances are high that you can "just" restore a snapshot.

Check for failed indexes.

For example, you can use:

curl -X GET "https://<es-endpoint>/_cat/shards"

or

curl -X GET "https://<es-endpoint>/_cluster/allocation/explain"
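On a cluster with many indexes the _cat/shards listing gets long, so it can help to filter it down to the problem rows. The following is a small sketch under stated assumptions: unassigned_shards is a hypothetical helper name, and the input is assumed to use the column order index, shard, prirep, state, unassigned.reason (as in the question's output), which can be requested with the h= parameter.

```shell
# Hypothetical helper: read _cat/shards rows on stdin and print only
# the unassigned ones as "index shard prirep reason".
# Assumes columns: index shard prirep state unassigned.reason
unassigned_shards() {
  awk '$4 == "UNASSIGNED" { print $1, $2, $3, $5 }'
}

# Usage against a live cluster (endpoint is a placeholder):
# curl -s "https://<es-endpoint>/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | unassigned_shards
```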

Check for snapshots.

To find snapshot repositories execute the following query:

curl -X GET "https://<es-endpoint>/_snapshot?pretty"

Next let's have a look at all the snapshots in the cs-automated repository:

curl -X GET "https://<es-endpoint>/_snapshot/cs-automated/_all?pretty"

Find a snapshot where failures: [ ] is empty or the index you want to restore is NOT in a failed state. Then delete the index you want to restore:

curl -XDELETE 'https://<es-endpoint>/<index-name>'

... and restore the deleted index like this:

curl -XPOST 'https://<es-endpoint>/_snapshot/cs-automated/<snapshot-name>/_restore' -d '{"indices": "<index-name>"}' -H 'Content-Type: application/json'
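After the restore it is worth confirming that no shards are left unassigned. A minimal sketch of such a check follows; count_unassigned is a hypothetical helper name, and it assumes the state is the fourth column of the _cat/shards output, as in the listing shown in the question.

```shell
# Hypothetical helper: read _cat/shards rows on stdin and print how
# many of them are in the UNASSIGNED state (0 if none).
# Assumes the state is the fourth whitespace-separated column.
count_unassigned() {
  awk '$4 == "UNASSIGNED" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (endpoint and index are placeholders):
# curl -s "https://<es-endpoint>/_cat/shards/<index-name>" | count_unassigned
```

A result of 0 suggests every shard of the restored index has been allocated; recovery may take some time, so the check may need to be repeated.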

There is also some good documentation here:

Julian Pieles