My cluster is in yellow status because some shards are unassigned. What can I do about this?

I tried setting cluster.routing.allocation.disable_allocation = false on all indexes, but I think this doesn't work because I'm using version 1.1.1.
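(For reference, on 1.x that setting is normally applied through the cluster settings API rather than per index. A minimal sketch, assuming the default localhost:9200:

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.disable_allocation": false
  }
}'
)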

I also tried restarting all machines, but the same thing happens.

Any idea?

EDIT:

  • Cluster health:

    { 
      cluster_name: "elasticsearch",
      status: "red",
      timed_out: false,
      number_of_nodes: 5,
      number_of_data_nodes: 4,
      active_primary_shards: 4689,
      active_shards: 4689,
      relocating_shards: 0,
      initializing_shards: 10,
      unassigned_shards: 758
    }
    
user3175226

7 Answers


There are many possible reasons why allocation won't occur:

  1. You are running different versions of Elasticsearch on different nodes
  2. You only have one node in your cluster, but the number of replicas is set to something other than zero.
  3. You have insufficient disk space.
  4. You have shard allocation disabled.
  5. You have a firewall or SELinux enabled. With SELinux enabled but not configured properly, you will see shards stuck in INITIALIZING or RELOCATING forever.

As a general rule, you can troubleshoot things like this:

  1. Look at the nodes in your cluster: curl -s 'localhost:9200/_cat/nodes?v'. If you only have one node, you need to set number_of_replicas to 0. (See ES documentation or other answers).
  2. Look at the disk space available in your cluster: curl -s 'localhost:9200/_cat/allocation?v'
  3. Check cluster settings: curl 'http://localhost:9200/_cluster/settings?pretty' and look for cluster.routing settings
  4. Look at which shards are UNASSIGNED: curl -s 'localhost:9200/_cat/shards?v' | grep UNASS
  5. Try to force a shard to be assigned (a bulk variant is sketched after this list)

    curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d '{
      "commands": [
        {
          "allocate": {
            "index": ".marvel-2014.05.21",
            "shard": 0,
            "node": "SOME_NODE_HERE",
            "allow_primary": true
          }
        }
      ]
    }'
    
  6. Look at the response and see what it says. There will be a bunch of YES decisions, which are fine, and then a NO that explains why allocation failed. If there aren't any NOs, it's likely a firewall/SELinux problem.
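If many shards are stuck, step 5 can be applied in bulk. A rough sketch (not from the original answer; NODE is a placeholder for one of your node names, and allow_primary can create empty primaries, so only use it if losing that shard's data is acceptable):

curl -s 'localhost:9200/_cat/shards' | grep UNASSIGNED | while read index shard rest; do
  # reroute each unassigned shard to NODE via the 1.x reroute API
  curl -XPOST 'localhost:9200/_cluster/reroute' -d "{
    \"commands\": [ { \"allocate\": {
      \"index\": \"$index\", \"shard\": $shard,
      \"node\": \"NODE\", \"allow_primary\": true
    } } ]
  }"
done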

Alcanzar
  • This is great, thanks - I found my issue this way. Turns out that one of my nodes' Elasticsearch version was a little behind the others, so the cluster refused to replicate those shards to it. Oh, and the difference was small - 1.4.2 vs 1.4.4. – KJH Mar 13 '15 at 15:48
  • This tells you exactly which index is unassigned. Sometimes it is indexes that you thought were deleted! Seems like a bug in ES but this at least lets you identify the exact reason why it's unassigned!!! THANKS – hubbardr Apr 22 '15 at 14:54
  • Thanks for this! I was having a hell of a time figuring out why my shards were not being allocated after I added a new node to the cluster-- the new node was slightly newer than the old nodes. – pkaeding Jul 24 '15 at 17:21
  • Thank you so much! The debugging flow is priceless. – Matteo Melani Feb 11 '16 at 17:05
  • Thanks for the hint! If you have only one node, you for sure don't need replicas... – jonashackt Sep 23 '16 at 12:51
  • Thank you! It's really helpful! – kikulikov Oct 25 '16 at 11:38
  • That's very helpful. In my case it was lack of disk space in one of the nodes. Thanks a lot! – Roberto Dec 27 '17 at 14:37
  • How are you running curl command. I am new to elastic search. Can I run curl command using browser? – Richi Sharma Nov 29 '18 at 17:14

This is a common issue arising from the default index settings, in particular, when you try to replicate on a single node. To fix this with a transient cluster setting, do this:

curl -XPUT 'http://localhost:9200/_settings' -d '{ "number_of_replicas" : 0 }'

Next, enable the cluster to reallocate shards (you can always change this setting back after all is said and done):

curl -XPUT 'http://localhost:9200/_cluster/settings' -d '
{
    "transient" : {
        "cluster.routing.allocation.enable": "all"
    }
}'

Now sit back and watch the cluster clean up the unassigned replica shards. If you want this to take effect for future indices, don't forget to modify the elasticsearch.yml file with the following setting and bounce the cluster:

index.number_of_replicas: 0
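You can watch the cleanup progress with the cluster health API (a sketch; unassigned_shards should drop toward zero):

# poll cluster health every 5 seconds
watch -n 5 "curl -s 'localhost:9200/_cluster/health?pretty'"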
Philip M.
  • this worked for me. windows command for reference is: curl -XPUT http://localhost:9200/_settings -d "{ """number_of_replicas""" :0 }" curl -XPUT http://localhost:9200/_cluster/settings -d "{ """transient""" : { """cluster.routing.allocation.enable""": true }}" – gigi Oct 19 '15 at 13:59
  • `true` isn't a valid value for `cluster.routing.allocation.enable` (this will throw a `java.lang.IllegalArgumentException: Illegal allocation.enable value [TRUE]`). Valid values are `all`, `primaries`, `new_primaries` or `none` (source: https://www.elastic.co/guide/en/elasticsearch/reference/2.4/shards-allocation.html#_shard_allocation_settings) – Bastien Libersa Jan 21 '19 at 09:40

Those unassigned shards are actually unassigned replicas of your actual shards from the master node.

In order to assign these shards, you need to run a new instance of Elasticsearch to create a secondary node to carry the data replicas.

EDIT: Sometimes the unassigned shards belong to indices that have been deleted, making them orphan shards that will never be assigned regardless of whether you add nodes or not. But that's not the case here!
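For what it's worth, a second instance only needs to join the same cluster. A minimal elasticsearch.yml sketch for the extra node (assuming 1.x defaults, where multicast discovery finds the other node automatically; node-2 is just an example name):

# elasticsearch.yml on the new node (sketch)
cluster.name: elasticsearch   # must match the existing cluster's name
node.name: node-2             # any unique name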

eliasah
  • Thanks, I think I got it. These shards are unassigned due to the max number of shards per node? – user3175226 May 15 '14 at 13:27
  • You're welcome. and how much is the max number of shards per node? – eliasah May 15 '14 at 13:40
  • `index.routing.allocation.total_shards_per_node = -1` (default) – Leabdalla May 16 '14 at 15:02
  • I have 1800 indices, some with 2 shards, some others with 10 shards. all this is distributed to 4 data machines with 8gb ram and 80gb ssd – Leabdalla May 16 '14 at 15:03
  • How many nodes do you have? Can you post a screenshot from you elasticsearch-head plugin? – eliasah May 16 '14 at 15:05
  • I added a secondary node, but the two Elasticsearch instances had conflicting versions, so I removed it and added a new one. And then my cluster went yellow with unassigned_shards. What should I do? – biolinh Mar 30 '15 at 15:10
  • @biolinh I think that you would want to ask a new question about that. It seems like another issue – eliasah Mar 30 '15 at 15:16
  • Not always. Sometimes the unassigned shards belong to indexes that have been deleted, making them orphan shards that will never be assigned regardless of adding nodes or not. See @Alcanzar's comment below – hubbardr Apr 22 '15 at 14:56
  • That was not the issue here even though what you wrote is totally correct! This case was just unassigned replicas because there was no other node to carry them. – eliasah Apr 22 '15 at 15:02

The only thing that worked for me was changing the number_of_replicas (I had 2 replicas, so I changed it to 1 and then changed back to 2).

First:

PUT /myindex/_settings
{
    "index" : {
        "number_of_replicas" : 1
     }
}

Then:

PUT /myindex/_settings
{
    "index" : {
        "number_of_replicas" : 2
     }
}
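To confirm the replicas were actually assigned after the toggle, the _cat API can show the shard states (a sketch; myindex as above):

curl -s 'localhost:9200/_cat/shards/myindex?v'
# every row should now read STARTED instead of UNASSIGNED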
Edi
  • I had about 20 unassigned shards and one empty node (of six). Setting 'number_of_replicas' of just one of them to 1, then back to 2, seemed to knock things loose, and *all* the unassigned replicas moved over to the empty node. – Rodney Gitzel Oct 13 '15 at 15:12

The first 2 points of the answer by Alcanzar did it for me, but I had to add

"allow_primary" : true

like so (note that allow_primary forces an empty primary shard to be created, so only use it if losing that shard's data is acceptable):

curl -XPOST http://localhost:9200/_cluster/reroute?pretty -d '{
  "commands": [
    {
      "allocate": {
        "index": ".marvel-2014.05.21",
        "shard": 0,
        "node": "SOME_NODE_HERE",
        "allow_primary": true
      }
    }
  ]
}'
dazl

With newer ES versions this should do the trick (run in Kibana DevTools):

PUT /_cluster/settings
{
  "transient" : {
    "cluster.routing.rebalance.enable" : "all"
  }
}

However, this won't fix the root cause. In my case there were lots of unassigned shards because the default replica count was 1, but I was actually only running a single node. So I also added this line to my elasticsearch.yml:

index.number_of_replicas: 0
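Note that elasticsearch.yml only affects indices created afterwards (and 5.x+ rejects index-level settings there entirely), so for existing indices the same change can go through the settings API. A sketch in the same DevTools style:

PUT /_settings
{
  "index" : {
    "number_of_replicas" : 0
  }
}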
JohnP

Check that the versions of ElasticSearch on each node are the same. If they are not, then ES will not allocate replica copies of the index to 'older' nodes.
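A quick way to compare versions across every node at once is the _cat API (a sketch, assuming the default port):

curl -s 'localhost:9200/_cat/nodes?v&h=name,version'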

Using @Alcanzar's answer, you can get some diagnostic error messages back:

curl -XPOST 'http://localhost:9200/_cluster/reroute?pretty' -d '{
  "commands": [
    {
      "allocate": {
        "index": "logstash-2016.01.31",
        "shard": 1,
        "node": "arc-elk-es3",
        "allow_primary": true
      }
    }
  ]
}'

The result is:

{
  "error" : "ElasticsearchIllegalArgumentException[[allocate] allocation of
            [logstash-2016.01.31][1] on node [arc-elk-es3]
            [Xn8HF16OTxmnQxzRzMzrlA][arc-elk-es3][inet[/172.16.102.48:9300]]{master=false} is not allowed, reason:
            [YES(shard is not allocated to same node or host)]
            [YES(node passes include/exclude/require filters)]
            [YES(primary is already active)]
            [YES(below shard recovery limit of [2])]
            [YES(allocation disabling is ignored)]
            [YES(allocation disabling is ignored)]
            [YES(no allocation awareness enabled)]
            [YES(total shard limit disabled: [-1] <= 0)]
            *** [NO(target node version [1.7.4] is older than source node version [1.7.5]) ***
            [YES(enough disk for shard on node, free: [185.3gb])]
            [YES(shard not primary or relocation disabled)]]",
  "status" : 400
}

How to determine the version number of ElasticSearch:

adminuser@arc-elk-web:/var/log/kibana$ curl -XGET 'localhost:9200'
{
  "status" : 200,
  "name" : "arc-elk-web",
  "cluster_name" : "elasticsearch",
  "version" : {
    "number" : "1.7.5",
    "build_hash" : "00f95f4ffca6de89d68b7ccaf80d148f1f70e4d4",
    "build_timestamp" : "2016-02-02T09:55:30Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

In my case, I set up the apt-get repository incorrectly and the versions got out of sync on the different servers. I corrected it on all the servers with:

echo "deb http://packages.elastic.co/elasticsearch/1.7/debian stable main" | sudo tee -a /etc/apt/sources.list

and then the usual:

sudo apt-get update
sudo apt-get upgrade

and a final server reboot.

Guy