
I'm new to Elasticsearch; I have only been using it for about two weeks, and I have just done something silly.

My Elasticsearch cluster has two nodes: one master/data node (version 1.4.2) and one non-data node (version 1.1.1). Because of the version conflict between them, I decided to shut down and delete the non-data node and install another data node (version 1.4.2) in its place. See the attached diagram for an overview; node3 was then renamed to node2.

Then I checked the cluster health.
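
The JSON below comes from the cluster health API, i.e. something along these lines:

    curl -XGET 'http://localhost:9200/_cluster/health?pretty'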

{ 
    "cluster_name":"elasticsearch",
    "status":"yellow",
    "timed_out":false,
    "number_of_nodes":2,
    "number_of_data_nodes":2,
    "active_primary_shards":725,
    "active_shards":1175,
    "relocating_shards":0,
    "initializing_shards":0,
    "unassigned_shards":273
}

Then I checked the shard allocation:

curl -XGET http://localhost:9200/_cat/shards


    logstash-2015.03.25 2 p STARTED       3031  621.1kb 10.146.134.94 node1        
    logstash-2015.03.25 2 r UNASSIGNED
    logstash-2015.03.25 0 p STARTED       3084  596.4kb 10.146.134.94 node1        
    logstash-2015.03.25 0 r UNASSIGNED                                                     
    logstash-2015.03.25 3 p STARTED       3177  608.4kb 10.146.134.94 node1        
    logstash-2015.03.25 3 r UNASSIGNED                                                     
    logstash-2015.03.25 1 p STARTED       3099  577.3kb 10.146.134.94 node1       
    logstash-2015.03.25 1 r UNASSIGNED                      
    logstash-2014.12.30 4 r STARTED                     10.146.134.94 node2 
    logstash-2014.12.30 4 p STARTED         94  114.3kb 10.146.134.94 node1        
    logstash-2014.12.30 0 r STARTED        111  195.8kb 10.146.134.94 node2 
    logstash-2014.12.30 0 p STARTED        111  195.8kb 10.146.134.94 node1       
    logstash-2014.12.30 3 r STARTED        110    144kb 10.146.134.94 node2 
    logstash-2014.12.30 3 p STARTED        110    144kb 10.146.134.94 node1

I have read the related questions below and tried to follow them, but with no luck. I also commented on the answer with the error I got.

ElasticSearch: Unassigned Shards, how to fix?


https://t37.net/how-to-fix-your-elasticsearch-cluster-stuck-in-initializing-shards-mode.html

elasticsearch - what to do with unassigned shards

http://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-reroute.html#cluster-reroute

curl -XPOST 'localhost:9200/_cluster/reroute' -d '{
    "commands" : [ {
          "allocate" : {
              "index" : "logstash-2015.03.25", 
              "shard" : 4, 
              "node" : "node2", 
              "allow_primary" : true
          }
        }
    ]
}'

I get

"routing_nodes":{"unassigned":[{"state":"UNASSIGNED","primary":false,"node":null,
"relocating_node":null,"shard":0,"index":"logstash-2015.03.25"}

And I followed the answer in https://stackoverflow.com/a/23781013/1920536

curl -XPUT 'localhost:9200/_cluster/settings' -d '{
    "transient" : {
        "cluster.routing.allocation.enable" : "all"
    }
}'

but it had no effect.
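
The transient setting itself can at least be verified with:

    curl -XGET 'http://localhost:9200/_cluster/settings?pretty'

which should list cluster.routing.allocation.enable under "transient".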

What should I do? Thanks in advance.

Update: when I check the pending tasks, it shows the following:

{"tasks":[{"insert_order":88401,"priority":"HIGH","source":"shard-failed 
    ([logstash-2015.01.19][3], node[PVkS47JyQQq6G-lstUW04w], [R], s[INITIALIZING]),
    **reason [Failed to start shard, message** [RecoveryFailedException[[logstash-2015.01.19][3]: **Recovery failed from** [node1][_72bJJX0RuW7AyM86WUgtQ]
    [localhost][inet[/localhost:9300]]{master=true} into [node2][PVkS47JyQQq6G-lstUW04w]
    [localhost][inet[/localhost:9302]]{master=false}]; 
    nested: RemoteTransportException[[node1][inet[/localhost:9300]]
    [internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[logstash-2015.01.19][3] Phase[2] Execution failed];
    nested: RemoteTransportException[[node2][inet[/localhost:9302]][internal:index/shard/recovery/prepare_translog]];
    nested: EngineCreationFailureException[[logstash-2015.01.19][3] **failed to create engine]; 
    nested: FileSystemException**[data/elasticsearch/nodes/0/indices/logstash-2015.01.19/3/index/_0.si: **Too many open files**]; ]]","executing":true,"time_in_queue_millis":53,"time_in_queue":"53ms"}]}
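
The "Too many open files" at the end of that trace seems to point at the file descriptor limit of the Elasticsearch process. The limit each node is actually running with can be checked with something like:

    # look for process.max_file_descriptors in the response
    curl -XGET 'http://localhost:9200/_nodes/process?pretty'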

biolinh
  • This is probably your problem, right at the bottom of the log: "Too many open files". You have to increase the number of files that Elasticsearch is allowed to have open. How that's done depends on your operating system. – Magnus Bäck Mar 30 '15 at 17:30
  • @MagnusBäck: Maybe. When I installed the new node, I pointed its data directory at the same location that the removed node (v1.1.1) used. So how should I fix it? – biolinh Mar 31 '15 at 06:32
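
As the first comment above notes, the fix depends on the OS. On a typical Linux install it might look like the sketch below (the paths and the 65535 value are common defaults, not taken from the question):

    # current limit of the shell / service that starts Elasticsearch
    ulimit -n

    # DEB/RPM packages usually read MAX_OPEN_FILES from
    # /etc/default/elasticsearch or /etc/sysconfig/elasticsearch:
    MAX_OPEN_FILES=65535

    # alternatively, raise the limit for the elasticsearch user
    # in /etc/security/limits.conf:
    #   elasticsearch - nofile 65535

    # restart the node afterwards so the new limit is picked up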

1 Answer


If you have two nodes like this:
1) Node-1 - ES 1.4.2
2) Node-2 - ES 1.1.1

Now follow these steps to debug.

1) Stop all Elasticsearch instances on Node-2.
2) Install Elasticsearch 1.4.2 on the new node. Align its elasticsearch.yml with the master node's configuration, especially these three settings (a filled-in example follows these steps):

 cluster.name: <Same as master node>
 node.name: < Node name for Node-2>
 discovery.zen.ping.unicast.hosts: <Master Node IP>

3) Restart Elasticsearch on Node-2.
4) Check the Node-1 logs.
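
A filled-in sketch of Node-2's elasticsearch.yml, using the values visible in the question (cluster name "elasticsearch", both nodes on the same host, so the master address is assumed to be localhost):

    cluster.name: elasticsearch                      # must match the master node
    node.name: node2
    discovery.zen.ping.unicast.hosts: ["localhost"]  # address of the master node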

Roopendra
  • Thanks for your quick response. As you can see, the master node is at localhost:9300, ES 1.1.1 was at localhost:9301, and the newly installed one is at localhost:9302. I have already removed ES 1.1.1 and installed ES 1.4.2, set cluster.name to the same value as the master node, and set the node name to node2. Both ES instances are installed on the same server; do I still need to change discovery.zen.ping.unicast.hosts? When I check the nodes on this server, it shows information for both nodes. – biolinh Mar 31 '15 at 06:47
  • In addition, when I check the node1 log, it keeps printing errors such as: 2015.01.19][0]: Recovery failed from [node1]72bJJX0RuW7AyM86WUgtQ][localhost][inet[/localhost:9300]]{master=true} into [node2][PVkS47JyQQq6G-lstUW04w][localhost][inet[/localhost:9302]]{master=false}];reationFailureException[[logstash-2015.01.19][0] failed to create engine]; nested: FileSystemException[data/elasticsearch/nodes/0/indices/logstash-2015.01.19/0/index/_b.si: Too many open files]; ]] Maybe it happens after I ran "allocate" : { "index" : "logstash-2015.03.25", .... – biolinh Mar 31 '15 at 07:00