I am a complete newbie at Cassandra and am just setting it up and playing around with it and testing different scenarios using cqlsh.

I currently have 4 nodes in 2 datacenters, configured something like this (with proper IPs, of course):

a.b.c.d=DC1:RACK1
a.b.c.d=DC1:RACK1
a.b.c.d=DC2:RACK1
a.b.c.d=DC2:RACK1

default=DCX:RACKX

Everything seems to make sense so far, except that when I brought down a node on purpose to see the resulting behaviour, I could no longer query or insert data on the remaining nodes. Every request fails with "Unable to complete request: one or more nodes were unavailable."

I get that a node is unavailable (I did that on purpose), but isn't one of the main points of a distributed DB to keep working even as some nodes go down? Why does bringing one node down put a complete stop to everything?

What am I missing?

Any help would be greatly appreciated!!

user3376961

2 Answers

You're correct in assuming that one node down should still allow you to query the cluster, but there are a few things to consider.

I'm assuming that "nodetool status" returns the expected results for that DC (i.e. "UN" for the nodes that are up and "DN" for the node you took down).

Check the following:

  • The connection's consistency level (default is ONE)
  • The keyspace's replication strategy and replication factor (default is Simple, which is rack/DC unaware)
    • In cqlsh, run "describe keyspace <keyspace_name>" to see both
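To make the interplay between consistency level and replication factor concrete, here is a minimal Python sketch. It is not Cassandra code: the function names are made up for illustration, only three consistency levels are modelled, and the replication factor is treated as a single total number.

```python
def required_replicas(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must be alive for a request at this level (simplified)."""
    if consistency_level == "ONE":
        return 1
    if consistency_level == "QUORUM":
        # A quorum is a strict majority of the replicas.
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"level not modelled in this sketch: {consistency_level}")


def is_available(consistency_level: str, replication_factor: int,
                 live_replicas: int) -> bool:
    """True if enough replicas are alive to satisfy the consistency level."""
    return live_replicas >= required_replicas(consistency_level, replication_factor)


if __name__ == "__main__":
    # One replica of the requested data is down in each case.
    print(is_available("ONE", 1, live_replicas=0))     # RF=1, its only replica is down
    print(is_available("ONE", 2, live_replicas=1))     # RF=2, one replica left
    print(is_available("QUORUM", 2, live_replicas=1))  # quorum of 2 needs both
```

With a replication factor of 1, losing the single replica makes the request unavailable even at consistency level ONE, which matches the error in the question.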

Note that if you've been changing the replication factor, you'll need to run "nodetool repair" on the nodes so that existing data is copied to its new replicas.

Blake Atkinson

Is it possible that you did not set the replication factor on your keyspace with a value greater than 1? For example:

CREATE KEYSPACE "Excalibur" WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : 2, 'dc2' : 2};

This configures your keyspace so that data is replicated to 2 nodes in each of the dc1 and dc2 datacenters. The datacenter names are case-sensitive and must match the names your snitch reports (for example, the ones shown by "nodetool status").

If your replication factor is 1 and the node that owns the data you are querying goes down, you will not be able to retrieve that data, and C* will fail fast with an unavailable error. In general, if C* detects that the consistency level cannot be met for your query, it fails fast instead of attempting the request.
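The effect of the replication factor on a four-node cluster can be sketched in a few lines of Python. This is a toy model, not how Cassandra actually places data: it assumes a single ring where each token range is stored on the next rf nodes clockwise, ignores datacenters and racks, and uses hypothetical node names.

```python
def replicas_for(token_index: int, ring: list[str], rf: int) -> list[str]:
    """Nodes holding a token range: the owner plus the next rf-1 nodes clockwise."""
    return [ring[(token_index + k) % len(ring)] for k in range(rf)]


def can_serve(token_index: int, ring: list[str], rf: int,
              down: set[str], required: int = 1) -> bool:
    """True if at least `required` replicas of the range are still up."""
    live = [n for n in replicas_for(token_index, ring, rf) if n not in down]
    return len(live) >= required


if __name__ == "__main__":
    ring = ["node1", "node2", "node3", "node4"]  # hypothetical node names
    down = {"node2"}                             # the node taken offline
    for rf in (1, 2):
        served = [can_serve(i, ring, rf, down) for i in range(len(ring))]
        print(f"rf={rf}: {served}")
```

In this model, rf=1 leaves the range owned by node2 unserviceable while the other ranges still work, and rf=2 keeps every range readable at consistency level ONE with node2 down.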

Andy Tolbert