
I need to set up a Cassandra cluster with 3 nodes and RF=1. I want to set up a cron job that runs nodetool repair once a week on all three nodes at the same time. Will this affect data being written to the cluster while the repair is running? Will a node undergoing nodetool repair still be able to serve new requests?

vamsi

1 Answer


nodetool repair compares the data held by all replicas of each piece of data and resolves any inconsistencies between them.

RF=1 means you store only one copy of the data = no spare copies = nothing to compare with = a repair with RF=1 does nothing.

single-node repair is special cased to be a no-op. (c) CASSANDRA-1691
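To make the "nothing to compare with" point concrete, here is a toy sketch of the idea. It is not how Cassandra implements repair (the real thing compares Merkle trees over token ranges); the `repair` function and the dict-based replicas are made up for illustration.

```python
# Toy model only: real Cassandra compares Merkle trees of token ranges,
# not whole Python dicts. All names here are hypothetical.

def repair(replicas):
    """Bring every replica of one range to the newest value per key.

    Each replica maps key -> (timestamp, value). Returns the number of
    cells that had to be written to make the replicas agree.
    """
    if len(replicas) < 2:
        # RF=1: only one copy exists, so there is nothing to compare against.
        return 0

    # Pick the newest (timestamp, value) per key across all replicas.
    newest = {}
    for replica in replicas:
        for key, cell in replica.items():
            if key not in newest or cell[0] > newest[key][0]:
                newest[key] = cell

    # Copy the winning cells to any replica that is missing them or is stale.
    fixed = 0
    for replica in replicas:
        for key, cell in newest.items():
            if replica.get(key) != cell:
                replica[key] = cell
                fixed += 1
    return fixed


single = [{"a": (1, "x")}]
pair = [{"a": (1, "x")}, {"a": (2, "y"), "b": (1, "z")}]
print(repair(single))  # 0
print(repair(pair))    # 2
```

With a single replica the function returns immediately, which mirrors the no-op behaviour quoted above.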

I recommend using RF=3. RF=2 has proved difficult to manage in some cases, e.g. staying available while losing a node; RF=3 gives you a consistent view of the data and still lets you lose 1 node.
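The arithmetic behind this can be sketched as follows, assuming you read and write at consistency level QUORUM (my assumption here; other consistency levels give different trade-offs). A quorum is `RF // 2 + 1` replicas, and the helper names are just for illustration.

```python
def quorum(rf):
    # A quorum is a strict majority of the replicas.
    return rf // 2 + 1


def tolerated_failures(rf):
    # Replicas that can be down while a quorum is still reachable.
    return rf - quorum(rf)


for rf in (1, 2, 3):
    print(rf, quorum(rf), tolerated_failures(rf))
# 1 1 0
# 2 2 0
# 3 2 1
```

Under that assumption, RF=2 needs both replicas up for quorum operations, so it tolerates no more failures than RF=1, while RF=3 tolerates one.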

Ivan
  • Hi @Ivan, thanks for the reply. If I use RF=2 (I can't use RF=3 because of some constraints) and then run nodetool repair on all 3 nodes at once, would it affect reads or writes in the cluster? – vamsi Nov 15 '16 at 04:37
  • @vamsi you will still be able to send reads/writes to the cluster. Latency will be somewhat worse because the repair process needs to hit the disk too, so your r/w activity will compete with the repair. By default the command repairs your nodes one after another, which reduces the latency problems. You can read more in the [docs](https://docs.datastax.com/en/cassandra/2.0/cassandra/tools/toolsRepair.html). Please consider using RF=3 :) – Ivan Nov 15 '16 at 06:39
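
If you would rather not fire three cron jobs at the same moment, one option is a single scheduled script that asks each node to repair in turn. This is only a sketch: the host names are placeholders, the helper functions are hypothetical, and it assumes `nodetool` is on the PATH of the machine running the script and can reach each node with `-h`.

```python
import subprocess

# Hypothetical host names; replace with the addresses of your own nodes.
NODES = ["node1", "node2", "node3"]


def repair_command(host):
    # `nodetool -h <host> repair` asks the given node to run a repair.
    return ["nodetool", "-h", host, "repair"]


def repair_one_at_a_time(nodes, run=subprocess.run):
    """Run repair on each node in turn, stopping at the first failure.

    `run` is injectable so the loop can be dry-run without a cluster.
    Returns the list of hosts whose repair exited with status 0.
    """
    done = []
    for host in nodes:
        result = run(repair_command(host))
        if result.returncode != 0:
            break  # leave the remaining nodes for a later run
        done.append(host)
    return done


# Dry run: show the commands without executing anything.
for node in NODES:
    print(" ".join(repair_command(node)))
```

Calling `repair_one_at_a_time(NODES)` from one weekly cron entry would then keep only one node's repair competing with your reads and writes at a time.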