I have a 4-node Cassandra cluster that went about 8 months without a repair, in between administrators. It doesn't see much in the way of deletes. I've noticed that when I run nodetool repair, the system will not accept new data and nobody can connect with cqlsh until the repair has completed. Is it normal for repair to cause downtime?
-
Which version of Cassandra are you using? I have a 4-node Cassandra cluster on 2.0.8 and have not faced such issues – undefined_variable Aug 22 '15 at 07:06
-
We're currently using 2.1 – Charles Cowart Aug 22 '15 at 07:44
-
Are you running repair on one node at a time or all nodes at the same time? What error do they get from cqlsh? Do the repairs succeed? Once repaired, does it still happen when you run repair again? – Jim Meyer Aug 22 '15 at 11:12
-
What quantity of data are we talking about? Did you check whether memory was already short before running the repair? You may simply need more RAM, because a repair operation is very costly: it requires network communication and triggers multiple compactions. If you are using size-tiered compaction, it needs up to 50% more space than the size of the data every time it runs – sam Aug 22 '15 at 13:25
-
Best practice is to run 'nodetool repair -pr' sequentially on all the nodes, one node at a time – LHWizard Aug 25 '15 at 20:05
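To illustrate the last suggestion, here is a minimal sketch of running `nodetool repair -pr` against one node at a time. It is not from the thread: the function names and the host addresses in the usage note are hypothetical, and it assumes `nodetool` is on the PATH and can reach each node's JMX port with `-h`.

```python
import subprocess
from typing import Callable, Iterable, List


def build_repair_command(host: str) -> List[str]:
    """Build the nodetool command that repairs only the primary ranges of one node."""
    return ["nodetool", "-h", host, "repair", "-pr"]


def repair_sequentially(
    hosts: Iterable[str],
    run: Callable[[List[str]], int] = subprocess.call,
) -> List[str]:
    """Repair each host in turn and return the hosts that completed.

    `run` blocks until the command exits and returns its exit code, so the
    next node is not touched until the previous repair has finished.
    The loop stops at the first failure instead of moving on.
    """
    repaired: List[str] = []
    for host in hosts:
        exit_code = run(build_repair_command(host))
        if exit_code != 0:
            raise RuntimeError(f"repair failed on {host} (exit code {exit_code})")
        repaired.append(host)
    return repaired
```

It could be called as `repair_sequentially(["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"])`, where the addresses are placeholders for your own nodes. The `run` parameter exists only so the loop can be exercised without a live cluster.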