Good afternoon.
In our production environment we run Cassandra 2.0.7. Initially a single node was enough (cass-05, local IP address 192.168.0.5). We now need a second node (cass-06, local IP address 192.168.0.6), which runs on a separate server. The Cassandra configuration on cass-06 is identical to that of cass-05. We use the NetworkTopologyStrategy replication strategy. Each node is configured in its own rack and data center, with one copy of the data (rack1, DC1:1 for cass-05 and rack2, DC2:1 for cass-06).
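For reference, the topology described above would correspond to replication settings along these lines (the keyspace name `my_keyspace` is a placeholder, not our actual schema):

```cql
-- Assumed keyspace definition: one replica in each data center.
ALTER KEYSPACE my_keyspace
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': 1,
    'DC2': 1
  };
```

On each node, assuming GossipingPropertyFileSnitch, cassandra-rackdc.properties would then carry `dc=DC1` / `rack=rack1` on cass-05 and `dc=DC2` / `rack=rack2` on cass-06.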
1 TB of disk space is available to Cassandra on each server. The server cass-05 holds about 600 GB of real data.
On cass-06 we run 'nodetool rebuild':
#./nodetool -h192.168.0.6 rebuild -- DC1
Cassandra on cass-06 starts creating a large number of temporary SSTable files which, in theory, it should remove afterwards. For some reason it does not: within 9-12 hours these temporary tables occupy the entire 1 TB of disk space, which makes the node malfunction. After restarting Cassandra on cass-06, only 150 GB of disk space remains occupied.
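To confirm that the space really is held by temporary SSTables, we scan the data directory for files whose names contain "-tmp-", which is the naming convention Cassandra 2.0 uses for in-progress SSTables (e.g. mykeyspace-mytable-tmp-jb-1234-Data.db). A minimal sketch; the data directory path is an assumption, adjust it to your data_file_directories setting:

```python
import os

def find_tmp_sstables(data_dir):
    """Return paths of temporary SSTable files under data_dir.

    In Cassandra 2.0 a temporary SSTable carries "-tmp-" in its file
    name.  data_dir should point at Cassandra's data_file_directories
    (often /var/lib/cassandra/data -- an assumption, adjust as needed).
    """
    hits = []
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if "-tmp-" in name:
                hits.append(os.path.join(root, name))
    return hits

def tmp_sstables_bytes(data_dir):
    """Total disk space, in bytes, held by temporary SSTables."""
    return sum(os.path.getsize(p) for p in find_tmp_sstables(data_dir))
```

Running `tmp_sstables_bytes("/var/lib/cassandra/data")` periodically during the rebuild shows how fast the temporary files grow.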
While 'nodetool rebuild' is running, the node cass-06 serves reads and writes just like cass-05.
Thanks for any help.