I have 2 Cassandra clusters in different datacenters (note that these are two separate clusters, NOT a single cluster with multi-DC), and both clusters have the same keyspace and column family models. I want to copy the data of column family C from cluster A to cluster B in the most efficient way. Some other column families I was able to copy with get and put operations, since they were time series and the keys were sequential. But this column family C I could not copy that way. I'm using Thrift and pycassa. I've tried the CQL COPY command, but unfortunately the CF is too large and I get an rpc_timeout. How can I accomplish this?
3 Answers
If you just want to do this as a one-time thing, take a snapshot and use sstableloader to load it into the other cluster. If you want to keep loading new data over time, turn on incremental_backups, take a snapshot to load the initial data, and then periodically grab the sstables out of the incremental backups and run them through sstableloader to keep things up to date.
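The one-time path can be sketched like this. This is a sketch, not a definitive procedure: `ks` and `c` are placeholder keyspace/table names, the snapshot tag and paths are assumptions, and the exact data directory layout depends on your Cassandra version.

```shell
# On a node of cluster A: flush memtables so the snapshot is complete,
# then snapshot the keyspace with a tag
nodetool flush ks c
nodetool snapshot -t copy_to_b ks

# The snapshot files land under the table's data directory, e.g.:
#   /var/lib/cassandra/data/ks/c-<table-id>/snapshots/copy_to_b/

# sstableloader expects a .../<keyspace>/<table>/ directory layout,
# so stage the snapshot files into one before streaming
mkdir -p /tmp/load/ks/c
cp /var/lib/cassandra/data/ks/c-*/snapshots/copy_to_b/* /tmp/load/ks/c/

# Stream the sstables into cluster B (point -d at any node of B)
sstableloader -d <cluster_B_node_ip> /tmp/load/ks/c
```

Because sstableloader streams sstables directly to the nodes that own each token range, this avoids the client-side read path entirely, which is why it scales to tables too large for COPY.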
I don't have much knowledge about how to copy Cassandra data from one cluster to another, but for the rpc_timeout error you can use
cqlsh --request-timeout 3600 <IP address>
Use the above command to enter the CQL shell. The request-timeout value is in seconds; you can increase it if you want.
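For example, to retry the COPY round trip with a one-hour timeout (keyspace and table names below are placeholders, and the COPY tuning options depend on your cqlsh version):

```shell
# Open cqlsh with a larger request timeout (in seconds)
cqlsh --request-timeout 3600 <IP address>

# Then, inside cqlsh:
#   COPY ks.c TO '/tmp/c.csv';     -- run against cluster A
#   COPY ks.c FROM '/tmp/c.csv';   -- run against cluster B
#
# Newer cqlsh versions also let COPY itself be tuned, e.g.
#   COPY ks.c TO '/tmp/c.csv' WITH PAGETIMEOUT=60 AND CHUNKSIZE=1000;
```

Even with a long timeout, COPY still pages the whole table through a single client, so for very large CFs the snapshot/sstableloader or DSBulk approaches in the other answers tend to be more reliable.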
From time to time I also need to copy data from one Cassandra cluster to another.
I use this tool: https://github.com/masumsoft/cassandra-exporter.
The export.js script exports data to JSON files, and the import.js script imports the exported data into Cassandra. You can do it for all tables in a specified keyspace or for a particular table only. The target keyspace and tables should exist before the import.
In the js scripts you can adjust the batch size and readTimeout if you get a "read timeout error".
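A rough sketch of the export/import round trip. The environment variables shown here are an assumption about how the scripts are configured; versions of the tool differ, so check the project's README for the exact invocation:

```shell
# In a checkout of cassandra-exporter, install its dependencies first
npm install

# Export every table of keyspace 'ks' from cluster A to JSON files
# (HOST/KEYSPACE are assumed configuration variables; verify against the README)
HOST=<cluster_A_node_ip> KEYSPACE=ks node export.js

# Import the JSON files into cluster B; the keyspace and tables
# must already exist there before running this
HOST=<cluster_B_node_ip> KEYSPACE=ks node import.js
```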
UPDATE: After a hint from Alex Ott I tried the DSBulk tool. It works great, but only for one table per run. If you want to process a full keyspace, you need a script that runs DSBulk for each table.
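The per-table DSBulk round trip looks roughly like this. Keyspace/table names are placeholders, and the keyspace-wide loop at the end is a sketch of the idea rather than a polished script:

```shell
# Unload one table from cluster A to CSV files under /tmp/dump/c
dsbulk unload -h <cluster_A_node_ip> -k ks -t c -url /tmp/dump/c

# Load those files into the same table on cluster B
dsbulk load -h <cluster_B_node_ip> -k ks -t c -url /tmp/dump/c

# To cover a whole keyspace, loop over its tables, e.g. something like:
# for t in $(cqlsh <cluster_A_node_ip> -e "USE ks; DESCRIBE TABLES;"); do
#   dsbulk unload -h <cluster_A_node_ip> -k ks -t "$t" -url "/tmp/dump/$t"
#   dsbulk load   -h <cluster_B_node_ip> -k ks -t "$t" -url "/tmp/dump/$t"
# done
```

DSBulk parallelizes reads and writes and retries on timeouts, which is what makes it much faster than cqlsh COPY for large tables.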
- Instead of it, look at DSBulk - it's heavily optimized, and has a lot of options to load/unload data, compress files, etc. – Alex Ott Sep 11 '20 at 07:54
- Thanks! Will give it a try! – Stanislav Berkov Sep 11 '20 at 07:56