cassandra-user mailing list archives

From Jeff Ferland
Subject Offline Compaction and Token Splitting
Date Thu, 07 May 2015 19:07:21 GMT
My ideal for Cassandra backups is to dump each columnfamily to a directory and use an offline
process to compact the dumps into a single sstable (or into sstables capped at a maximum size).
My ideal for restoration is a streaming read of that sstable set that emits only the data
falling within a given token range. The result is that I can store a single copy of the data
that is effectively already repaired, and read from it just the range covering the node I wish
to restore. My first look at this was somewhat frustrated by the sstable code in current
versions relying heavily on the system keyspace.
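
As a rough illustration of the two steps described above, here is a minimal Python sketch. It
is not Cassandra code and does not read real sstables: the `Cell` record, `compact`,
`in_range`, and `restore_stream` are hypothetical names, tokens are assumed to be precomputed
integers, and each input run is assumed to be sorted by (token, key). Reconciliation is
simplified to "highest timestamp wins", with tombstones and timestamp tie-breaking ignored.
Token ranges are treated as (start, end], wrapping around the ring when start >= end.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Cell(NamedTuple):
    token: int      # partitioner token for the key (assumed precomputed)
    key: str        # row key
    timestamp: int  # write timestamp
    value: str


def compact(runs: Iterable[Iterable[Cell]]) -> Iterator[Cell]:
    """Merge runs sorted by (token, key), keeping the newest cell per key."""
    merged = heapq.merge(*runs, key=lambda c: (c.token, c.key))
    current = None
    for cell in merged:
        same_key = (
            current is not None
            and (cell.token, cell.key) == (current.token, current.key)
        )
        if same_key:
            # Simplification: highest timestamp wins, ties keep the first seen.
            if cell.timestamp > current.timestamp:
                current = cell
        else:
            if current is not None:
                yield current
            current = cell
    if current is not None:
        yield current


def in_range(token: int, start: int, end: int) -> bool:
    """True if token is in (start, end], allowing the range to wrap the ring."""
    if start < end:
        return start < token <= end
    # start >= end: wrapping range (start == end is treated as the full ring).
    return token > start or token <= end


def restore_stream(cells: Iterable[Cell], start: int, end: int) -> Iterator[Cell]:
    """Stream only the cells whose token falls within (start, end]."""
    return (c for c in cells if in_range(c.token, start, end))
```

Because both functions are generators, the compacted output can be piped straight into
`restore_stream` without holding the whole data set in memory, which is the property the
streaming restore depends on.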

Does anybody know of existing tools that might fulfill this (particularly offline collective
compaction), have a desire for such tools, or have any useful information for me before I
attempt to build such beasts?
