cassandra-commits mailing list archives

From Juho Mäkinen (JIRA) <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-10757) Cluster migration with sstableloader requires significant compaction time
Date Mon, 23 Nov 2015 07:45:11 GMT
Juho Mäkinen created CASSANDRA-10757:
----------------------------------------

             Summary: Cluster migration with sstableloader requires significant compaction time
                 Key: CASSANDRA-10757
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
             Project: Cassandra
          Issue Type: Improvement
          Components: Compaction, Streaming and Messaging
            Reporter: Juho Mäkinen
            Priority: Minor
             Fix For: 2.1.11


When sstableloader is used to migrate data from one cluster to another, the load produces
far more data and many more sstable files than the original cluster had.

For example, in my case a 62-node cluster with 16 TiB of data in 80000 sstables was loaded
with sstableloader into another cluster with 36 nodes, and this resulted in 42 TiB of used
data in a whopping 350000 sstables.
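
To put those figures side by side, here is a small back-of-envelope sketch. It uses only the numbers quoted above; the variable names are illustrative, and the average sstable sizes are simple totals divided by counts, not measured values.

```python
# Figures quoted in this report.
src_tib, src_sstables = 16, 80_000     # source cluster (62 nodes)
dst_tib, dst_sstables = 42, 350_000    # destination cluster (36 nodes)

MIB_PER_TIB = 1024 * 1024

# Growth factors after the sstableloader run.
data_growth = dst_tib / src_tib                 # 2.625x more data on disk
sstable_growth = dst_sstables / src_sstables    # 4.375x more sstable files

# Average sstable size (total size / file count), in MiB.
avg_src_mib = src_tib * MIB_PER_TIB / src_sstables
avg_dst_mib = dst_tib * MIB_PER_TIB / dst_sstables

print(f"data growth:     {data_growth:.3f}x")
print(f"sstable growth:  {sstable_growth:.3f}x")
print(f"avg sstable src: {avg_src_mib:.1f} MiB")
print(f"avg sstable dst: {avg_dst_mib:.1f} MiB")
```

So the destination holds roughly 2.6 times the data in roughly 4.4 times as many files, with a smaller average sstable (about 126 MiB versus about 210 MiB).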

The sstableloader run itself was relatively fast (around 8 hours), but as a result the
destination cluster needs approximately two weeks of compaction (these are C4.4xlarge
instances with 16 cores each, compaction running on 14 cores, 9000 IOPS, and 250 MiB/sec
sustained disk bandwidth).
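
For a sense of scale, the following is a rough estimate only. It assumes the 42 TiB is spread evenly across the 36 destination nodes and that the 250 MiB/sec figure applies per node; neither is stated explicitly in this report. It also ignores the fact that compaction both reads and writes data and may rewrite the same data several times.

```python
# Assumptions (not stated in the report): even data distribution across
# nodes, and 250 MiB/sec sustained bandwidth available on each node.
dst_tib, dst_nodes = 42, 36
disk_mib_per_sec = 250
compaction_days = 14                   # "approximately two weeks"

MIB_PER_TIB = 1024 * 1024

per_node_mib = dst_tib * MIB_PER_TIB / dst_nodes

# Time for one sequential read of a node's data at full disk bandwidth.
single_pass_hours = per_node_mib / disk_mib_per_sec / 3600

# Average rate if each node's data were processed exactly once in two weeks.
implied_mib_per_sec = per_node_mib / (compaction_days * 86400)

print(f"data per node:      {per_node_mib / MIB_PER_TIB:.2f} TiB")
print(f"one read pass:      {single_pass_hours:.2f} hours")
print(f"implied throughput: {implied_mib_per_sec:.2f} MiB/sec per node")
```

Under these assumptions a single pass over a node's data would take on the order of an hour and a half at full disk bandwidth, so the two-week estimate suggests that compaction is limited by something other than raw sequential disk throughput.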

Could the sstableloader process somehow be improved to make this kind of migration less
painful and faster?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
