Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Mon, 23 Nov 2015 08:41:11 +0000 (UTC)
From: "Juho Mäkinen (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12915249.1448264655000.157130.1448268071110@Atlassian.JIRA>
In-Reply-To: <JIRA.12915249.1448264655000@Atlassian.JIRA>
References: <JIRA.12915249.1448264655000@Atlassian.JIRA>
 <JIRA.12915249.1448264655696@arcas>
Subject: [jira] [Updated] (CASSANDRA-10757) Cluster migration with
 sstableloader requires significant compaction time


     [ https://issues.apache.org/jira/browse/CASSANDRA-10757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Juho Mäkinen updated CASSANDRA-10757:
-------------------------------------
    Description:
When sstableloader is used to migrate data from one cluster to another, the load creates far more data and far more sstable files than the original cluster had.

For example, in my case a 62-node cluster with 16 TiB of data in 80,000 sstables was loaded with sstableloader into another cluster with 36 nodes, which resulted in 42 TiB of used data in a whopping 350,000 sstables.
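
To put those figures in proportion, the growth can be worked out directly. This is only an illustrative sketch: the constants are the numbers quoted above, and the helper name `growth` is made up for the example.

```python
# Figures quoted in the report above.
SRC_NODES, SRC_TIB, SRC_SSTABLES = 62, 16, 80_000
DST_NODES, DST_TIB, DST_SSTABLES = 36, 42, 350_000


def growth(before: float, after: float) -> float:
    """Return how many times larger `after` is than `before`."""
    return after / before


# Cluster-wide growth caused by the load.
data_growth = growth(SRC_TIB, DST_TIB)
sstable_growth = growth(SRC_SSTABLES, DST_SSTABLES)

# Average sstable count per node, before and after.
src_per_node = SRC_SSTABLES / SRC_NODES
dst_per_node = DST_SSTABLES / DST_NODES

print(f"data volume growth:   {data_growth:.3f}x")
print(f"sstable count growth: {sstable_growth:.3f}x")
print(f"sstables per node:    {src_per_node:.0f} -> {dst_per_node:.0f}")
```

By these numbers the data volume grew about 2.6 times and the sstable count about 4.4 times, while the average sstable count per node went from about 1,300 to about 9,700.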

The sstableloader run itself was relatively fast (around 8 hours), but as a result the destination cluster needs approximately two weeks of compaction to reduce the number of sstables back to the original level. (The instances are C4.4xlarge in EC2 with 16 cores each, with compaction running on 14 cores. The EBS disks in each instance provide 9000 IOPS and a maximum of 250 MiB/sec of disk bandwidth.)
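
For scale, a rough lower bound on one sequential pass over the loaded data can be computed from the bandwidth figure above. This is a back-of-the-envelope sketch, not a model of compaction: it assumes the 42 TiB is spread evenly over the 36 nodes and that each node sustains the full 250 MiB/sec, and `single_pass_hours` is a hypothetical helper.

```python
DST_NODES = 36           # destination cluster size from the report
DST_TIB = 42             # used data after the load, from the report
DISK_MIB_PER_SEC = 250   # maximum per-instance EBS bandwidth from the report
MIB_PER_TIB = 1024 * 1024


def single_pass_hours(total_tib: float, nodes: int, mib_per_sec: float) -> float:
    """Hours for each node to read its share of the data once at full bandwidth.

    Assumes an even data distribution and ignores writes, IOPS limits,
    and CPU cost, so the result is a floor rather than an estimate.
    """
    per_node_mib = total_tib * MIB_PER_TIB / nodes
    return per_node_mib / mib_per_sec / 3600


hours = single_pass_hours(DST_TIB, DST_NODES, DISK_MIB_PER_SEC)
print(f"one read pass per node: {hours:.2f} hours")
```

Under those assumptions a single read pass takes about 1.4 hours per node. It is a floor only; it does not account for writes, IOPS limits, or the number of passes compaction needs.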

Could the sstableloader process somehow be improved to make this kind of migration less painful and faster?


> Cluster migration with sstableloader requires significant compaction time
> -------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10757
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10757
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Streaming and Messaging
>            Reporter: Juho Mäkinen
>            Priority: Minor
>              Labels: sstableloader
>             Fix For: 2.1.11

