cassandra-user mailing list archives

From aaron morton <>
Subject Re: BulkLoading SSTables and compression
Date Sun, 01 Jul 2012 18:29:51 GMT
When the data is streamed into the cluster by the bulk loader it is compressed on the receiving
end (if the target CF has compression enabled).
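For context, compression is a per-column-family setting on the receiving cluster; a minimal CQL sketch of enabling it (keyspace, table, and compressor choice are illustrative, not taken from this thread):

```sql
-- Hypothetical example: enable SSTable compression on a column family.
-- Keyspace/table names are placeholders; SnappyCompressor was the
-- usual choice in the 1.x line (DeflateCompressor trades CPU for ratio).
ALTER TABLE my_keyspace.my_cf
  WITH compression = {'sstable_compression': 'SnappyCompressor'};
```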

If you are able to reproduce this, can you create a ticket on

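If rebuilding does turn out to be necessary on every node, a hedged sketch of driving it across the cluster (host names, keyspace, and column family below are placeholders, not from this thread). It prints the per-node commands as a dry run; drop the leading `echo` to actually execute them over ssh:

```shell
#!/bin/sh
# Hypothetical dry run: show the nodetool invocation for each node.
# Replace the host list, keyspace, and column family with your own.
for host in node1 node2 node3; do
  echo "ssh $host nodetool -h localhost rebuildsstables MyKeyspace MyCF"
done
```

Note that rebuildsstables operates only on the node it is run against, which is why it has to be repeated cluster-wide.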

Aaron Morton
Freelance Developer

On 28/06/2012, at 10:00 PM, Andy Cobley wrote:

> My (limited) experience of moving from 0.8 to 1.0 is that you do have to use rebuildsstables.
> I'm guessing BulkLoading is bypassing the compression?
> Andy
> On 28 Jun 2012, at 10:53, jmodha wrote:
>> Hi,
>> We are migrating our Cassandra cluster from v1.0.3 to v1.1.1, the data is
>> migrated using SSTableLoader to an empty Cassandra cluster.
>> The data in the source cluster (v1.0.3) is uncompressed and the target
>> cluster (1.1.1) has the column family created with compression turned on.
>> What we are seeing is that once the data has been loaded into the target
>> cluster, the size is similar to the data in the source cluster. Our
>> expectation is that since we have turned on compression in the target
>> cluster, the amount of data would be reduced.
>> We have tried running the "rebuildsstables" nodetool command on a node after
>> data has been loaded and we do indeed see a huge reduction in size e.g. from
>> 30GB to 10GB for a given column family. We were hoping to see this at the
>> point of loading the data in via the SSTableLoader.
>> Is this behaviour expected? 
>> Do we need to run the rebuildsstables command on all nodes to actually
>> compress the data after it has been streamed in?
>> Thanks.
>> --
>> View this message in context:
>> Sent from the mailing list archive at
> The University of Dundee is a Scottish Registered Charity, No. SC015096.
