cassandra-user mailing list archives

From Michael Theroux <>
Subject Cassandra compression not working?
Date Mon, 24 Sep 2012 21:00:10 GMT

We are running into an unusual situation, and I'm wondering whether anyone has any insight
into it. We've been running a Cassandra cluster for some time, with compression enabled on
one column family in which text documents are stored. Compression on that column family
uses the SnappyCompressor with a 64k chunk length.

It was recently discovered that Cassandra was reporting a compression ratio of 0.  I took
a snapshot of the data and started a cassandra node in isolation to investigate.

Running nodetool scrub or nodetool upgradesstables had little impact on the amount of data
that was being stored.

I then disabled compression and ran nodetool upgradesstables on the column family.  Again,
there was no impact on the data size stored.

I then re-enabled compression and ran nodetool upgradesstables on the column family.  This
resulted in a 60% reduction in the data size stored, and Cassandra now reports a compression
ratio of about 0.38.
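
As a quick sanity check that the two figures above agree, here is a minimal sketch. It
assumes the reported compression ratio means compressed size divided by uncompressed
size; the function name size_reduction is made up for illustration and is not part of
Cassandra or nodetool.

```python
def size_reduction(compression_ratio: float) -> float:
    """Return the fraction of space saved.

    Assumption: compression_ratio = compressed size / uncompressed size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - compression_ratio


reported_ratio = 0.38  # value reported after re-enabling compression
saved = size_reduction(reported_ratio)
print(f"space saved: {saved:.0%}")  # prints "space saved: 62%"
```

Under that assumption, a ratio of 0.38 corresponds to roughly a 62% saving, which is in
line with the approximately 60% reduction observed on disk.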

Any idea what is going on here?  Obviously I can go through this process in production to
enable compression.  However, I would like to understand what is currently happening and
why new data does not appear to be compressed.

Any insights are appreciated,