Hi John,

I am trying again :)

As I understand it, compression lets you trade disk IO for CPU. The bottleneck for reads is usually the IO time needed to fetch the data from disk. As a figure, we saw about 25 reads/s when reading from disk, versus up to 3000 reads/s when all of it was in cache. Good compression reduces the amount of data you have to read from disk. In exchange you spend a little more time decompressing, but by then the data is in memory anyway, so it won't matter much.
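To make the trade-off concrete, here is a rough sketch in Python. It uses zlib only as a stand-in, since Cassandra's own compressors and SSTable layout will behave differently, and the row data is made up for illustration. It only shows that fewer bytes need to come off disk, at the price of a decompress step in memory.

```python
import zlib

# Hypothetical row data, invented for this sketch. Real SSTable contents
# and Cassandra's own compressors will give different numbers.
rows = [
    f"user{i:06d},2013-11-29,status=active,region=eu-west\n"
    for i in range(2000)
]
raw = "".join(rows).encode("utf-8")

# Stand-in for the compressed chunk that would sit on disk.
compressed = zlib.compress(raw)

# CPU work done after the (smaller) read has completed.
restored = zlib.decompress(compressed)

# Fraction of the original bytes that must be read from disk.
ratio = len(compressed) / len(raw)

print(f"raw bytes:        {len(raw)}")
print(f"compressed bytes: {len(compressed)}")
print(f"fraction of bytes read from disk: {ratio:.2f}")
```

How much this helps in practice depends on how compressible your data is and on how much of it is already served from cache.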

Cheers

On 29/11/13 01:09, John Sanda wrote:
This article[1] claims that gains in read performance can be achieved when compression is enabled. The more I thought about it, even after reading the DataStax docs about reads[2], the more I realized I do not understand how compression improves read performance. Can someone provide some details on this?

Is the compression offsets map still used if compression is disabled for a table? If so, how does its rate of growth compare to the growth of the map when compression is enabled?

[1] whats-new-in-cassandra-1-0-compression

Thanks

- John