From: Artur Kronenberg
Date: Fri, 29 Nov 2013 11:29:11 +0000
To: user@cassandra.apache.org
Subject: Re: reads and compression

Hi John,

I am trying again :)

The way I understand it, compression lets you trade disk IO for CPU. The bottleneck for reads is usually the IO time needed to fetch the data from disk. As a figure, we saw about 25 reads/s when reading from disk, while we get up to 3000 reads/s when all of the data is in cache. So good compression reduces the amount you have to read from disk. In exchange you may spend a little more time decompressing data, but that data will be in cache anyway, so it won't matter.
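
To make the trade-off concrete, here is a minimal sketch in Python. It is not Cassandra's code: zlib stands in for the real compressor, the 64 KB chunk size is an assumption, the sample rows are made up, and the names `compress_chunks` and `read_range` are hypothetical. It compresses data chunk by chunk, keeps a list of chunk offsets, and then counts how many compressed bytes a small read has to fetch before spending CPU on decompression.

```python
import zlib

CHUNK_SIZE = 64 * 1024  # assumed chunk size for this sketch


def compress_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Compress data chunk by chunk.

    Returns (blob, offsets): offsets[i] is where compressed chunk i starts
    in blob, and the final entry marks the end of the last chunk.
    """
    blob = bytearray()
    offsets = []
    for start in range(0, len(data), chunk_size):
        offsets.append(len(blob))
        blob += zlib.compress(data[start:start + chunk_size])
    offsets.append(len(blob))
    return bytes(blob), offsets


def read_range(blob: bytes, offsets, position: int, length: int,
               chunk_size: int = CHUNK_SIZE):
    """Read `length` uncompressed bytes starting at `position`.

    Returns (value, disk_bytes), where disk_bytes counts the compressed
    bytes that had to be fetched to serve the read.
    """
    first = position // chunk_size
    last = (position + length - 1) // chunk_size
    out = bytearray()
    disk_bytes = 0
    for i in range(first, last + 1):
        compressed = blob[offsets[i]:offsets[i + 1]]  # the "disk" read
        disk_bytes += len(compressed)
        out += zlib.decompress(compressed)            # the CPU cost
    skip = position - first * chunk_size
    return bytes(out[skip:skip + length]), disk_bytes


# Made-up, repetitive sample data: 50,000 rows of 26 bytes each.
data = b"".join(b"user-%06d,London,active\n" % i for i in range(50_000))
blob, offsets = compress_chunks(data)
value, disk_bytes = read_range(blob, offsets, position=1_000_000, length=24)

print(f"uncompressed size: {len(data)} bytes, compressed: {len(blob)} bytes")
print(f"bytes fetched for one small read: {disk_bytes} (chunk is {CHUNK_SIZE})")
```

How much you gain depends entirely on how well your data compresses. The sample rows here are very repetitive, so treat the numbers this prints as an illustration and not as a benchmark.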

Cheers

On 29/11/13 01:09, John Sanda wrote:
> This article[1] cites gains in read performance can be achieved when compression is enabled. The more I thought about it, even after reading the DataStax docs about reads[2], I realized I do not understand how compression improves read performance. Can someone provide some details on this?
>
> Is the compression offsets map still used if compression is disabled for a table? If so what is its rate of growth like as compared to the growth of the map when compression is enabled?
>
> [1] whats-new-in-cassandra-1-0-compression
>
> [2] about reads
>
> Thanks
>
> - John
