Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of artur.kronenberg@openmarket.com
 designates 81.187.36.3 as permitted sender)
Message-ID: <52987A87.8020705@openmarket.com>
Date: Fri, 29 Nov 2013 11:29:11 +0000
From: Artur Kronenberg <artur.kronenberg@openmarket.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:24.0) Gecko/20100101 Thunderbird/24.1.1
MIME-Version: 1.0
To: user@cassandra.apache.org
Subject: Re: reads and compression
References: 
 <CA+BDQ7z3vgmd=sPNnfYtue8xDTUj_Kuuf6et5yaqe0t722CgtA@mail.gmail.com>
In-Reply-To: 
 <CA+BDQ7z3vgmd=sPNnfYtue8xDTUj_Kuuf6et5yaqe0t722CgtA@mail.gmail.com>
Content-Type: multipart/alternative;
 boundary="------------040307080008060701080307"

This is a multi-part message in MIME format.
--------------040307080008060701080307
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hi John,

I am trying again :)

The way I understand it, compression trades disk IO for CPU: you read 
far fewer bytes from disk and spend some CPU time decompressing them. 
The bottleneck for reads is usually the IO time needed to read the data 
from disk. As a figure, we got about 25 reads/s when reading from disk, 
and up to 3000 reads/s when all of the data was in cache. So good 
compression reduces the amount you have to read from disk. You may 
spend a little more time decompressing the data, but that data will 
already be in cache, so the extra cost won't matter.
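
To put rough numbers on this, here is a small Python sketch. It is only 
a back-of-envelope model, not a description of how Cassandra works 
internally. The two rates are the figures quoted above. Everything else 
is an assumption made for illustration: the function names, the uniform 
access pattern, the 100 GB data set, the 25 GB cache, and ignoring the 
CPU cost of decompression.

```python
# Figures quoted in this message.
DISK_READS_PER_S = 25.0     # reads served from disk
CACHE_READS_PER_S = 3000.0  # reads served from cache


def cache_hit_ratio(data_gb, cache_gb, compression_ratio):
    """Fraction of reads served from cache (hypothetical model).

    Assumes uniform access over the data set, and that the cache holds
    compressed data. compression_ratio is compressed size divided by
    uncompressed size, so 1.0 means compression is off.
    """
    on_disk_gb = data_gb * compression_ratio
    return min(1.0, cache_gb / on_disk_gb)


def reads_per_second(hit_ratio):
    """Blend the two rates by averaging the time per read."""
    seconds_per_read = (hit_ratio / CACHE_READS_PER_S
                        + (1.0 - hit_ratio) / DISK_READS_PER_S)
    return 1.0 / seconds_per_read


if __name__ == "__main__":
    # Assumed sizes: 100 GB of uncompressed data, 25 GB of cache.
    for ratio in (1.0, 0.5, 0.25):
        hit = cache_hit_ratio(data_gb=100, cache_gb=25,
                              compression_ratio=ratio)
        print(f"ratio={ratio:.2f} hit={hit:.2f} "
              f"reads/s={reads_per_second(hit):.0f}")
```

In this model the throughput stays close to the disk rate until nearly 
all reads hit the cache, which is why shrinking the on-disk size can 
matter much more than the extra CPU spent decompressing.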

Cheers

On 29/11/13 01:09, John Sanda wrote:
> This article[1] claims that read performance can improve when 
> compression is enabled. The more I thought about it, even after 
> reading the DataStax docs about reads[2], I realized I do not 
> understand how compression improves read performance. Can someone 
> provide some details on this?
>
> Is the compression offsets map still used if compression is disabled 
> for a table? If so, how does its rate of growth compare to the 
> growth of the map when compression is enabled?
>
> [1] whats-new-in-cassandra-1-0-compression 
> <http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression>
> [2] about reads 
> <http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docs&version=1.2&file=index#cassandra/dml/dml_about_reads_c.html>
>
> Thanks
>
> - John


--------------040307080008060701080307
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">Hi John,<br>
      <br>
      I am trying again :) <br>
      <br>
      The way I understand it, compression trades disk IO for CPU: you
      read far fewer bytes from disk and spend some CPU time
      decompressing them. The bottleneck for reads is usually the IO
      time needed to read the data from disk. As a figure, we got about
      25 reads/s when reading from disk, and up to 3000 reads/s when all
      of the data was in cache. So good compression reduces the amount
      you have to read from disk. You may spend a little more time
      decompressing the data, but that data will already be in cache, so
      the extra cost won't matter. <br>
      <br>
      Cheers<br>
      <br>
      On 29/11/13 01:09, John Sanda wrote:<br>
    </div>
    <blockquote
cite="mid:CA+BDQ7z3vgmd=sPNnfYtue8xDTUj_Kuuf6et5yaqe0t722CgtA@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=ISO-8859-1">
      <div dir="ltr">
        <div>This article[1] claims that read performance can improve
          when compression is enabled. The more I thought about it, even
          after reading the DataStax docs about reads[2], I realized I do
          not understand how compression improves read performance. Can
          someone provide some details on this?</div>
        <div><br>
        </div>
        <div>Is the compression offsets map still used if compression is
          disabled for a table? If so, how does its rate of growth
          compare to the growth of the map when compression is enabled?</div>
        <div>
          <div>
            <div>
              <div><br>
              </div>
              [1]&nbsp;<a moz-do-not-send="true"
href="http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression">whats-new-in-cassandra-1-0-compression</a>
              <div>[2]&nbsp;<a moz-do-not-send="true"
href="http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html?pagename=docs&amp;version=1.2&amp;file=index#cassandra/dml/dml_about_reads_c.html">about
                  reads</a></div>
              <div><br>
              </div>
              <div>Thanks<br>
                <br>
                - John
              </div>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>

--------------040307080008060701080307--