Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
MIME-Version: 1.0
Sender: tlipcon@gmail.com
In-Reply-To: <AANLkTimnPshs9xHTN_iTQUFQCsnDN7S11h8ONy7=Lyiv@mail.gmail.com>
References: <AANLkTimuCNP1+jK9hDn=_Uf7henwQPa5qQco6KbHHf98@mail.gmail.com>
 <AANLkTi=8oBpuVqoqYepdcxsGGZB54JLw+u8iOaVDfLUy@mail.gmail.com>
 <AANLkTik+Yd5k2i6Y8nArUhgsJ8HJySLYxgzgexx1jRi9@mail.gmail.com>
 <AANLkTikWcbrk0Le+9N_4UNz-cCPedRSRJBix3rKKMUE3@mail.gmail.com>
 <AANLkTinCE6b6wFZW37VP86LoDQOwCPveCFpc193B_-R4@mail.gmail.com>
 <AANLkTimnPshs9xHTN_iTQUFQCsnDN7S11h8ONy7=Lyiv@mail.gmail.com>
From: Todd Lipcon <todd@lipcon.org>
Date: Mon, 22 Nov 2010 13:52:07 -0800
Message-ID: <AANLkTinZowutYf4o5xMQqkoVWCfuKgbEHvRuwT3mRM6a@mail.gmail.com>
Subject: Re: cassandra vs hbase summary (was facebook messaging)
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=0014853d202eacacd90495ab45bb

--0014853d202eacacd90495ab45bb
Content-Type: text/plain; charset=ISO-8859-1

On Mon, Nov 22, 2010 at 1:26 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> For Cassandra, all writes must be transmitted to all replicas.
> CASSANDRA-1314 does not change how writes happen. Write operations
> will still affect the cache (possibly evicting entries if the cache is full).
> Reads, however, will prefer a single node of their possible replicas.
> This should give better cache utilization and less duplication for
> those using READ.ONE with lower read repair settings.
>
> It is also worth pointing out that the HBase cache is on entire HDFS
> blocks,


Nope, the HBase cache is on HFile blocks, which are typically 64KB.


> while Cassandra can cache on keys (key cache) or a key and all
> its columns (row cache). This has some deep implications depending on how
> random your reads are. Even with Cassandra's normal cache duplication,
> having finer-grained caches, of keys or rows rather than blocks, could
> mean that they are more efficient anyway.
>
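
To put rough numbers on the granularity point, here is a back-of-envelope sketch in Python. Only the 64KB block size comes from this thread; the row size, the cache budget, and the helper name hot_rows_cached are hypothetical values picked for illustration, and per-entry overhead is ignored. It is not a model of how either system actually manages its cache.

```python
# Back-of-envelope only: every figure except the 64 KB block size is an assumption.
BLOCK_SIZE = 64 * 1024      # typical HFile block size, as stated in this thread
ROW_SIZE = 1024             # hypothetical average row size (assumption)
CACHE_BYTES = 1024 ** 3     # hypothetical 1 GiB cache budget (assumption)


def hot_rows_cached(cache_bytes, entry_size, hot_rows_per_entry=1):
    """Hot rows held by a cache whose entries each cost entry_size bytes."""
    return (cache_bytes // entry_size) * hot_rows_per_entry


# Fully random reads: assume each cached block contains just one row that is read again.
block_cache_rows = hot_rows_cached(CACHE_BYTES, BLOCK_SIZE)

# Row-granular cache: each entry is exactly one row.
row_cache_rows = hot_rows_cached(CACHE_BYTES, ROW_SIZE)

# Reads with perfect locality: assume every row in a cached block is hot.
rows_per_block = BLOCK_SIZE // ROW_SIZE
block_cache_rows_local = hot_rows_cached(CACHE_BYTES, BLOCK_SIZE, rows_per_block)

print(block_cache_rows, row_cache_rows, block_cache_rows_local)
```

Under these assumptions the row-granular cache holds 64 times as many hot rows when reads are fully random, and the gap disappears when every row in a cached block is hot. Real workloads fall somewhere in between.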

--0014853d202eacacd90495ab45bb--