incubator-cassandra-user mailing list archives

From Todd Lipcon <t...@lipcon.org>
Subject Re: cassandra vs hbase summary (was facebook messaging)
Date Tue, 23 Nov 2010 01:17:58 GMT
Seems accurate to me. One small correction - the daemon in HBase that serves
regions is known as a "region server" rather than a region master. The RS is
the equivalent of the tablet server in Bigtable terminology.

-Todd

On Mon, Nov 22, 2010 at 4:50 PM, David Jeske <davidj@gmail.com> wrote:

> This is my second attempt at a summary of Cassandra vs HBase consistency
> and performance for an HBase-acceptable workload. I think these tricky
> subtleties are hard to understand, yet it's helpful for the community to
> understand them. I'm not trying to state my own facts (or opinion) but
> merely summarize what I've read.
>
> Again, please correct any facts which are wrong. Thanks for the kind and
> thoughtful responses!
>
> *1) Cassandra can't replicate the consistency situation of HBase.* Namely
> that once a write is finished that new value will either always appear or
> never appear.
>
> [In Cassandra] Provided at least one node receives the write, it will
> eventually be written to all replicas. A failure to meet the requested
> ConsistencyLevel is just that; not a failure to write the data itself. Once
> the write is received by a node, it will eventually reach all replicas,
> there is no roll back. - Nick Telford
> [ref <http://www.mail-archive.com/user@cassandra.apache.org/msg07398.html>]
>
> In Cassandra (N3/W3/R1, N3/W2/R2, or N3/W3/R3), a write can reach a
> single node and fail to meet the requested write consistency. A readback
> can then show the old value, but later show the new value once the write
> that did occur has propagated.
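
A minimal sketch of the scenario described above, assuming a toy model in
which each replica holds a (timestamp, value) pair and a read returns the
newest value among the replicas it contacts. The names `quorums_overlap` and
`possible_reads` are hypothetical and not part of any Cassandra API.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """True when every read set of size r intersects every write set of size w."""
    return r + w > n


def possible_reads(replicas, r):
    """Values a read touching r replicas could return (newest timestamp wins)."""
    return {max(subset)[1] for subset in combinations(replicas, r)}


# Three replicas (N=3), all holding the old value at timestamp 1.
replicas = [(1, "old"), (1, "old"), (1, "old")]

# A write requested at W=3 reaches only one replica. The client sees a
# failure, but the data stays on that replica; there is no rollback.
replicas[0] = (2, "new")

# R=1 may land on a stale replica or on the updated one.
before = possible_reads(replicas, r=1)

# Later the write propagates to the remaining replicas.
replicas = [(2, "new")] * 3
after = possible_reads(replicas, r=1)

print(before, after)
```

In this model `before` contains both "old" and "new", while `after` contains
only "new", which is the "old value first, new value later" behavior the
paragraph describes. The overlap rule R + W > N only helps for writes that
actually met W.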
>
> [In HBase] Once a region master accepts a write, it has been flushed to the
> HDFS log. If the replica server goes down while writing, then if the write
> was finished to any copy of the HDFS log, the new region master will accept
> and propagate the write; if not, the write will never appear.
>
> *2) Cassandra has a less efficient use of memory, particularly for data
> pinned in memory.* With 3 replicas on Cassandra, each element of data
> pinned in memory is kept on 3 servers, whereas in HBase only region masters
> keep the data in memory, so there is only one copy of each data element.
>
> CASSANDRA-1314 <https://issues.apache.org/jira/browse/CASSANDRA-1314>
> provides an opportunity to allow a 'soft master', where reads prefer a
> particular replica. Combined with disabling read-repair, this should allow
> for more efficient memory usage for data pinned or cached in memory. #1 is
> still true, namely that a write may only occur to a node which is not the
> soft-master, and that new value may not appear for a while and then
> eventually appear. However, with N3/W3/R1, once a write appears at the
> soft-master it will remain, so as long as the soft-master preference can be
> honored it will be closer to HBase's consistency.
>
> *3) HBase can't match the row-availability situation of Cassandra
> (N3/W2/R2).* In the face of a single machine failure, if it is a region
> master, those keys are offline in HBase until a new region master is elected
> and brought online. In Cassandra, no single node failure causes the data to
> become unavailable.
>
> *4) Two Cassandra configurations are closest to the consistency
> situation of HBase, and provide slightly different node failure
> characteristics.* (note, #1 above means Cassandra can't truly reach the
> same consistency situation as HBase)
>
> In Cassandra (N3/W3/R1), a node failure will disallow writes to a keyrange
> during the replica rebuild, while still allowing reads.
> In Cassandra (N3/W2-3/R2), a node failure will allow both reads and writes
> to continue, while requiring uncached reads to contact two servers.
> (Requiring a response from two servers may increase common case latency, but
> may hide latency from GC spikes, since any two of the three may respond)
> In HBase, if an HDFS node fails, both reads and writes continue; while when
> a region-master fails, both reads and writes are stalled until the region
> master is replaced.
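
The Cassandra half of the comparison above can be checked with a simple
counting sketch. It assumes an operation succeeds whenever at least as many
replicas are up as its consistency level requires, and it ignores everything
else (timeouts, repair, caching). The helper `available` is hypothetical.

```python
def available(n, level, failed):
    """An operation needing `level` replica responses succeeds if enough replicas are up."""
    return n - failed >= level


# (N, W, R) for the two configurations discussed above.
configs = {"N3/W3/R1": (3, 3, 1), "N3/W2/R2": (3, 2, 2)}

# Evaluate each configuration with one replica down.
results = {
    name: {"writes": available(n, w, failed=1), "reads": available(n, r, failed=1)}
    for name, (n, w, r) in configs.items()
}

for name, ops in results.items():
    print(name, ops)
```

Under this model, N3/W3/R1 loses writes but keeps reads with one node down,
while N3/W2/R2 keeps both, matching the summary in point 4.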
>
>
> Was that a better summary? Is it closer to correct?
