incubator-cassandra-user mailing list archives

From David Jeske <dav...@gmail.com>
Subject Re: cassandra vs hbase summary (was facebook messaging)
Date Tue, 23 Nov 2010 00:50:40 GMT
This is my second attempt at a summary of Cassandra vs HBase consistency and
performance for an HBase-acceptable workload. I think these tricky subtleties
are hard to understand, yet it's helpful for the community to understand
them. I'm not trying to state my own facts (or opinions) but merely to
summarize what I've read.

Again, please correct any facts which are wrong. Thanks for the kind and
thoughtful responses!

*1) Cassandra can't replicate the consistency situation of HBase.* Namely,
in HBase, once a write is finished, the new value will either always appear
or never appear.

[In Cassandra] Provided at least one node receives the write, it will
eventually be written to all replicas. A failure to meet the requested
ConsistencyLevel is just that; not a failure to write the data itself. Once
the write is received by a node, it will eventually reach all replicas,
there is no roll back. - Nick Telford
[ref <http://www.mail-archive.com/user@cassandra.apache.org/msg07398.html>]

In Cassandra (N3/W3/R1, N3/W2/R2, or N3/W3/R3), a write can reach a single
node yet fail to meet the requested write consistency. A subsequent read can
show the old value, and a later read can show the new value once the write
that did occur has propagated.
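
The scenario above can be sketched with a toy model. This is only an
illustration under stated assumptions, not Cassandra's implementation or
API: the Replica class and the write, read, repair, and demo functions are
hypothetical names, timestamps are plain integers, and "repair" stands in
for whatever mechanism eventually propagates the write.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    # Hypothetical stand-in for one replica of a single key.
    up: bool = True
    value: str = "old"
    timestamp: int = 0


def write(replicas, value, timestamp, w):
    """Apply the write to every reachable replica.

    Returns True only if at least w replicas acknowledged. A False result
    means the requested consistency was not met; nothing is rolled back.
    """
    acks = 0
    for rep in replicas:
        if rep.up:
            rep.value, rep.timestamp = value, timestamp
            acks += 1
    return acks >= w


def read(replicas, r):
    """Read from the first r reachable replicas; the newest timestamp wins."""
    reachable = [rep for rep in replicas if rep.up][:r]
    if len(reachable) < r:
        raise RuntimeError("not enough replicas to satisfy the read")
    return max(reachable, key=lambda rep: rep.timestamp).value


def repair(replicas):
    """Toy propagation: copy the newest value to every reachable replica."""
    live = [rep for rep in replicas if rep.up]
    newest = max(live, key=lambda rep: rep.timestamp)
    for rep in live:
        rep.value, rep.timestamp = newest.value, newest.timestamp


def demo():
    """N3/W3/R1: the write reaches one node, 'fails', then appears later."""
    replicas = [Replica(), Replica(), Replica()]
    replicas[0].up = replicas[1].up = False      # two replicas unreachable
    met = write(replicas, "new", timestamp=1, w=3)
    replicas[0].up = replicas[1].up = True       # they come back
    before = read(replicas, r=1)                 # served by a stale replica
    repair(replicas)                             # the write propagates
    after = read(replicas, r=1)
    return met, before, after


if __name__ == "__main__":
    print(demo())
```

In this model the client is told the write did not meet W=3, an R=1 read
then returns the old value, and the same read returns the new value after
propagation, which is the behavior HBase is described as ruling out.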

[In HBase] Once a region master accepts a write, it has been flushed to the
HDFS log. If the replica server goes down while writing and the write was
finished to any copy of the HDFS log, the new region master will accept and
propagate the write; if not, the write will never appear.

*2) Cassandra has a less efficient use of memory, particularly for data
pinned in memory.* With 3 replicas on Cassandra, each element of data pinned
in memory is kept on 3 servers, whereas in HBase only region masters keep the
data in memory, so there is only one copy of each data element.

CASSANDRA-1314 <https://issues.apache.org/jira/browse/CASSANDRA-1314>
provides an opportunity to allow a 'soft master', where reads prefer a
particular replica. Combined with disabling read-repair, this should allow
for more efficient memory usage for data pinned or cached in memory. #1 is
still true, namely that a write may reach only a node which is not the
soft-master, so the new value may not appear for a while and then
eventually appear. However, with N3/W3/R1, once a write appears at the
soft-master it will remain, so as long as the soft-master preference can be
honored it will be closer to HBase's consistency.

*3) HBase can't match the row-availability situation of Cassandra
(N3/W2/R2).* In the face of a single machine failure, if it is a region
master, those keys are offline in HBase until a new region master is elected
and brought online. In Cassandra, no single node failure causes the data to
become unavailable.

*4) Two Cassandra configurations are closest to the consistency situation
of HBase, and provide slightly different node failure characteristics.*
(Note: #1 above means Cassandra can't truly reach the same consistency
situation as HBase.)

In Cassandra (N3/W3/R1), a node failure will disallow writes to a keyrange
during the replica rebuild, while still allowing reads.
In Cassandra (N3/W2-3/R2), a node failure will allow both reads and writes
to continue, while requiring uncached reads to contact two servers.
(Requiring a response from two servers may increase common case latency, but
may hide latency from GC spikes, since any two of the three may respond)
In HBase, if an HDFS node fails, both reads and writes continue; when a
region master fails, both reads and writes are stalled until the region
master is replaced.
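
The Cassandra cases above reduce to simple counting, sketched below. This
is a simplified model and not Cassandra code: the availability function and
its dictionary keys are hypothetical names, it assumes an operation succeeds
exactly when enough replicas are alive to satisfy W or R, and it ignores
mechanisms such as hinted handoff. The "overlap" flag is the usual R + W > N
quorum condition, under which every read set intersects every successful
write set.

```python
def availability(n, w, r, failed):
    """Report which operations a key range can still serve.

    n: replicas per key, w: acks required per write,
    r: replies required per read, failed: replicas currently down.
    """
    live = n - failed
    return {
        "writes": live >= w,    # enough live replicas to acknowledge a write
        "reads": live >= r,     # enough live replicas to answer a read
        "overlap": r + w > n,   # read and write sets must intersect
    }


if __name__ == "__main__":
    for name, (w, r) in {"N3/W3/R1": (3, 1), "N3/W2/R2": (2, 2)}.items():
        print(name, availability(3, w, r, failed=1))
```

With one failed node, the model gives N3/W3/R1 reads but no writes, and
N3/W2/R2 both reads and writes, matching the two cases described above.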


Was that a better summary? Is it closer to correct?
