hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: write throughput in cassandra, understanding hbase
Date Tue, 22 Jan 2013 19:03:00 GMT
Where do you see that HBase is doing only 2-3k writes/s?
How was the data distributed? Was the table split?
Cassandra uses a random partitioner by default, which will nicely distribute the data over
the cluster but won't let you perform range scans over your data.
HBase always partitions by key ranges, so that the keys can be range scanned. If that is
not done correctly and you create monotonically increasing keys, you'll hotspot a single region.

Even then, you can do more than this on a single RegionServer.

Also note that many of the benchmarks have agendas and cherry-pick the results.
They probably "forgot" to disable Nagle's algorithm and to distribute the table correctly.
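
To make the hotspotting point concrete, here is a minimal sketch in plain Python (no HBase client involved). It models a table pre-split into four key ranges and counts where 1000 monotonically increasing keys land, with and without a hash-derived salt prefix. The split points, the bucket count, and the helper names (region_for, salted_key, region_counts) are all assumptions made up for illustration; this is not HBase's API and not a measurement of any real cluster.

```python
import bisect
import hashlib
from collections import Counter

NUM_BUCKETS = 4  # hypothetical number of salt buckets / pre-split regions

# Hypothetical split points: region 0 holds keys below b"01",
# region 1 holds [b"01", b"02"), and so on.
SPLITS = [f"{i:02d}".encode() for i in range(1, NUM_BUCKETS)]


def region_for(row_key: bytes) -> int:
    """Return the index of the key range that contains row_key."""
    return bisect.bisect_right(SPLITS, row_key)


def salted_key(row_key: bytes) -> bytes:
    """Prefix the key with a bucket number derived from a hash of the key."""
    digest = hashlib.md5(row_key).digest()
    bucket = int.from_bytes(digest[:4], "big") % NUM_BUCKETS
    return f"{bucket:02d}-".encode() + row_key


def region_counts(keys) -> Counter:
    """Count how many keys fall into each region."""
    return Counter(region_for(k) for k in keys)


# Monotonically increasing keys, e.g. millisecond timestamps.
start = 1358881380000
sequential = [f"{ts:013d}".encode() for ts in range(start, start + 1000)]

plain_counts = region_counts(sequential)
salted_counts = region_counts(salted_key(k) for k in sequential)

print("unsalted:", dict(plain_counts))
print("salted:  ", dict(salted_counts))
```

In this toy model every unsalted key sorts into the same range, so one region takes all the writes, while the salted keys are spread across all four ranges. The price of salting is the one mentioned above for random partitioning: a range scan over the original key order now has to fan out across every bucket.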

-- Lars

From: S Ahmed <sahmed1020@gmail.com>
To: user@hbase.apache.org 
Sent: Tuesday, January 22, 2013 10:38 AM
Subject: RE: write throughput in cassandra, understanding hbase
I've read articles online where I see Cassandra doing like 20K writes per
second, and HBase around 2-3K.

I understand both systems have their strengths, but I am curious as to what
is holding HBase back from reaching similar results?

Is it HDFS that is the issue?  Or does HBase do certain things (to its
advantage) that slow the write path down?