hbase-user mailing list archives

From Friso van Vollenhoven <fvanvollenho...@xebia.com>
Subject Re: Sequential Inserts In HBASE.
Date Mon, 29 Nov 2010 12:40:21 GMT
You might want to have a look at this: http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html

Insertions into HBase go into memory first (into the MemStore, which keeps them sorted) and only
get flushed to disk periodically, so inserting in random key order does not cost extra sorting
on disk. When you insert sequentially in key order, your insert process will always hit one
region at a time, and each region is served by a single region server. This gives poor write
throughput, because most of the region servers sit idle while only one does the work. It is
better when insertions are distributed evenly across the key space, so that all region servers
get an equal share of the work at the same time.
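
To illustrate what "distributed evenly across the key space" can look like, here is a minimal
sketch of one common approach: prefixing each row key with a bucket number derived from a hash
of the key. It does not touch HBase at all; the names salted_key, bucket_counts and NUM_BUCKETS
are made up for this example, and the bucket count of 8 is an assumption you would tune to your
own cluster.

```python
import hashlib

# Assumption for illustration: roughly one bucket per region server.
NUM_BUCKETS = 8


def salted_key(row_key: str, num_buckets: int = NUM_BUCKETS) -> str:
    """Prefix row_key with a two-digit bucket derived from a hash of the key.

    The prefix is deterministic, so the same row key always maps to the
    same salted key and can be recomputed for point lookups.
    """
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    bucket = digest[0] % num_buckets
    return f"{bucket:02d}-{row_key}"


def bucket_counts(keys, num_buckets: int = NUM_BUCKETS):
    """Count how many keys land in each bucket after salting."""
    counts = [0] * num_buckets
    for key in keys:
        bucket = int(salted_key(key, num_buckets)[:2])
        counts[bucket] += 1
    return counts


# Sequential keys, as a monotonically increasing id would produce them.
sequential = [f"event-{i:08d}" for i in range(10000)]
counts = bucket_counts(sequential)
print(counts)  # one count per bucket; all buckets receive writes
```

Without the prefix, these keys would all sort next to each other and land in one region at a
time. With it, consecutive inserts are spread over the buckets, provided the table is split so
that the buckets fall into different regions. The trade-off is that a scan in the original key
order has to read from every bucket and merge the results.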



On 29 nov 2010, at 13:30, rajgopalv wrote:


Hi All,
I'm new to HBASE. I understand that HBASE keeps its data sorted in the
filesystem. So when we insert randomly, it takes time to sort, whereas when
we insert sequentially, there is no need for HBASE to sort.

But I keep hearing from some users that sequential inserts into HBASE
are the worst case. Why is that?
--
View this message in context: http://old.nabble.com/Sequential-Inserts-In-HBASE.-tp30329923p30329923.html
Sent from the HBase User mailing list archive at Nabble.com.


