hbase-dev mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: Should we change the default value of hbase.regionserver.flushlogentries for 0.21?
Date Sun, 15 Nov 2009 07:22:13 GMT
That sync at the end of an RPC is my doing. You don't want to sync every
_EDIT_; after all, the previous definition of the word "edit" was each
KeyValue.  So we could be calling sync for every single column in a
row. Bad stuff.

In the end, if the regionserver crashes during a batch put, we will
never know how much of the batch was flushed to the WAL. Thus it makes
sense to sync only once per batch and get a massive, massive speedup.
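
To make the trade-off concrete, here is a toy sketch, not actual HBase
code. ToyWal and BatchSyncSketch are made-up names, and the "WAL" only
counts sync calls instead of touching HDFS. It compares syncing after
every edit with syncing once at the end of the batch put.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy write-ahead log that only counts appends and syncs (hypothetical, not HBase's HLog). */
class ToyWal {
    private final List<String> entries = new ArrayList<>();
    private int unsynced = 0;
    private int syncCalls = 0;

    void append(String edit) {
        entries.add(edit);
        unsynced++;
    }

    /** Pretend to make all pending edits durable; a no-op if nothing is pending. */
    void sync() {
        if (unsynced == 0) return;
        syncCalls++;
        unsynced = 0;
    }

    int syncCalls() { return syncCalls; }
}

class BatchSyncSketch {
    /** Sync after every single edit, i.e. potentially once per KeyValue. */
    static int syncPerEdit(List<String> batch) {
        ToyWal wal = new ToyWal();
        for (String edit : batch) {
            wal.append(edit);
            wal.sync();
        }
        return wal.syncCalls();
    }

    /** Append the whole batch, then sync once on the way out of the RPC. */
    static int syncPerBatch(List<String> batch) {
        ToyWal wal = new ToyWal();
        for (String edit : batch) {
            wal.append(edit);
        }
        wal.sync();
        return wal.syncCalls();
    }

    public static void main(String[] args) {
        List<String> batch = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            batch.add("row" + i + "/col");
        }
        System.out.println("per-edit syncs:  " + syncPerEdit(batch));  // 1000
        System.out.println("per-batch syncs: " + syncPerBatch(batch)); // 1
    }
}
```

In this toy model the sync count drops from one per edit to one per
batch. The durability guarantee for the batch as a whole does not get
worse: a crash mid-batch leaves the client unsure how much was written
in both cases.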

On Sat, Nov 14, 2009 at 9:45 PM, stack <stack@duboce.net> wrote:
> I'm for leaving it as it is, at every 100 edits -- maybe every 10 edits?
> Speed stays as it was.  We used to lose MBs.  By default, we'll now lose 99
> or 9 edits max.
> We need to do some work bringing folks along regardless of what we decide.
> Flush happens at the end of the put up in the regionserver.  If you are
> doing a batch of commits -- e.g. using a big write buffer over on your
> client -- the puts will only be flushed on the way out after the batch put
> completes EVEN if you have configured hbase to sync every edit (I ran into
> this this evening.  J-D sorted me out).  We need to make sure folks are up
> on this.
> St.Ack
> On Sat, Nov 14, 2009 at 4:37 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>> Hi dev!
>> Hadoop 0.21 now has a reliable append and flush feature and this gives
>> us the opportunity to review some assumptions. The current situation:
>> - Every edit going to a catalog table is flushed so there's no data loss.
>> - The user tables edits are flushed every
>> hbase.regionserver.flushlogentries which by default is 100.
>> Should we now set this value to 1 in order to have more durable but
>> slower inserts by default? Please speak up.
>> Thx,
>> J-D
