hbase-user mailing list archives

From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: HLog durabilty on the current and future Hadoop releases
Date Mon, 17 May 2010 07:51:36 GMT
Hi,

Given the topic of this message, I'd like to point out that BookKeeper
(HBASE-2315) provides a strong durability guarantee: writes are synced
to disk on a quorum of machines before they are acknowledged. I don't
think this feature is currently on the HBase roadmap, though.
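For a concrete picture, here is a minimal sketch of the BookKeeper
client API (the ZooKeeper address, ledger parameters, and password are
assumptions for the example, not anything HBase ships today):

    import org.apache.bookkeeper.client.BookKeeper;
    import org.apache.bookkeeper.client.BookKeeper.DigestType;
    import org.apache.bookkeeper.client.LedgerHandle;

    public class QuorumWriteSketch {
        public static void main(String[] args) throws Exception {
            // Connect through ZooKeeper (address made up for the sketch).
            BookKeeper bk = new BookKeeper("localhost:2181");
            // Ensemble of 3 bookies with a quorum of 2: each entry is
            // synced to disk on 2 machines before the add completes.
            LedgerHandle lh = bk.createLedger(3, 2, DigestType.MAC,
                                              "secret".getBytes());
            lh.addEntry("hlog edit".getBytes()); // durable on a quorum
            lh.close();
            bk.close();
        }
    }

A write-ahead log kept in ledgers like this survives the loss of any
single bookie, which is stronger than relying on un-synced HDFS
replicas.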


On May 17, 2010, at 6:01 AM, Tatsuya Kawano wrote:

> Hi,
> A few days ago, I had a discussion with other Japanese developers on
> the hadoop-jp Google group about HLog durability on the recent Hadoop
> releases (0.20.1, 0.20.2). I had never looked at this issue closely
> until then, since I had planned to use Hadoop 0.21 from the beginning.
> Someone showed us Todd's presentation at the March 2010 HUG, and we
> all agreed that to solve this issue we will need to use Hadoop trunk
> or Cloudera CDH3, which include HDFS-200 and related patches.
> That led me to a couple of questions:
> 1. On Hadoop 0.20.x (without the HDFS-200 patch), I must close the
> HLog to make its entries durable, right? Rolling the HLog does this,
> but what about a region server failure?
> Someone in the discussion tried this scenario. He killed (-9) a
> region server process after a few puts. The HLog was read by HMaster
> before it was closed. HMaster couldn't see any entries in the log and
> simply deleted it, so he lost some puts.
> Is this the expected behavior? He used Hadoop 0.20.1 and HBase 0.20.3.
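> To be concrete, here is a minimal sketch of the behavior I mean,
> using the plain FileSystem API (paths and names are made up for the
> example; this is not HBase's actual HLog code):
>
>     import org.apache.hadoop.conf.Configuration;
>     import org.apache.hadoop.fs.FSDataOutputStream;
>     import org.apache.hadoop.fs.FileSystem;
>     import org.apache.hadoop.fs.Path;
>
>     public class VisibilitySketch {
>         public static void main(String[] args) throws Exception {
>             FileSystem fs = FileSystem.get(new Configuration());
>             FSDataOutputStream out =
>                 fs.create(new Path("/tmp/hlog-test"));
>             // Buffered on the client; a concurrent reader still sees
>             // a zero-length file.
>             out.writeBytes("edit-1\n");
>             // kill -9 the writer here and, without HDFS-200, the
>             // unclosed block is never recovered, so the master finds
>             // an empty log and deletes it.
>             out.close(); // only now do the entries become readable
>         }
>     }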
> 2. On Hadoop trunk, I'd prefer not to hflush() on every single put,
> but instead rely on the un-flushed replicas on the HDFS datanodes, so
> I can avoid the performance penalty. Will this still be durable? Will
> HMaster see un-flushed appends right after a region server failure?
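> Continuing the sketch above, on trunk the same edit could be made
> visible without closing the file (hypothetical fragment):
>
>     out.writeBytes("edit-2\n");
>     // hflush() pushes the data to the in-memory buffers of all
>     // replicas: a new reader can see it even if this writer dies,
>     // but nothing is synced to disk, so simultaneous datanode power
>     // loss can still drop it.
>     out.hflush();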
> Thanks in advance,
> -- 
> 河野 達也
> Tatsuya Kawano (mr.)
> Tokyo, Japan
