hbase-dev mailing list archives

From "Jim Kellerman (POWERSET)" <Jim.Keller...@microsoft.com>
Subject RE: thinking about hbase 0.20
Date Fri, 03 Apr 2009 00:53:33 GMT

sync() is not good enough nor is syncFS(). What we need is HADOOP-4379.
However, the current patch does not recover the (HDFS file) lease properly.

Stack and I cannot contribute to Hadoop, but if someone else in hbase-dev
wants to help Dhruba out, I'm sure he'd welcome contributions. Be warned,
however, that if you haven't ventured into the depths of the namenode
and datanode, it's *really* complicated.

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)


> -----Original Message-----
> From: Ryan Rawson [mailto:ryanobjc@gmail.com]
> Sent: Thursday, April 02, 2009 5:42 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: thinking about hbase 0.20
> 
> I want to talk about sync() in HDFS for a bit...
> 
> I had a cluster crash, OOMEs out the butt, 17/19 machines were dead when I
> got to the scene.
> 
> What I found was in .META. there were 2-3x as many regions as were actually
> on disk.  Tons of older entries from parent splits. Looks like a bunch of
> updates and deletes weren't persisted.  And by a bunch, I mean a SHIT TON.
> It was insane.  I had to write HbaseFsck.java as an experiment to recover
> without rm -rf /hbase
> 
> So, what will be in hadoop-0.20 to minimize this kind of horrible data
> loss?
> 
> Is this the 'sync()' call that is on-again-off-again reliable?
> 
> What about append?  Do we really need append?  Syncing an open file to
> persist data is good enough, no?
> 
> -ryan
> 
> On Thu, Apr 2, 2009 at 5:34 PM, Jim Kellerman (POWERSET) <
> Jim.Kellerman@microsoft.com> wrote:
> 
> > > -----Original Message-----
> > > From: Erik Holstad [mailto:erikholstad@gmail.com]
> > > Sent: Thursday, April 02, 2009 5:09 PM
> > > To: hbase-dev@hadoop.apache.org
> > > Subject: Re: thinking about hbase 0.20
> > >
> > > So the way I see it, from our point of view, we can probably get 0.20 out
> > > the door a week after that meeting, so maybe a week and a half after Stack
> > > gets back.
> >
> > We still have to wait for hadoop-0.20 which has no release candidate yet.
> > However pushing tasks out is still a good idea so that we can spend the
> > time between hadoop-0.20 release candidate and hbase-0.20 fixing issues
> > which I'm certain we will find. All in all this should result in a more
> > timely and stable release for hbase-0.20.
> >
> > -Jim
> >
