hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: State of HA
Date Mon, 01 Jun 2009 20:56:34 GMT
Hey,

Stack is saying that with HADOOP-4379 it fails about 1 time in 5: recovery
takes more than 15 minutes, i.e. a potentially unlimited amount of time.  That
patch seems to rely on lease recovery, so it may not be the final answer
for us.

Now, on the subject of the rest of things, under Zookeeper we are doing a
much better job at HA.  Regionserver crashes are detected significantly faster
than the 2 minute lease timeout.  With my fixes you can take down any
regionserver without getting 'stuck' with an unassigned ROOT/META
(previously a problem).
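
To illustrate why a shorter session timeout means faster crash detection,
here is a toy sketch.  It is not HBase or ZooKeeper code: the 10 second
session timeout is an assumed value chosen only for illustration, and the
function names are hypothetical.  It just compares how long after a crash
each kind of timeout would fire, measured from the last heartbeat.

```python
# Toy model, not HBase code: compares crash detection delay under a fixed
# lease timeout versus a shorter ZooKeeper-style session timeout.

LEASE_TIMEOUT_S = 120.0    # the 2 minute lease timeout mentioned above
SESSION_TIMEOUT_S = 10.0   # assumed value, purely for illustration


def detection_delay(last_heartbeat_s: float, crash_s: float, timeout_s: float) -> float:
    """Seconds between the crash and the moment the timeout fires.

    The timeout is measured from the last heartbeat seen before the crash.
    """
    if crash_s < last_heartbeat_s:
        raise ValueError("crash cannot precede the last heartbeat")
    expiry_s = last_heartbeat_s + timeout_s
    return max(0.0, expiry_s - crash_s)


def compare(last_heartbeat_s: float, crash_s: float) -> dict:
    """Detection delay for both mechanisms, given the same crash."""
    return {
        "lease": detection_delay(last_heartbeat_s, crash_s, LEASE_TIMEOUT_S),
        "session": detection_delay(last_heartbeat_s, crash_s, SESSION_TIMEOUT_S),
    }


if __name__ == "__main__":
    # A server heartbeats at t=0 and crashes one second later.
    print(compare(last_heartbeat_s=0.0, crash_s=1.0))
```

With these assumed numbers the lease fires 119 seconds after the crash and
the session timeout fires after 9 seconds.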

I have noticed on trunk I can kill and restart the master w/o taking down
the cluster.  During master start-up it does a fairly good job of detecting
node status and otherwise recovering.  I can't say much about master
elections yet.

The HA story is shaping up nicely.

To end on a sour note, HDFS Namenode is still a SPOF.  When we're done with
HBase 0.20 it should be the only SPOF.

-ryan

On Mon, Jun 1, 2009 at 1:50 PM, <andrew.wharton@nokia.com> wrote:

> I am trying to parse this: are you implying that I can expect a 20% ("1 out
> of 5 or so") success getting HA to work with this code?
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of ext
> stack
> Sent: 01 June, 2009 13:27
> To: hbase-user@hadoop.apache.org
> Subject: Re: State of HA
>
> You can pull TRUNK and try it with HADOOP-4379.
>
> The master failover works as J-D suggests.  It needs some polish but that's
> on its way.  HADOOP-4379 will get you a sync that works most of the time
> (1 out of 5 or so in my testing) but hopefully that'll be addressed soon
> too.  You'll also need HBASE-1470.  It's the bit of code that exploits
> HADOOP-4379 (when configuration is set right).
>
> If you need help setting up stuff, you know where to find us.  Issues we
> want to hear about because we're hoping to tell the above as part of our
> 0.20.0 release story.
>
> Yours,
> St.Ack
>
> On Mon, Jun 1, 2009 at 7:59 AM, <andrew.wharton@nokia.com> wrote:
>
> > Hello,
> >
> > I have been looking at Jira and trying to get a current snapshot of the
> > state of HA for HBase/Hadoop. I know that the zookeeper integration is the
> > core of the HA story, but when is that slated for a "stable" debut? Is there
> > anything that is currently in svn that we can pull and test?
> >
> > TIA,
> >
> > Andrew
> >
> >
>
