hbase-user mailing list archives

From "Jim Kellerman (POWERSET)" <Jim.Keller...@microsoft.com>
Subject RE: HBase and failure notification
Date Thu, 26 Feb 2009 19:13:20 GMT
> -----Original Message-----
> From: david.vancouvering@gmail.com [mailto:david.vancouvering@gmail.com]
> On Behalf Of David Van Couvering
> Sent: Thursday, February 26, 2009 10:18 AM
> To: hbase-user@hadoop.apache.org
> Subject: HBase and failure notification
> 
> Hey, all.  I'm doing a bit of a survey of distributed key/value stores
> out there.  HBase looks pretty interesting, nice to see an open source
> version of BigTable out there.
> 
> HBase is obviously clustered, but what I can't figure out is how it does
> cluster management.  It looks like you have to configure it to tell it
> all the machines that have region servers, and that implies to me that
> *you* have to start and manage the region servers - HBase doesn't do any
> of that for you.

There are start and stop scripts that will start up the master and region
servers.

> So I think that means that it doesn't have any node monitoring
> support - you have to have your own monitoring system that detects
> failed nodes and notifies you and/or restarts them for you.

HBase has a web UI that you can use to monitor the state of the cluster.
The master does detect when a region server becomes unreachable.

If you mean machine failure, HBase does not have built-in hardware
monitoring, but you can use Ganglia to monitor hardware status. HBase can
also feed its own metrics to Ganglia.
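
As a rough illustration of how a master can notice that a region server has
become unreachable, here is a minimal heartbeat-timeout sketch in Python. It
is not HBase's actual code: the class name `LeaseMonitor`, its method names,
and the 30-second timeout are all assumptions made up for this example.

```python
import time


class LeaseMonitor:
    """Toy failure detector: a server is 'unreachable' once its lease expires.

    Hypothetical illustration only; not the HBase master's implementation.
    """

    def __init__(self, timeout_seconds=30.0, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds  # assumed lease length
        self.clock = clock                      # injectable so tests can fake time
        self.last_seen = {}                     # server name -> last heartbeat time

    def heartbeat(self, server):
        """Record that `server` has just reported in."""
        self.last_seen[server] = self.clock()

    def expired(self):
        """Return the sorted names of servers whose lease has run out."""
        now = self.clock()
        return sorted(
            name
            for name, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )
```

Each server calls `heartbeat()` periodically, and the monitor polls
`expired()` to find servers that have gone quiet for longer than the timeout.
Note that this only detects a silent process; it says nothing about why the
machine went away, which is where an external tool such as Ganglia helps.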

> 
> Also, the architecture document says "if [the master server] detects a
> HRegionServer is no longer reachable, it will split the HRegionServer's
> write-ahead log so that there is now one write-ahead log for each region
> that the HRegionServer was serving. After it has accomplished this, it
> will reassign the regions that were being served by the unreachable
> HRegionServer"
> 
> This seems to imply that even though the HRegionServer is unreachable,
> somehow its write-ahead log and the regions it was serving are still
> reachable.  Perhaps I don't fully understand HDFS, but is this a
> guarantee when the node hosting the HRegionServer is down?  What happens
> if you can't get to the write-ahead log and/or some of the regions the
> region server was serving?

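The write-ahead log and the region data files are stored in HDFS, not on
the region server's local disk. HDFS replicates data to multiple machines
(3 by default), so unless you have a catastrophic outage, it is very
unlikely that the data will be completely unreachable.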
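
To make the log-splitting step from the architecture document concrete, here
is a toy sketch in Python. It is not how HBase implements it: the function
name `split_log` and the `(region, edit)` tuple format are assumptions chosen
for the example, and in a real cluster the log lives in files in HDFS rather
than in an in-memory list.

```python
from collections import defaultdict


def split_log(entries):
    """Group write-ahead-log entries by region, preserving write order.

    `entries` is an iterable of (region, edit) pairs in the order they were
    appended to the shared log of one region server. Returns a dict mapping
    each region name to its own list of edits.
    Hypothetical illustration only; not HBase's implementation.
    """
    per_region = defaultdict(list)
    for region, edit in entries:
        per_region[region].append(edit)
    return dict(per_region)


# Example: one shared log holding interleaved edits for two regions.
shared_log = [
    ("regionA", "put row1"),
    ("regionB", "put row9"),
    ("regionA", "delete row1"),
]
logs_by_region = split_log(shared_log)
```

After the split, each region has a log of its own, so whichever server is
assigned that region can replay only the edits that belong to it.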

> Thanks,
> 
> David
> 
> --
> David W. Van Couvering
> 
> I am looking for a senior position working on server-side Java systems.
>  Feel free to contact me if you know of any opportunities.
> 
> http://www.linkedin.com/in/davidvc
> http://davidvancouvering.blogspot.com
> http://twitter.com/dcouvering
