hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HBase operations
Date Mon, 04 Jul 2011 18:19:42 GMT
HBCK can report false positives while regions are in transition, which happens during normal
operations such as splits. Stack could probably say more here.

I don't know firsthand but I would assume that Facebook starts with HBCK results and then
has some additional process for determining if something is really wrong. 

I think with a little effort HBCK could be taught to be smart enough to become the
basis for a Nagios sensor, or something similar, for widespread use.
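
As a rough illustration of that idea, here is a minimal sketch of a Nagios-style check that
re-runs HBCK before alerting, so that a transient report caused by a region in transition does
not page anyone. It is not an existing tool. It assumes that `hbase` is on the PATH and that
the hbck summary ends with a "Status: OK" or "Status: INCONSISTENT" line; the function names,
the retry count and the delay are hypothetical choices made for the example.

```python
import subprocess
import time

# Standard Nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def run_hbck():
    """Run `hbase hbck` and return its combined output.

    Assumption: the `hbase` launcher script is on the PATH.
    """
    proc = subprocess.run(["hbase", "hbck"], stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.stdout


def check(run=run_hbck, attempts=3, delay_seconds=60, sleep=time.sleep):
    """Return a Nagios (exit_code, message) pair.

    CRITICAL is returned only if every attempt reports an inconsistency,
    which filters out reports that clear up once a split has finished.
    Assumption: hbck prints 'Status: OK' or 'Status: INCONSISTENT'.
    """
    for attempt in range(attempts):
        output = run()
        if "Status: OK" in output:
            return OK, "HBCK OK"
        if "Status: INCONSISTENT" not in output:
            return UNKNOWN, "HBCK UNKNOWN - no status line in output"
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give regions in transition time to settle
    return CRITICAL, "HBCK CRITICAL - inconsistent on %d consecutive runs" % attempts


def main():
    """Print the status line and return the exit code for Nagios."""
    code, message = check()
    print(message)
    return code
```

A wrapper script would end with `sys.exit(main())`. Retrying is only one way to suppress
false positives; a smarter check could instead look at which regions are in transition.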

   - Andy

----- Original Message -----
> From: Joseph Pallas <joseph.pallas@oracle.com>
> To: user@hbase.apache.org
> Cc: 
> Sent: Sunday, July 3, 2011 10:13 PM
> Subject: HBase operations
> One of the really useful things about the Hadoop Summit and HBase meetup was 
> hearing about what people are doing to manage, monitor and maintain the health 
> of their systems.
> So, I was just reading the Facebook SIGMOD 2011 paper 
> <http://borthakur.com/ftp/RealtimeHadoopSigmod2011.pdf>, and I came across 
> this line: “Nowadays we run HBCK almost continuously against our production 
> clusters to catch problems as early as possible.”
> Does anyone else do this?  Can anyone from Facebook comment on what sort of 
> problems you've caught this way?  My naïve thought is that hbck would be 
> recommended after some kind of notable failure but I wouldn't have thought 
> it likely to turn up problems if run routinely.  Maybe my perspective will be 
> different when I get a real cluster going instead of a virtual one.  It would be 
> nice to know what to watch out for.
> Thanks.
> joe
