hbase-user mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: HBase Exceptions on version 0.20.1
Date Tue, 10 Nov 2009 14:25:48 GMT
Hi JG,

I fully agree that administration of Hadoop/HDFS/HBase clusters is an active process. HDFS
and HBase (and MapReduce, etc.) are easily scaled horizontally, but that requires an
admin to actively make more resources available and join them to the cluster in time to meet
demand. We simply expect this because we anticipate terascale or petascale clusters
of many nodes, which require professional administration.

One can corrupt, for example, an Oracle database by overcommitting memory on the server until
the kernel panics in get_free_page...

However, HBase has opportunities to sense overloading and take self-preserving actions that
it is not yet taking advantage of. We can work toward that.

But it will always be the case that, in some circumstances, without operational care it may
already be too late for HBase to manage the situation effectively by the time it becomes aware
of the problem.
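
To make the idea concrete, here is a purely hypothetical guard (invented class and method
names, not existing HBase code) that tracks recent flush latencies and starts shedding writes
once the average crosses a threshold:

    import java.util.ArrayDeque;
    import java.util.Deque;

    // Hypothetical overload guard: a sketch of the idea, not part of HBase.
    public class OverloadGuard {
        private static final int WINDOW = 10;              // recent samples to keep
        private static final long LATENCY_ALARM_MS = 5000; // average flush latency treated as "overloaded"

        private final Deque<Long> recentFlushMillis = new ArrayDeque<Long>();

        // Called by the write path after each flush with the observed latency.
        public synchronized void recordFlush(long millis) {
            recentFlushMillis.addLast(millis);
            if (recentFlushMillis.size() > WINDOW) {
                recentFlushMillis.removeFirst();
            }
        }

        // Called before accepting a new write; true means "shed load now".
        public synchronized boolean shouldRejectWrites() {
            if (recentFlushMillis.isEmpty()) {
                return false;
            }
            long sum = 0;
            for (long m : recentFlushMillis) {
                sum += m;
            }
            return (sum / recentFlushMillis.size()) > LATENCY_ALARM_MS;
        }
    }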

   - Andy




________________________________
From: Jonathan Gray <jlist@streamy.com>
To: hbase-user@hadoop.apache.org
Sent: Tue, November 10, 2009 3:23:20 AM
Subject: Re: HBase Exceptions on version 0.20.1

It's fairly easy to run HDFS into the ground if you eat up all the resources.

It's also fairly easy to run a Linux machine into the ground if you eat up all the resources,
or just about anything else by starving it of CPU.

I don't disagree with a read-only mode if the server is full, but in general I believe admins
of any production cluster need to be constantly aware of capacity and usage and to plan properly,
not just so things don't fill up but also from a performance POV.

I completely disagree with the statement "any hbase cluster will reach this tipping point
at some point in its lifetime as more and more data is added".  If someone continuously adds
data without paying any attention to capacity, then yes, this will happen, but that is the case
with anything that has finite resources.


We should certainly do everything we can to prevent data loss and corruption in all cases,
but we must also set realistic expectations.  One should not assume they can fill their
HBase cluster to the brink and expect everything to always be okay (even if that happens to be
the case).

JG

stack wrote:
> Agreed.  Please make an issue.
> 
> Meantime, it should be possible to have a cron run a script that checks
> cluster resources from time-to-time -- e.g. how full hdfs is, how much each
> regionserver is carrying -- and when it determines the needle is in the red,
> flip the cluster to be read-only.
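
A rough sketch of what such a cron-driven check could look like, assuming a later HBase/Hadoop
client API than the 0.20.1 release under discussion (FileSystem.getStatus, HBaseAdmin.listTables,
HTableDescriptor.setReadOnly, HBaseAdmin.modifyTable); the 80% threshold and the disable/enable
steps around modifyTable are assumptions to verify against your release:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    // Run from cron: if HDFS is nearly full, flag every user table read-only.
    public class ReadOnlyWhenFull {
        private static final double FULL_THRESHOLD = 0.80; // assumed "needle in the red" point

        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();

            FileSystem fs = FileSystem.get(conf);
            FsStatus status = fs.getStatus();
            double used = (double) status.getUsed() / status.getCapacity();
            if (used < FULL_THRESHOLD) {
                return; // plenty of room, nothing to do
            }

            HBaseAdmin admin = new HBaseAdmin(conf);
            for (HTableDescriptor desc : admin.listTables()) {
                if (desc.isReadOnly()) {
                    continue;
                }
                desc.setReadOnly(true);
                // Some releases require the table to be disabled before modifyTable;
                // check the behavior of your version before relying on this.
                admin.disableTable(desc.getName());
                admin.modifyTable(desc.getName(), desc);
                admin.enableTable(desc.getName());
            }
        }
    }

Checking how much each regionserver is carrying could be layered on top via the cluster status
APIs; the sketch above only covers the "how full HDFS is" half.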
> 
> St.Ack
> 
> On Mon, Nov 9, 2009 at 9:25 AM, elsif <elsif.then@gmail.com> wrote:
> 
>> The larger issue here is that any hbase cluster will reach this tipping
>> point at some point in its lifetime as more and more data is added.  We
>> need to have a graceful method to put the cluster into safe mode until
>> more resources can be added or the load on the cluster has been
>> reduced.  We cannot allow hbase to run itself into the ground causing
>> data loss or corruption under any circumstances.
>>
>> Andrew Purtell wrote:
>>> You should consider provisioning more nodes to get beyond this ceiling
>>> you encountered.
>>>
>>> DFS write latency spikes from 3 seconds to 6 seconds, to 15! Flushing
>>> cannot happen fast enough to avoid an OOME. Possibly there was even
>>> insufficient CPU to GC. The log entries you highlighted indicate the load
>>> you are exerting on your current cluster needs to be spread out over more
>>> resources than currently allocated.
>>> This:
>>> 
>>>> 2009-11-06 09:15:37,144 WARN org.apache.hadoop.hbase.util.Sleeper: We
>>>> slept 286007ms, ten times longer than scheduled: 10000
>>>
>>> indicates a thread that wanted to sleep for 10 seconds was starved for
>>> CPU for 286 seconds. Obviously Zookeeper timeouts and resulting HBase
>>> process shutdowns, missed DFS heartbeats possibly resulting in spurious
>>> declaration of dead datanodes, and other serious problems will result from
>>> this.
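
For reference, the shape of that check is simple to sketch generically (this is not the actual
org.apache.hadoop.hbase.util.Sleeper source, just an illustration): sleep for the scheduled
period, then compare the wall-clock time actually elapsed against the schedule and warn when it
is far larger, which is exactly what CPU starvation or a long GC pause produces.

    // Generic sketch of a "we slept far longer than scheduled" check; not the HBase Sleeper itself.
    public class ScheduledSleep {
        public static void sleepAndCheck(long scheduledMillis) throws InterruptedException {
            long start = System.currentTimeMillis();
            Thread.sleep(scheduledMillis);
            long elapsed = System.currentTimeMillis() - start;
            if (elapsed > 10 * scheduledMillis) {
                // Matches the shape of the warning quoted above: a 10s sleep that took 286s.
                System.err.println("WARN: slept " + elapsed + "ms, ten times longer than scheduled: "
                    + scheduledMillis);
            }
        }
    }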
>>> Did your systems start to swap?
>>> 
>>> When region servers shut down, the master notices this and splits their
>>> HLogs into per region reconstruction logs. These are the "oldlogfile.log"
>>> files. The master log will shed light on why this particular reconstruction
>>> log was botched. Would have happened at the master. The region server
>>> probably did do a clean shutdown. I suspect DFS was in extremis due to
>>> overloading so the split failed. The checksum error indicates incomplete
>>> write at the OS level. Did a datanode crash?
>>> HBASE-1956 is about making the DFS latency metric exportable via the
>>> Hadoop metrics layer, perhaps via Ganglia. Write latency above 1 or 2
>>> seconds is a warning. Anything above 5 seconds is an alarm.  It's a
>>> good indication that an overloading condition is in progress.
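
Applied mechanically, those thresholds amount to a classification like the following sketch
(names are invented for illustration; wiring the result into the Hadoop metrics layer or
Ganglia is out of scope here):

    // Classify a measured DFS write latency using the thresholds described above.
    public final class DfsWriteLatency {
        public enum Level { OK, WARNING, ALARM }

        public static Level classify(long latencyMillis) {
            if (latencyMillis > 5000) {
                return Level.ALARM;   // above 5 seconds: an overloading condition is likely in progress
            }
            if (latencyMillis > 1000) {
                return Level.WARNING; // above 1 to 2 seconds: keep an eye on the cluster
            }
            return Level.OK;
        }
    }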
>>> 
>>> The Hadoop stack, being pre 1.0, has some rough edges. Response to
>>> overloading is one of them. For one thing, HBase could be better about
>>> applying backpressure to writing clients when the system is under stress. We
>>> will get there. HBASE-1956 is a start.
>>>     - Andy
>>> 
>>> 
>> 
> 



      