hbase-user mailing list archives

From Anze <anzen...@volja.net>
Subject Re: HBase stability
Date Tue, 14 Dec 2010 09:46:18 GMT

First of all, thank you all for the answers. I appreciate it! 

To recap:
- 0.20.4 is known to be "fragile"
- upgrading to 0.89 (cdh3b3) would improve stability
- GC should be monitored and the system tuned if necessary (not sure how to do that yet :)
- memory should be at least 2GB, preferably 4GB+ (we can't go that far)
- more nodes would help with stability issues
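On the GC point above, a minimal sketch of what the tuning might look like in hbase-env.sh. This is an illustrative assumption for the 0.20/0.89-era JVM, not a tested recommendation - the heap size and occupancy fraction would need tuning per cluster:

```shell
# hbase-env.sh - illustrative sketch, values are assumptions to be tuned

# Give HBase more than the 1GB default heap (value is in MB):
export HBASE_HEAPSIZE=2000

# Turn on GC logging so long pauses can actually be seen, and use the
# concurrent collector to keep stop-the-world pauses short:
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails \
  -XX:+PrintGCTimeStamps -Xloggc:$HBASE_HOME/logs/gc-hbase.log \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70"
```

Long pauses showing up in gc-hbase.log just before a crash would suggest the region server stalled long enough to be declared dead, rather than failing outright.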

@Jonathan: yes, we are using 2 nodes that run both Hadoop (namenode, sec. 
namenode, datanodes, jobtracker, tasktrackers) and HBase. The reason is that 
performance-wise we don't need more than that yet, but we have plans to make 
the operation much larger in the future. So while this is in production, it is 
really a test case for a much larger system. 
However, Hadoop runs reliably, even under pressure (I do understand it is a 
much more mature project, though). I would expect HBase to be written with the 
mantra "any machine may fail at any time" in mind - and with error recovery in 
that spirit. In our experience, with 0.20.4 this just isn't the case (data 
loss of a few hours' worth of Put()s is very common when it crashes). But I 
really hope we can make it work reliably; we have put a lot of work into 
building a system around it... We'll see how it goes with 0.89 (fingers 
crossed :). 
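For anyone else who hits the "Too many open files" trap mentioned in my original message below, a sketch of the usual fix - raising the file-descriptor limit for the user running HBase. This assumes Linux with PAM limits and a daemon user named "hbase"; the 32768 figure is the commonly suggested order of magnitude, not a number we have validated:

```shell
# Check the limit the HBase process actually runs with:
ulimit -n

# Raise it persistently by adding a line to /etc/security/limits.conf
# (assuming the daemons run as user "hbase"):
#   hbase  -  nofile  32768
# then restart the HBase daemons under a fresh login so the new limit applies.
```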

Again, thank you all for the answers!


On Monday 13 December 2010, Geoff Hendrey wrote:
> We were having no end of a "buffet" of errors and stability problems with
> 0.20.3 when we ran big mapreduce jobs to insert data. We upgraded to 0.20.6
> last week and have not seen any instability. Just my anecdotal
> experience.
> -geoff
> -----Original Message-----
> From: Anze [mailto:anzenews@volja.net]
> Sent: Monday, December 13, 2010 2:41 AM
> To: user@hbase.apache.org
> Subject: HBase stability
> Hi all!
> We have been using HBase 0.20.4 (cdh3b1) in production on 2 nodes for a few
> months now, and we are having constant issues with it. We fell into all the
> standard traps (like "Too many open files", network configuration
> problems, ...). All in all, we had about one crash every week or so.
> Fortunately we are still using it just for background processing, so our
> service didn't suffer directly, but we have lost huge amounts of time just
> fixing the data errors that resulted from data not being written to
> permanent storage. Not to mention fixing the issues themselves.
> As you can probably understand, we are very frustrated with this and are
> seriously considering moving to another bigtable.
> Right now, HBase crashes whenever we run a very intensive rebuild of a
> secondary index (a normal table, but we use it as a secondary index) of a
> huge table. I have found this:
> http://wiki.apache.org/hadoop/Hbase/Troubleshooting
> (see problem 9)
> One of the lines reads:
> "Make sure you give plenty of RAM (in hbase-env.sh), the default of 1GB
> won't be able to sustain long running imports."
> So, if I understand correctly, no matter how HBase is set up, if I run an
> intensive enough application, it will choke? I would expect it to be slower
> when under (too much) pressure, but not to crash.
> Of course, we will somehow solve this issue (working on it), but... :(
> What are your experiences with HBase? Is it stable? Is it just us and the
> way we set it up?
> Also, would upgrading to 0.89 (cdh3b3) help?
> Thanks,
> Anze
