hbase-user mailing list archives

From Ryan Rawson <ryano...@gmail.com>
Subject Re: HBase Failing on Large Loads
Date Fri, 12 Jun 2009 03:02:25 GMT
What are your VM/GC settings?  Let's tune that!

On Jun 11, 2009 7:08 PM, "Bradford Stephens" <bradfordstephens@gmail.com>
wrote:

OK, so I discovered the ulimit wasn't changed like I thought it was,
had to fool with PAM in Ubuntu.
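
For anyone checking the same thing: a minimal sketch, assuming a Unix host with Python available, that reads the open-file limit a process actually received, using the standard `resource` module. The helper names and the threshold in the usage note are illustrative only and are not taken from this thread.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_at_least(minimum):
    """True if the soft open-file limit is unlimited or >= minimum."""
    soft, _hard = open_file_limits()
    # RLIM_INFINITY is the platform's marker for "no limit".
    return soft == resource.RLIM_INFINITY or soft >= minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={soft} hard={hard}")
```

Run it as the same user, and from the same kind of session, that starts the regionserver, since the limit is inherited per session. A call such as `limit_is_at_least(32768)` uses a hypothetical threshold; substitute whatever value you intended to configure.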

Everything's running a little better, and I cut the data size by 66%.

It took a while, but one of the machines with only 2 cores failed, and
I caught it in the moment. Then 2 other machines failed a few minutes
later in a cascade. I'm thinking that HBase + Hadoop takes up so much
CPU time that the machine gradually stops responding to heartbeats.
Does that seem rational?

Here's the first regionserver log: http://pastebin.com/m96e06fe
I wish I could attach the log of one of the regionservers that failed
a few minutes later, but it's 708MB! Here are some examples from the tail:

2009-06-11 19:00:18,418 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report
to master for 906196 milliseconds - retrying
2009-06-11 19:00:18,419 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: error getting
store file index size for 944890031/url:
java.io.FileNotFoundException: File does not exist:
hdfs://dttest01:54310/hbase-0.19/joinedcontent/944890031/url/mapfiles/2512503149715575970/index
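
For scale, 906196 milliseconds is roughly 15 minutes without a successful report to the master. A rough sketch for pulling those stall warnings out of a log too large to read by hand follows. It assumes the message format matches the excerpt above; the function name and regular expression are illustrative and are not part of HBase.

```python
import re

# Matches the WARN line quoted above. \s+ between words tolerates the
# line wrapping introduced by the email client.
STALL = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+WARN\s+\S+:\s+"
    r"unable to report\s+to master for\s+(\d+)\s+milliseconds"
)


def find_report_stalls(log_text):
    """Return a list of (timestamp, stall_ms) for each stall warning."""
    return [(ts, int(ms)) for ts, ms in STALL.findall(log_text)]


SAMPLE = """2009-06-11 19:00:18,418 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report
to master for 906196 milliseconds - retrying
"""

if __name__ == "__main__":
    for ts, ms in find_report_stalls(SAMPLE):
        print(f"{ts}: stalled {ms / 60000:.1f} minutes")
```

For a 708MB file, apply the same pattern line by line while iterating over the open file rather than reading it all into memory; in the unwrapped log each warning sits on a single line.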

The HBase Master log is surprisingly quiet...

Overall, I think HBase just isn't happy on a machine with two
single-core procs, and when they start dropping like flies, everything
goes to hell. Do my log files support this?

Cheers,
Bradford

On Wed, Jun 10, 2009 at 4:01 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
> Hey,
>
> Looks lke you h...
