Mailing-List: contact hbase-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hbase-user@hadoop.apache.org
MIME-Version: 1.0
In-Reply-To: <860544ed0906112225o63d76025jafb8efa31f09967e@mail.gmail.com>
References: <860544ed0906091013k6dc054cfm3c8e52d8b52fdc6c@mail.gmail.com>
	 <7c962aed0906100040q609ed73cyf7911a489c2a7d1e@mail.gmail.com>
	 <860544ed0906101450s469d57f3h698bfd67a6099165@mail.gmail.com>
	 <78568af10906101454l225161al1f149d12d598c303@mail.gmail.com>
	 <78568af10906101455x3c531637n829bc12987934661@mail.gmail.com>
	 <860544ed0906101532v4e14cda7p300fed6bed3fa18a@mail.gmail.com>
	 <78568af10906101601w6c1f3896kf1a43fd0990d78f@mail.gmail.com>
	 <860544ed0906111907m2e698d25i6c0c65b7b3b549d5@mail.gmail.com>
	 <78568af10906112002j2fed1e63h8b840e4b10f1aa9a@mail.gmail.com>
	 <860544ed0906112225o63d76025jafb8efa31f09967e@mail.gmail.com>
Date: Thu, 11 Jun 2009 22:29:30 -0700
Message-ID: <78568af10906112229j411dd480ve1635a1e0c0b5001@mail.gmail.com>
Subject: Re: HBase Failing on Large Loads
From: Ryan Rawson <ryanobjc@gmail.com>
To: hbase-user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=00163645883e25fd33046c1ffe2f

--00163645883e25fd33046c1ffe2f
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Since you are on a 2-4 CPU system, you need to use:

"-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"

What does your verbose GC log say?  Are you getting huge pauses?
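
If it helps, here is a rough sketch for scanning a GC log for long pauses. It assumes the JVM was started with GC logging on (for example -verbose:gc -XX:+PrintGCDetails) and that durations show up in the log as "<number> secs". The function name and the 1-second threshold are made up for illustration, so adjust them to your setup.

```python
import re

# Matches durations such as "0.0123456 secs" in HotSpot GC log lines.
# Assumption: the log uses this "<float> secs" form for collection times.
PAUSE_RE = re.compile(r"(\d+\.\d+) secs")


def long_pauses(lines, threshold_secs=1.0):
    """Return (line_number, seconds) for lines whose longest duration exceeds the threshold.

    Lines that mention "concurrent" are skipped, on the assumption that
    they describe CMS phases running alongside the application rather
    than stop-the-world pauses.
    """
    hits = []
    for lineno, line in enumerate(lines, start=1):
        if "concurrent" in line:
            continue
        durations = [float(value) for value in PAUSE_RE.findall(line)]
        if durations and max(durations) > threshold_secs:
            hits.append((lineno, max(durations)))
    return hits
```

To use it, open the GC log file and pass the file object as lines. Anything it returns in the tens of seconds would be long enough to miss heartbeats.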

You can also raise the ZooKeeper timeouts. Try this in zoo.conf on both the server and the client:

tickTime=20000
initLimit=5
syncLimit=2

and in hbase-site.xml:
<property>
<name>zookeeper.session.timeout</name>
<value>60000</value>
</property>

This will give you a much higher ZooKeeper timeout.
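
As a sanity check on those numbers: ZooKeeper only grants session timeouts between 2x and 20x tickTime, so the two settings have to agree. Here is a small sketch of that arithmetic. The function names are mine, not part of any ZooKeeper or HBase API.

```python
def session_timeout_bounds(tick_time_ms):
    """Return the (min, max) session timeout ZooKeeper allows for a tickTime.

    ZooKeeper requires a session timeout of at least 2 ticks and at most
    20 ticks.
    """
    return 2 * tick_time_ms, 20 * tick_time_ms


def negotiated_timeout(requested_ms, tick_time_ms):
    """Clamp a requested session timeout into the allowed range."""
    low, high = session_timeout_bounds(tick_time_ms)
    return max(low, min(requested_ms, high))


# With the settings above, the requested 60000 ms fits in the range.
print(session_timeout_bounds(20000))
print(negotiated_timeout(60000, 20000))
# With a small tickTime such as 2000 ms, the same request would be capped.
print(negotiated_timeout(60000, 2000))
```

This is why tickTime is raised together with zookeeper.session.timeout: with a small tickTime, a 60000 ms request gets capped at 20 ticks.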

Let us know!


On Thu, Jun 11, 2009 at 10:25 PM, Bradford Stephens <
bradfordstephens@gmail.com> wrote:

> Thanks for helping me, o people of awesomeness.
>
> VM settings are 1000 for HBase, and I used the GC laid out in the
> Wiki. Also, " -server " ... basically, I did everything here :
> http://wiki.apache.org/hadoop/PerformanceTuning , and on
>
> http://ryantwopointoh.blogspot.com/2009/01/performance-of-hbase-importing.html
>
> On Thu, Jun 11, 2009 at 8:02 PM, Ryan Rawson<ryanobjc@gmail.com> wrote:
> > What are your VM/GC settings?  Let's tune that!
> >
> > On Jun 11, 2009 7:08 PM, "Bradford Stephens"
> > <bradfordstephens@gmail.com> wrote:
> >
> > OK, so I discovered the ulimit wasn't changed like I thought it was,
> > had to fool with PAM in Ubuntu.
> >
> > Everything's running a little better, and I cut the data size by 66%.
> >
> > It took a while, but one of the machines with only 2 cores failed, and
> > I caught it in the moment. Then 2 other machines failed a few minutes
> > later in a cascade. I'm thinking that HBase + Hadoop takes up so much
> > processor time that the machine gradually stops responding to heartbeats.
> > Does that seem rational?
> >
> > Here's the first regionserver log: http://pastebin.com/m96e06fe
> > I wish I could attach the log of one of the regionservers that failed
> > a few minutes later, but it's 708MB! Here's some examples of the tail:
> >
> >  2009-06-11 19:00:18,418 WARN
> > org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report
> > to master for 906196 milliseconds - retrying
> > 2009-06-11 19:00:18,419 WARN
> > org.apache.hadoop.hbase.regionserver.HRegionServer: error getting
> > store file index size for 944890031/url:
> > java.io.FileNotFoundException: File does not exist:
> > hdfs://dttest01:54310/hbase-0.19/joinedcontent/944890031/url/mapfiles/2512503149715575970/index
> >
> > The HBase Master log is surprisingly quiet...
> >
> > Overall, I think HBase just isn't happy on a machine with two
> > single-core procs, and when they start dropping like flies, everything
> > goes to hell. Do my log files support this?
> >
> > Cheers,
> > Bradford
> >
> > On Wed, Jun 10, 2009 at 4:01 PM, Ryan Rawson<ryanobjc@gmail.com> wrote:
> >
> > Hey, > > Looks like you h...
> >
>

--00163645883e25fd33046c1ffe2f--