Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Date: Mon, 3 Jan 2011 14:44:28 -0800
Subject: Re: CMF & NodeIsDeadException
From: Stack <stack@duboce.net>
To: user@hbase.apache.org

On Mon, Jan 3, 2011 at 2:13 PM, Wayne <wav100@gmail.com> wrote:
> Here are the new settings we are trying out. They seemed to "help" with
> cass. In the end I assume we will need a script to do rolling restarts or
> better yet hbase does it on its own!!
>
> Thanks for the help!
>
>        -XX:+UseCMSInitiatingOccupancyOnly
>        -XX:CMSInitiatingOccupancyFraction=60


This seems low.  It means CMS kicks in once the tenured heap is 60%
full, so lots of CPU gets spent GC'ing.  That said, it's good to start
low; you can work up from there.
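
To put rough numbers on that trade-off, here is a small sketch of where a
concurrent cycle would start at different fractions. The 6144 MB tenured
size is an assumption (an 8GB heap split by NewRatio=3, per the settings
quoted in this thread), and the function name is made up for illustration.

```python
def cms_trigger_mb(old_gen_mb, occupancy_fraction):
    """Return (trigger, headroom) in MB for a CMSInitiatingOccupancyFraction.

    trigger  = tenured occupancy at which a concurrent cycle starts
    headroom = tenured space still free for promotions at that point
    """
    trigger = old_gen_mb * occupancy_fraction / 100
    return trigger, old_gen_mb - trigger


OLD_GEN_MB = 6144  # assumed: 8 GB heap with NewRatio=3

for fraction in (60, 70, 80, 90):
    trigger, headroom = cms_trigger_mb(OLD_GEN_MB, fraction)
    print(f"fraction={fraction}: CMS starts at {trigger:.0f} MB, "
          f"{headroom:.0f} MB headroom")
```

A lower fraction starts cycles earlier and more often (more CPU) but
leaves more room for promotions while the cycle is still running.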


>        -XX:+CMSParallelRemarkEnabled
>        -XX:SurvivorRatio=8
>        -XX:NewRatio=3

This is fine to start with but if it were me, I'd make the young gen
bigger (if objects don't make it up into the tenured heap, they'll not
get in the way of subsequent promotions).  What proportion of the heap
was the young gen when you had the long pauses?
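
For reference, a back-of-the-envelope sketch of what those two ratios
would give on the 8GB heap mentioned earlier in the thread. It assumes
the heap is fixed at 8192 MB and that the JVM honors the ratios exactly;
a real JVM aligns and may adjust these sizes, and the helper name is
hypothetical.

```python
def generation_sizes_mb(heap_mb, new_ratio, survivor_ratio):
    """Split a fixed heap the way NewRatio and SurvivorRatio describe.

    NewRatio=N      -> tenured : young = N : 1
    SurvivorRatio=S -> eden : one survivor space = S : 1
                       (the young gen holds eden plus two survivor spaces)
    """
    young = heap_mb / (new_ratio + 1)
    old = heap_mb - young
    survivor = young / (survivor_ratio + 2)
    eden = young - 2 * survivor
    return {"young": young, "old": old, "eden": eden, "survivor_each": survivor}


HEAP_MB = 8192  # assumed: -Xms and -Xmx both set to 8 GB

sizes = generation_sizes_mb(HEAP_MB, new_ratio=3, survivor_ratio=8)
for name, mb in sizes.items():
    print(f"{name:>13}: {mb:7.1f} MB ({100 * mb / HEAP_MB:.1f}% of heap)")
```

Under those assumptions the young gen is a quarter of the heap; dropping
NewRatio to 2 would make it a third.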


>        -XX:MaxTenuringThreshold=1
>

Setting this to 1 means objects get promoted to the tenured heap
after surviving only one young GC.  I wonder how things would run if
you set this to a higher number?  (Again, my rationale is that if
objects don't get into the tenured space in the first place, then they
can't be in the way when it comes time to promote subsequent objects
from young to tenured.)  It might be something to mess with later.
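
A toy model of that rationale, in case it helps. The lifetime mix below
is hypothetical (not measured from HBase), and the model ignores
survivor-space overflow and the JVM's adaptive threshold, both of which
can promote objects earlier than the setting suggests.

```python
def promoted_count(lifetimes, max_tenuring_threshold):
    """Count objects that end up in the tenured generation.

    lifetimes: for each object, the number of young GCs at which it is
    still reachable. An object is copied between survivor spaces, aging by
    one per GC, and is promoted at the first GC where its age has already
    reached the threshold, i.e. when it is alive at more than
    `max_tenuring_threshold` young GCs.
    """
    return sum(1 for life in lifetimes if life > max_tenuring_threshold)


# Hypothetical mix: most objects die young, a few live a long time.
lifetimes = [0] * 700 + [1] * 150 + [2] * 80 + [3] * 40 + [10] * 20 + [100] * 10

for threshold in (1, 4, 15):
    n = promoted_count(lifetimes, threshold)
    print(f"MaxTenuringThreshold={threshold}: "
          f"{n} of {len(lifetimes)} objects promoted")
```

In this made-up mix, raising the threshold keeps the medium-lived objects
out of tenured, at the cost of copying them around the survivor spaces
for longer.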

GC tuning, the "joy of java", is a little bit of a black art.  It's
particularly black given that a bunch of folks think there is no tuning
that will get you away from an occasional stop-the-world GC, at least
when running the CMS collector.

Keep us posted.
St.Ack


> On Mon, Jan 3, 2011 at 5:05 PM, Stack <stack@duboce.net> wrote:
>
>> On Mon, Jan 3, 2011 at 12:50 PM, Wayne <wav100@gmail.com> wrote:
>> > We have an 8GB heap. What should newsize be? I just had another node die
>> > hard after going into a CMF storm. I swear it had solid CMFs 30+ in a row.
>> >
>>
>> Did a full stop-the-world GC run in between?  It should have cleaned
>> up fragmentation.
>>
>> > I have no idea what eden space is or how to see what it is. ??
>> >
>>
>> Sorry.  There's a bunch of 'cute' terms used for describing the two
>> heap areas in the JVM.  Basically, new stuff goes into the 'new' or
>> 'eden' area first.  If it sticks around through N (configurable) GCs,
>> it gets promoted to old or tenured generation (there are other names
>> for these notions of young and old).  The garbage collection
>> algorithms done in the two heaps differ.  See the Ted citation for
>> more on the gruesome details (though come up to a newer version of
>> that doc).  The JVM is supposed to work ergonomically but it just
>> ain't smart enough dealing w/ HBase/Cass loadings it seems (e.g. it
>> keeps growing the new/eden space pathologically it would seem).
>>
>>
>> > Not knowing what else to do I will start using some of the Cassandra
>> > settings I used to improve it by setting the occupancy fraction. Any other
>> > ideas???
>> >
>>
>> Which config. you talking of? -XX:CMSInitiatingOccupancyFraction?
>> That's a good one to turn down from the defaults.  Should help put off
>> promotion failures a while.
>>
>>
>> St.Ack
>>
>