From: Sandy Pratt <prattrs@adobe.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Date: Mon, 16 Jul 2012 15:55:03 -0700
Subject: RE: Hmaster and HRegionServer disappearance reason to ask
Message-ID: 
 <0D0534D89070F347A7ACC0D03FCE696B0720675BA1@NAMBX02.corp.adobe.com>
References: <201207021630247931834@163.com>
 <870C7774-966B-4A8C-9758-0571C603571D@gmail.com>
 <BEA49F2A6463A14ABAF42251D9680FA42E15270C65@EXCHANGE.email.total>
 <1341532480.92715.YahooMailNeo@web192501.mail.sg3.yahoo.com>
 <BEA49F2A6463A14ABAF42251D9680FA42E16D7E46E@EXCHANGE.email.total>
 <004101cd5f1f$66584c20$3308e460$@ch@huawei.com>
 <BEA49F2A6463A14ABAF42251D9680FA42E16D7E792@EXCHANGE.email.total>
 <004a01cd5fe5$ecbbe9e0$c633bda0$@ch@huawei.com>
In-Reply-To: <004a01cd5fe5$ecbbe9e0$c633bda0$@ch@huawei.com>

This sounds similar to something I've seen before, but in that case I
found the winning GC arguments to be something like

-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxDirectMemorySize=128m

(note the old gen parallel compacting collector rather than the ParNew
collector, which IIRC is used with concurrent GC by default)

I don't recall MaxDirectMemorySize on its own preventing massive off-heap
memory allocations from piling up.
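
For reference, a minimal sketch of how these arguments could be passed to
the RegionServer JVM. It assumes the stock conf/hbase-env.sh and its
HBASE_REGIONSERVER_OPTS variable; the 128m value is just the one quoted
above, so adjust it to your own load.

```shell
# conf/hbase-env.sh (sketch, not a tuned recommendation):
# append the GC and direct-memory flags to whatever RegionServer
# options are already set. ${VAR:-} keeps this safe if it was unset.
export HBASE_REGIONSERVER_OPTS="${HBASE_REGIONSERVER_OPTS:-} -XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxDirectMemorySize=128m"
```

The RegionServer has to be restarted for the new options to take effect.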

Just my 2 cents, YMMV.

Sandy

-----Original Message-----
From: Laxman [mailto:lakshman.ch@huawei.com]
Sent: Wednesday, July 11, 2012 9:22 PM
To: 'Pablo Musa'; user@hbase.apache.org
Subject: RE: Hmaster and HRegionServer disappearance reason to ask

> > 1) Fix the direct memory usage to a fixed value:
> > -XX:MaxDirectMemorySize=1G
>
> This flag should be in RS or DN?

We need to apply it to both, but the limit can be increased based on your
load (maybe 2G).
We can also apply it to all processes that show the following symptoms.

1) Allocated heap is a few GB (4 to 8)
2) VIRT/RES occupies double the heap (like 15GB) or even more
3) Long pauses in the GC log (even though allocated heap is just <=8GB)
4) Your application makes a lot of NIO/RMI calls (e.g. DataNode, RegionServer)
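
Symptom 2 can be checked roughly with a small shell helper like the one
below. This is only a sketch: check_offheap is a hypothetical name, it
assumes a Linux /proc filesystem, and the "double the heap" threshold is
simply the rule of thumb from the list above.

```shell
# Hypothetical helper: report VIRT and RES (in MB) for a PID and flag the
# process when RES is at least twice the configured heap size (in MB).
check_offheap() {
  pid="$1"
  heap_mb="$2"
  status="/proc/$pid/status"
  if [ ! -r "$status" ]; then
    echo "cannot read $status" >&2
    return 2
  fi
  # VmSize and VmRSS are reported in kB; convert to whole MB.
  virt_mb=$(awk '/^VmSize:/ {print int($2 / 1024)}' "$status")
  res_mb=$(awk '/^VmRSS:/ {print int($2 / 1024)}' "$status")
  echo "pid=$pid heap=${heap_mb}MB VIRT=${virt_mb}MB RES=${res_mb}MB"
  if [ "${res_mb:-0}" -ge $((heap_mb * 2)) ]; then
    echo "RES is at least double the heap: check off-heap (direct) memory"
    return 1
  fi
  return 0
}
```

For example, with an 8GB heap, something like
check_offheap "$(pgrep -f HRegionServer)" 8192
would check the RegionServer, assuming exactly one such process runs on
the host.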

In our cluster we apply it to all server processes (NN, DN, HM, RS, JT, TT,
ZooKeeper).
The long pauses disappeared after we set this flag (esp. for DN and RS).
--
Regards,
Laxman