Mailing-List: contact common-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-user@hadoop.apache.org
Received-SPF: neutral (athena.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: <F404D2F1-3510-457F-8CF1-03167465D802@101tec.com>
References: <11DE171E-F448-45E4-9CC0-3901ECA0BECA@101tec.com>
 <AANLkTikTo2LtHsiBWhi5XBfD4s3oqwfKgJQouoGXgyDx@mail.gmail.com>
 <F404D2F1-3510-457F-8CF1-03167465D802@101tec.com>
From: Ted Dunning <tdunning@maprtech.com>
Date: Tue, 28 Dec 2010 15:01:31 -0800
Message-ID: <AANLkTinopVMhb=UCHgWT=jB4nepQ-3nxFvvGsWOQv7Wq@mail.gmail.com>
Subject: Re: Hadoop RPC call response post processing
To: common-user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=90e6ba53a9f42acd2c0498807097

--90e6ba53a9f42acd2c0498807097
Content-Type: text/plain; charset=ISO-8859-1

Knowing the tenuring distribution will tell you a lot about that exact issue.
Collecting ephemeral objects costs, on average, less than one instruction per
allocation, and the allocation itself is generally only a single instruction.
For ephemeral garbage, it is extremely unlikely that you can beat that.

So the real question is whether you are actually creating so much garbage
that you are overwhelming the collector, or whether the data is much longer
lived than it should be. *That* can cause lots of collection costs.

To tell how long data lives, you need to get the tenuring distribution:

-XX:+PrintTenuringDistribution prints details about the tenuring
distribution to standard output. It can be used to show the tenuring
threshold and the ages of objects in the new generation. It is also useful
for observing the lifetime distribution of an application.
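
To make the two cases concrete, here is a minimal sketch of a program that
produces both kinds of data: buffers that die immediately and buffers that
are retained. The class name TenuringDemo, its method names, and the buffer
and iteration counts are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Hadoop or of the original thread.
final class TenuringDemo {

    // Allocates buffers that become garbage on every iteration (ephemeral data).
    static long ephemeral(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] buf = new byte[1024];
            buf[0] = 1;
            checksum += buf[0]; // read the buffer so the allocation is actually used
        }
        return checksum;
    }

    // Keeps a reference to every buffer, so the data outlives young collections.
    static List<byte[]> retained(int count) {
        List<byte[]> keep = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            keep.add(new byte[1024]);
        }
        return keep;
    }

    public static void main(String[] args) {
        // Create the long-lived data first, then churn through short-lived
        // buffers so that young collections run while the retained data ages.
        List<byte[]> longLived = retained(50_000);
        long shortLived = ephemeral(2_000_000);
        System.out.println(shortLived + " short-lived buffers, "
                + longLived.size() + " retained");
    }
}
```

Running it as `java -XX:+PrintTenuringDistribution TenuringDemo` should print
an age table at each young collection on JVMs that accept the flag. Newer
JVMs that use unified logging are expected to want something like
-Xlog:gc+age=trace instead; treat that as an assumption and check it against
your JVM's documentation.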

On Tue, Dec 28, 2010 at 11:59 AM, Stefan Groschupf <sg@101tec.com> wrote:

> I don't think the problem is allocation but garbage collection.
>

--90e6ba53a9f42acd2c0498807097--