From: Viswanathan J <jayamviswanathan@gmail.com>
To: cdh-user@cloudera.org, antwnis@gmail.com
Cc: user@hadoop.apache.org
Date: Mon, 14 Oct 2013 14:22:41 +0530
Subject: Re: Hadoop Jobtracker heap size calculation and OOME

Hi guys,

Appreciate your response.

Thanks,
Viswa.J

On Oct 12, 2013 11:29 PM, "Viswanathan J" <jayamviswanathan@gmail.com> wrote:

Hi Guys,

But I can see that the JobTracker OOME issue is fixed in Hadoop 1.2.1, as per the Hadoop release notes below.

Please check this URL,

https://issues.apache.org/jira/browse/MAPREDUCE-5351

How come the issue still persists? Am I asking a valid question?

Do I need to configure anything, or am I missing something?

Please help. Appreciate your response.

Thanks,
Viswa.J

On Oct 12, 2013 7:57 PM, "Viswanathan J" <jayamviswanathan@gmail.com> wrote:

Thanks Antonio, I hope the memory leak issue will be resolved. It's really a nightmare every week.

In which release will this issue be resolved?

How can we solve this issue? Please help, because we are facing it in our production environment.

Please share the configuration and the cron job for that cleanup process.

Thanks,
Viswa

On Oct 12, 2013 7:31 PM, "Antonios Chalkiopoulos" <antwnis@gmail.com> wrote:
"After restart the JT, within a week getting OOME."

Viswa, we were having the same issue in our cluster as well - getting an OOME roughly every 5-7 days.
The heap size of the JobTracker was constantly increasing due to a memory leak that will hopefully be fixed in newer releases.

There is a configuration change in the JobTracker that disables the functionality that cleans up staging files, i.e. /user/build/.staging/* - but that means you will have to handle the staging files yourself through a cron / Jenkins task.

I'll get you the configuration on Monday.
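
The cron / Jenkins cleanup mentioned above could be sketched roughly as follows. This is only an illustration, not the configuration promised in the message: the helper names, the 7-day threshold and the job directory names are hypothetical, and the listing format is assumed to be the usual `hadoop fs -ls` column layout (permissions, replication, owner, group, size, date, time, path). The sketch only builds `hadoop fs -rmr` commands and does not run them; the age threshold should be longer than the longest-running job so that staging files of active jobs are never removed.

```python
from datetime import datetime, timedelta


def stale_staging_paths(ls_lines, now, max_age_days=7):
    """Return staging paths whose modification time is older than the cutoff.

    ls_lines is assumed to be the text output of `hadoop fs -ls` on the
    staging directory, one entry per line.
    """
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for line in ls_lines:
        fields = line.split()
        # Skip the "Found N items" header and anything outside .staging
        if len(fields) < 8 or "/.staging/" not in fields[-1]:
            continue
        modified = datetime.strptime(f"{fields[5]} {fields[6]}", "%Y-%m-%d %H:%M")
        if modified < cutoff:
            stale.append(fields[-1])
    return stale


def removal_commands(paths):
    """Build (but do not execute) one recursive delete command per path."""
    return [["hadoop", "fs", "-rmr", path] for path in paths]
```

A scheduled task would feed the current listing into `stale_staging_paths`, review the result, and only then execute the generated commands.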

On Friday, 11 October 2013 18:08:55 UTC+1, Viswanathan J wrote:
Hi,

I'm running a 14-node Hadoop cluster with DataNodes and TaskTrackers running on all nodes.

Apache Hadoop : 1.2.1

It shows the heap size currently as follows:

Cluster Summary (Heap Size is 5.7/8.89 GB)

In the above summary, what does the 8.89 GB represent? Is 8.89 the maximum heap size for the JobTracker? If yes, how is it calculated?

I assume 5.7 is the heap size used by the currently running jobs; how is it calculated?
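
One plausible reading of the 5.7/8.89 GB summary, offered as an assumption rather than something confirmed in this thread, is that both numbers describe the JobTracker's own JVM: the first being the heap the JVM has currently committed and the second the maximum it may grow to, each converted from bytes to GB. Under that assumption the first number is not a per-job figure. The arithmetic can be sketched as below; the function names are hypothetical.

```python
GIB = 1024 ** 3  # bytes per GB, assuming the summary uses binary units


def format_heap_summary(committed_bytes: int, max_bytes: int) -> str:
    """Render two JVM byte counts in the style of the summary line."""
    committed_gb = round(committed_bytes / GIB, 2)
    max_gb = round(max_bytes / GIB, 2)
    return f"Heap Size is {committed_gb:g}/{max_gb:g} GB"


def heapsize_setting_gb(heapsize_mb: int) -> float:
    """Convert a HADOOP_HEAPSIZE value (given in MB) to GB for comparison."""
    return heapsize_mb / 1024
```

If this reading is right, a HADOOP_HEAPSIZE of 1024 corresponds to a 1 GB maximum, which does not match the reported 8.89 GB, so it would be worth checking whether another -Xmx option is being applied to the JobTracker process.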

I have set the JobTracker default memory size in hadoop-env.sh:

HADOOP_HEAPSIZE="1024"

I have set the mapred.child.java.opts value in mapred-site.xml as follows:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx2048m</value>
</property>

Even after setting the above property, we are still getting the JobTracker OOME issue. Why is the JobTracker memory gradually increasing? After restart the JT, within a week getting OOME.
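
As background to the property quoted above: mapred.child.java.opts supplies JVM options to the child processes that run map and reduce tasks, so changing it would not be expected to alter the JobTracker's own heap. A minimal sketch of reading the per-task heap from such a property block is shown below; the helper name is hypothetical and it only handles the simple -Xmx<number>m or -Xmx<number>g forms.

```python
import re
import xml.etree.ElementTree as ET


def child_heap_mb(property_xml: str):
    """Return the -Xmx value (in MB) from a mapred.child.java.opts property.

    Returns None when the block is a different property or has no -Xmx flag.
    """
    prop = ET.fromstring(property_xml)
    if prop.findtext("name") != "mapred.child.java.opts":
        return None
    match = re.search(r"-Xmx(\d+)([mMgG])", prop.findtext("value") or "")
    if match is None:
        return None
    size, unit = int(match.group(1)), match.group(2).lower()
    return size * 1024 if unit == "g" else size
```

For the block quoted in this message the function yields 2048, meaning each task JVM may use up to 2 GB, independent of whatever limit applies to the JobTracker.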

How can we resolve this? It is in production and critical. Please help. Thanks in advance.

--
Regards,
Viswa.J
