From: Junior Mint <junior.minto.0@gmail.com>
To: user@hadoop.apache.org
Date: Mon, 24 Dec 2012 11:39:01 +0800
Subject: Re: How to troubleshoot OutOfMemoryError

What is OOM? Haha.

On Mon, Dec 24, 2012 at 11:30 AM, 周梦想 <ablozhou@gmail.com> wrote:

> I ran into the OOM problem because I hadn't set the ulimit open-files
> limit. It had nothing to do with memory; memory was sufficient.
>
> Best Regards,
> Andy
>
> 2012/12/22 Manoj Babu <manoj444@gmail.com>
>
>> David,
>>
>> I faced the same issue due to excessive logging filling up the task
>> tracker log folder.
>>
>> Cheers!
>> Manoj.
>>
>> On Sat, Dec 22, 2012 at 9:10 PM, Stephen Fritz <stephenf@cloudera.com> wrote:
>>
>>> Troubleshooting OOMs in the map/reduce tasks can be tricky; see page 118
>>> of Hadoop Operations for a couple of settings that can affect the
>>> frequency of OOMs and aren't necessarily intuitive.
>>>
>>> To answer your question about getting the heap dump, you should be able
>>> to add "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/some/path" to
>>> your mapred.child.java.opts, then look for the heap dump in that path the
>>> next time you see the OOM.
>>>
>>> On Fri, Dec 21, 2012 at 11:33 PM, David Parks <davidparks21@yahoo.com> wrote:
>>>
>>>> I'm pretty consistently seeing a few reduce tasks fail with
>>>> OutOfMemoryError (below). It doesn't kill the job, but it slows it down.
>>>>
>>>> In my current case the reducer is pretty darn simple; the algorithm
>>>> basically does:
>>>>
>>>> 1. Do you have 2 values for this key?
>>>> 2. If so, build a JSON string and emit a NullWritable and Text value.
>>>>
>>>> The string buffer I use to build the JSON is re-used, and I can't see
>>>> anywhere in my code that would be taking more than ~50k of memory at
>>>> any point in time.
>>>>
>>>> But I want to verify: is there a way to get the heap dump and all after
>>>> this error? I'm running on AWS MapReduce v1.0.3 of Hadoop.
>>>>
>>>> Error: java.lang.OutOfMemoryError: Java heap space
>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1711)
>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1571)
>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1412)
>>>>     at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1344)
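[Editor's note] Stephen Fritz's suggestion above, applied to a Hadoop 1.x cluster, would look roughly like the fragment below in mapred-site.xml. This is a sketch, not from the thread: the dump directory and the -Xmx value are placeholders you would adjust for your cluster, and the directory must exist and be writable by the task JVM's user on every node.

```xml
<!-- mapred-site.xml (Hadoop 1.x): sketch only; path and heap size are placeholders -->
<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx512m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/heapdumps</value>
</property>
```

On an OOM, the task JVM then writes a java_pid*.hprof file into that directory, which can be opened with jhat or Eclipse MAT.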
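[Editor's note] David's reducer logic ("do you have 2 values for this key? if so, build a JSON string") can be sketched in plain Java as below. This is a hypothetical stand-in, not his code: the class and field names (PairToJson, "a"/"b") are invented for illustration, Hadoop's Reducer/NullWritable/Text types are omitted so the sketch is self-contained, and it assumes the values are plain strings with no characters needing JSON escaping.

```java
import java.util.List;

public class PairToJson {
    // Sketch of the reducer body: emit a JSON record only when a key
    // has exactly two values; otherwise emit nothing for that key.
    static String toJson(String key, List<String> values) {
        if (values.size() != 2) {
            return null; // nothing emitted for this key
        }
        // The thread mentions re-using a string buffer across calls;
        // a local StringBuilder keeps this sketch self-contained.
        StringBuilder sb = new StringBuilder();
        sb.append("{\"key\":\"").append(key).append("\",")
          .append("\"a\":\"").append(values.get(0)).append("\",")
          .append("\"b\":\"").append(values.get(1)).append("\"}");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Key with two values produces a JSON record; key with one does not.
        System.out.println(toJson("k1", List.of("x", "y")));
        System.out.println(toJson("k2", List.of("x")));
    }
}
```

Note that per-record logic this small is consistent with David's point: the OOM in his stack trace occurs in MapOutputCopier.shuffleInMemory, i.e. during the shuffle's in-memory merge, before the reduce function even runs.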