From: Manu Zhang
To: user@hadoop.apache.org, Ravi Prakash
Date: Thu, 24 Oct 2013 08:59:53 +0800
Subject: Re: map container is assigned default memory size rather than user configured which will cause TaskAttempt failure

Thanks Ravi.

I do have a mapred-site.xml under /etc/hadoop/conf/ on those nodes, but it seems odd to me that they would read configuration from it, since it's the client that requests the resources. I have another mapred-site.xml in the directory where I run my job, and I assumed my job would read its configuration from that file. Please correct me if I am mistaken.

Also, it's not always the same nodes. The number of failures is random, too.

Anyway, I will put my settings in mapred-site.xml on all the nodes and see if the problem goes away.
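
To see which value each copy of mapred-site.xml actually carries, something like the following could help. It is only a minimal sketch: the helper name `read_property` and the inline XML snippets are illustrative assumptions (not part of Hadoop), and it relies solely on the usual `<configuration>/<property>/<name>/<value>` layout of Hadoop site files. The 1024 fallback is the default value mentioned in this thread.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, key, default=None):
    """Return the value of `key` from Hadoop-style site XML, or `default`."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == key:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


# Hypothetical job-directory copy, using the value from this thread.
client_conf = """<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2560</value>
  </property>
</configuration>"""

# Hypothetical node-local copy that does not set the property at all.
node_conf = "<configuration></configuration>"

KEY = "mapreduce.map.memory.mb"
client_value = read_property(client_conf, KEY, default="1024")
node_value = read_property(node_conf, KEY, default="1024")
print(client_value, node_value)
```

For real files, the text could be read from the copy in the job directory and from /etc/hadoop/conf/mapred-site.xml on each node, then compared in the same way.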

Manu


On Thu, Oct 24, 2013 at 1:40 AM, Ravi Prakash <ravihoo@ymail.com> wrote:
Manu!

This should not be the case. All tasks should have the configuration values you specified propagated to them. Are you sure your setup is correct? Are they always the same nodes which run with 1024 MB? Perhaps you have mapred-site.xml on those nodes?

HTH
Ravi

On Tuesday, October 22, 2013 9:09 PM, Manu Zhang <owenzhang1990@gmail.com> wrote:
Hi,

I've been running Terasort on Hadoop-2.0.4.

Every time, a small number of map tasks fail (4 or 5, say) because their containers run beyond the virtual memory limit.

I've set mapreduce.map.memory.mb to a safe value (such as 2560 MB), so most TaskAttempts run fine, but the failed maps were given the default value of 1024 MB.

My question is thus: why are a small number of containers assigned the default memory value rather than the user-configured one?

Any thoughts?

Thanks,
Manu Zhang
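
For context on the "running beyond virtual memory limit" failures described above: as far as I know, the NodeManager caps a container's virtual memory at its memory allocation multiplied by `yarn.nodemanager.vmem-pmem-ratio`. The sketch below assumes a ratio of 2.1, which I believe is the yarn-default.xml default but which should be verified for Hadoop 2.0.4; the function name is hypothetical.

```python
# Assumed default of yarn.nodemanager.vmem-pmem-ratio; verify for your release.
ASSUMED_VMEM_PMEM_RATIO = 2.1


def vmem_limit_mb(container_mb, ratio=ASSUMED_VMEM_PMEM_RATIO):
    """Virtual memory ceiling (MB) for a container of `container_mb` MB."""
    return container_mb * ratio


# Compare the default container size with the user-configured one.
for mb in (1024, 2560):
    print(f"{mb} MB container -> {vmem_limit_mb(mb):.1f} MB virtual memory limit")
```

Under that assumption, a map that lands in a 1024 MB container has well under half the virtual memory headroom of one running in a 2560 MB container, which would be consistent with only the default-sized containers being killed.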



