Subject: Re: map container is assigned default memory size rather than user configured which will cause TaskAttempt failure
From: Manu Zhang
To: user@hadoop.apache.org
Date: Fri, 25 Oct 2013 08:15:12 +0800

My mapreduce.map.java.opts is 1024MB

Thanks,
Manu

On Thu, Oct 24, 2013 at 3:11 PM, Tsuyoshi OZAWA wrote:
> Hi,
>
> How about checking the value of mapreduce.map.java.opts? Are your JVMs
> launched with the assumed heap memory?
>
> On Thu, Oct 24, 2013 at 11:31 AM, Manu Zhang wrote:
> > Just confirmed the problem still exists even though the "mapred-site.xml"s
> > on all nodes have the same configuration (mapreduce.map.memory.mb = 2560).
> >
> > Any more thoughts?
> >
> > Thanks,
> > Manu
> >
> > On Thu, Oct 24, 2013 at 8:59 AM, Manu Zhang wrote:
> >> Thanks, Ravi.
> >>
> >> I do have mapred-site.xml under /etc/hadoop/conf/ on those nodes, but it
> >> seems odd that they would read configuration from those mapred-site.xml
> >> files, since it's the client that applies for the resources. I have
> >> another mapred-site.xml in the directory where I run my job, and I assume
> >> my job reads its conf from that one. Please correct me if I am mistaken.
> >>
> >> Also, it's not always the same nodes. The number of failures is random,
> >> too.
> >>
> >> Anyway, I will put my settings in all the nodes' mapred-site.xml and see
> >> if the problem goes away.
> >>
> >> Manu
> >>
> >> On Thu, Oct 24, 2013 at 1:40 AM, Ravi Prakash wrote:
> >>> Manu!
> >>>
> >>> This should not be the case. All tasks should have the configuration
> >>> values you specified propagated to them. Are you sure your setup is
> >>> correct? Are they always the same nodes which run with 1024MB? Perhaps
> >>> you have mapred-site.xml on those nodes?
> >>>
> >>> HTH,
> >>> Ravi
> >>>
> >>> On Tuesday, October 22, 2013 9:09 PM, Manu Zhang wrote:
> >>> Hi,
> >>>
> >>> I've been running TeraSort on Hadoop 2.0.4.
> >>>
> >>> Every time, a small number of maps (4 or 5) fail because the container
> >>> runs beyond its virtual memory limit.
> >>>
> >>> I've set mapreduce.map.memory.mb to a safe value (2560MB), so most
> >>> TaskAttempts go fine, while the failed maps run with the default
> >>> 1024MB.
> >>>
> >>> My question is thus: why are a small number of containers' memory
> >>> limits set to the default rather than the user-configured value?
> >>>
> >>> Any thoughts?
> >>>
> >>> Thanks,
> >>> Manu Zhang
>
> --
> - Tsuyoshi
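
[Archive note] For readers hitting the same symptom, a minimal sketch of the mapred-site.xml entries under discussion. The 2560 value comes from this thread; the -Xmx heap size is an illustrative assumption and must stay below the container limit, or YARN may kill the container for exceeding memory:

```xml
<!-- mapred-site.xml: illustrative fragment, not taken verbatim from the thread -->
<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2560</value> <!-- container size requested for each map task -->
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx2048m</value> <!-- JVM heap; kept below the 2560MB container limit -->
  </property>
</configuration>
```

Note that mapreduce.map.java.opts is a JVM option string, not a plain megabyte value, which may explain confusion about "1024MB" above.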
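
[Archive note] Since the debugging step suggested in this thread is verifying which value of mapreduce.map.memory.mb each node's mapred-site.xml actually carries, here is a small sketch of how one might extract a property from a Hadoop-style configuration file so the values can be compared across nodes. File layout follows Hadoop's standard `<configuration>/<property>/<name>/<value>` schema; the sample content is illustrative:

```python
# Sketch: pull one property out of a Hadoop-style *-site.xml so values
# can be diffed across cluster nodes (sample XML is illustrative).
import xml.etree.ElementTree as ET

def get_hadoop_prop(xml_text, name):
    """Return the <value> for the given <name>, or None if unset."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None  # unset: Hadoop falls back to its compiled-in default

sample = """<configuration>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2560</value>
  </property>
</configuration>"""

print(get_hadoop_prop(sample, "mapreduce.map.memory.mb"))  # 2560
```

In practice one would run this over /etc/hadoop/conf/mapred-site.xml on every node (e.g. via ssh) and flag any node whose value differs from the expected 2560.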