From: 姚吉龙 <geelongyao@gmail.com>
To: user@hadoop.apache.org
Date: Mon, 22 Apr 2013 17:28:42 -0700 (PDT)
Subject: Re: common error in map tasks

Mainly it is caused by the child JVM options (mapred.child.java.opts) and the number of map tasks.
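
A minimal sketch of the knobs involved, assuming a Hadoop 1.x (MR1) setup; the heap size and slot count below are placeholders to tune for your nodes, not recommended values:

    import org.apache.hadoop.mapred.JobConf;

    public class ChildOptsSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Per-job: JVM flags (notably heap) for each spawned child task JVM.
            conf.set("mapred.child.java.opts", "-Xmx512m");
            // Cluster-side (mapred-site.xml on each TaskTracker), listed here
            // only as a reminder: how many map tasks may run at once per node.
            //   mapred.tasktracker.map.tasks.maximum = 2
        }
    }

If the number of concurrent child JVMs times the child heap exceeds what a node actually has, child JVMs can fail to launch or die early.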

—
Sent from Mailbox for iPhone


On Tue, Apr 23, 2013 at 6:15 AM, kaveh minooie <kaveh@plutoz.com> wrote:

Thanks Chris. I only run Nutch, so no to the external command. And I
just checked: it happens, or has happened, on all the nodes at some
point. I have to say though that it doesn't cause the job to fail or
anything; the map tasks that fail finish when they are re-spawned.
It is just annoying, and it makes me think that some value somewhere
in the config files is either not correct or not optimal.


On 04/22/2013 02:49 PM, Chris Nauroth wrote:
> I'm not aware of any Hadoop-specific meaning for exit code 126.
> Typically, this is a standard Unix exit code used to indicate that a
> command couldn't be executed. Some reasons for this might be that the
> command is not an executable file, or the command is an executable file
> but the user doesn't have execute permissions. (See below for an
> example of each of these.)
>
> Does your job code attempt to exec an external command? Also, are the
> task failures consistently happening on the same set of nodes in your
> cluster? If so, then I recommend checking that the command has been
> deployed and has the correct permissions on those nodes.
>
> Even if your code doesn't exec an external command, various parts of the
> Hadoop code do this internally, so you still might have a case of a
> misconfigured node.
>
> Hope this helps,
> --Chris
>
> [chris@Chriss-MacBook-Pro:ttys000] hadoop-common
> > ./BUILDING.txt
> -bash: ./BUILDING.txt: Permission denied
> [chris@Chriss-MacBook-Pro:ttys000] hadoop-common
> > echo $?
> 126
>
> [chris@Chriss-MacBook-Pro:ttys000] test
> > ls -lrt exec
> -rwx------ 1 root staff 0B Apr 22 14:43 exec*
> [chris@Chriss-MacBook-Pro:ttys000] test
> > whoami
> chris
> [chris@Chriss-MacBook-Pro:ttys000] test
> > ./exec
> bash: ./exec: Permission denied
> [chris@Chriss-MacBook-Pro:ttys000] test
> > echo $?
> 126
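
For reference, a minimal JDK-only sketch (hypothetical, not from the thread) that reproduces the same status 126 from Java, the way a task would see it if it shelled out to a non-executable command:

    import java.io.File;

    public class Exit126Demo {
        public static void main(String[] args) throws Exception {
            // An empty temp file with no execute bit, run through a shell:
            // the shell cannot execute it and reports status 126.
            File script = File.createTempFile("no-exec", ".sh");
            script.setExecutable(false);
            Process p = new ProcessBuilder("bash", "-c", script.getAbsolutePath())
                    .redirectErrorStream(true)
                    .start();
            System.out.println("exit status: " + p.waitFor()); // prints 126
        }
    }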
>
>
>
> On Mon, Apr 22, 2013 at 2:09 PM, kaveh minooie <kaveh@plutoz.com> wrote:
>
> Thanks. That is the issue: there are no other log files. When I go to
> the attempt directory of that failed map task (e.g.
> userlogs/job_201304191712_0015/attempt_201304191712_0015_m_000019_0)
> it is empty; there is no other log file. Though based on the counter
> value, I can say that it happens right at the beginning of the map
> task (the counter is only 1).
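
Given that the attempt directory is empty, the child JVM probably never started, so one thing worth checking on the affected nodes is whether every mapred.local.dir entry is actually usable by the TaskTracker user. A rough, hypothetical check (the default path below is only a placeholder):

    import java.io.File;

    public class LocalDirCheck {
        public static void main(String[] args) {
            // Pass the comma-separated value of mapred.local.dir as args[0].
            String localDirs = args.length > 0 ? args[0]
                    : "/tmp/hadoop/mapred/local";
            for (String dir : localDirs.split(",")) {
                File d = new File(dir.trim());
                System.out.printf("%s exists=%b canWrite=%b canExecute=%b%n",
                        d, d.exists(), d.canWrite(), d.canExecute());
            }
        }
    }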
>
>
>
>
> On 04/22/2013 02:12 AM, 姚吉龙 wrote:
>
> Hi
>
>
> I have the same problem before
> I think this is caused by a memory shortage for the map task.
> It is just a suggestion; you can post your log.
>
>
> BRs
> Geelong
> —
> Sent from Mailbox <https://bit.ly/SZvoJe> for iPhone
>
>
>
> On Mon, Apr 22, 2013 at 4:34 PM, kaveh minooie <kaveh@plutoz.com> wrote:
>
> Hi
>
> Regardless of what job I run, there are always a few map
> tasks that fail with the following, very unhelpful, message
> (that is the entire error message):
>
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 126.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
>
> I would appreciate it if someone could show me how I could
> figure out why this error keeps happening.
>
> thanks,
>
>
>
> --
> Kaveh Minooie
>
>

--
Kaveh Minooie

