From: 姚吉龙 <geelongyao@gmail.com>
To: user@hadoop.apache.org
Date: Mon, 22 Apr 2013 17:28:42 -0700 (PDT)
Subject: Re: common error in map tasks

Mainly it is caused by the child JVM options (mapred.child.java.opts) and the number of map tasks.
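
A minimal sketch of the knobs involved, assuming a Hadoop 1.x (MR1) setup; the heap size and slot count below are placeholders to tune for your nodes, not recommended values:

    import org.apache.hadoop.mapred.JobConf;

    public class ChildOptsSketch {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            // Per-job: JVM flags (notably heap) for each spawned child task JVM.
            conf.set("mapred.child.java.opts", "-Xmx512m");
            // Cluster-side (mapred-site.xml on each TaskTracker), listed here
            // only as a reminder: how many map tasks may run at once per node.
            //   mapred.tasktracker.map.tasks.maximum = 2
        }
    }

If the number of concurrent child JVMs times the child heap exceeds what a node actually has, child JVMs can fail to launch or die early.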

—
Sent from Mailbox for iPhone


On Tue, Apr 23, 2013 at 6:15 AM, kaveh minooie <kaveh@plutoz.com> wrote:

Thanks Chris. I only run Nutch, so no to the external command. And I
just checked: it happens, or has happened, on all the nodes at some
point. I have to say though that it doesn't cause the job to fail or
anything; the map tasks that fail finish when they are re-spawned.
It is just annoying, and it makes me think that some value somewhere
in the config files is either not correct or not optimal.


On 04/22/2013 02:49 PM, Chris Nauroth wrote:
> I'm not aware of any Hadoop-specific meaning for exit code 126.
> Typically, this is a standard Unix exit code used to indicate that a
> command couldn't be executed. Some reasons for this might be that the
> command is not an executable file, or the command is an executable file
> but the user doesn't have execute permissions. (See below for an
> example of each of these.)
>
> Does your job code attempt to exec an external command? Also, are the
> task failures consistently happening on the same set of nodes in your
> cluster? If so, then I recommend checking that the command has been
> deployed and has the correct permissions on those nodes.
>
> Even if your code doesn't exec an external command, various parts of the
> Hadoop code do this internally, so you still might have a case of a
> misconfigured node.
>
> Hope this helps,
> --Chris
>
> [chris@Chriss-MacBook-Pro:ttys000] hadoop-common
> > ./BUILDING.txt
> -bash: ./BUILDING.txt: Permission denied
> [chris@Chriss-MacBook-Pro:ttys000] hadoop-common
> > echo $?
> 126
>
> [chris@Chriss-MacBook-Pro:ttys000] test
> > ls -lrt exec
> -rwx------ 1 root staff 0B Apr 22 14:43 exec*
> [chris@Chriss-MacBook-Pro:ttys000] test
> > whoami
> chris
> [chris@Chriss-MacBook-Pro:ttys000] test
> > ./exec
> bash: ./exec: Permission denied
> [chris@Chriss-MacBook-Pro:ttys000] test
> > echo $?
> 126
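
For reference, a minimal JDK-only sketch (hypothetical, not from the thread) that reproduces the same status 126 from Java, the way a task would see it if it shelled out to a non-executable command:

    import java.io.File;

    public class Exit126Demo {
        public static void main(String[] args) throws Exception {
            // An empty temp file with no execute bit, run through a shell:
            // the shell cannot execute it and reports status 126.
            File script = File.createTempFile("no-exec", ".sh");
            script.setExecutable(false);
            Process p = new ProcessBuilder("bash", "-c", script.getAbsolutePath())
                    .redirectErrorStream(true)
                    .start();
            System.out.println("exit status: " + p.waitFor()); // prints 126
        }
    }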
>
>
>
> On Mon, Apr 22, 2013 at 2:09 PM, kaveh minooie <kaveh@plutoz.com> wrote:
>
> Thanks. That is the issue: there are no other log files. When I go to
> the attempt directory of that failed map task (e.g.
> userlogs/job_201304191712_0015/attempt_201304191712_0015_m_000019_0)
> it is empty; there is no other log file. Though based on the counter
> value, I can say that it happens right at the beginning of the map
> task (the counter is only 1).
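
Given that the attempt directory is empty, the child JVM probably never started, so one thing worth checking on the affected nodes is whether every mapred.local.dir entry is actually usable by the TaskTracker user. A rough, hypothetical check (the default path below is only a placeholder):

    import java.io.File;

    public class LocalDirCheck {
        public static void main(String[] args) {
            // Pass the comma-separated value of mapred.local.dir as args[0].
            String localDirs = args.length > 0 ? args[0]
                    : "/tmp/hadoop/mapred/local";
            for (String dir : localDirs.split(",")) {
                File d = new File(dir.trim());
                System.out.printf("%s exists=%b canWrite=%b canExecute=%b%n",
                        d, d.exists(), d.canWrite(), d.canExecute());
            }
        }
    }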
>
>
>
>
> On 04/22/2013 02:12 AM, 姚吉龙 wrote:
>
> Hi
>
>
> I have the same problem before
> I think this is caused by a memory shortage for the map task.
> It is just a suggestion; you can post your log.
>
>
> BRs
> Geelong
> —
> Sent from Mailbox <https://bit.ly/SZvoJe> for iPhone
>
>
>
> On Mon, Apr 22, 2013 at 4:34 PM, kaveh minooie <kaveh@plutoz.com> wrote:
>
> Hi
>
> Regardless of what job I run, there are always a few map
> tasks that fail with the following, very unhelpful, message
> (that is the entire error message):
>
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
> Caused by: java.io.IOException: Task process exit with nonzero status of 126.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:258)
>
>
> I would appreciate it if someone could show me how I could
> figure out why this error keeps happening.
>
> thanks,
>
>
>
> --
> Kaveh Minooie
>
>

--
Kaveh Minooie

