Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: neutral (nike.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: 
 <CADVHTB_ED0kQhDnqsyuHHZupB6-hmv_5V9v3hoU4k4fev91xqg@mail.gmail.com>
References: 
 <CADVHTB_ED0kQhDnqsyuHHZupB6-hmv_5V9v3hoU4k4fev91xqg@mail.gmail.com>
Date: Thu, 13 Sep 2012 10:39:27 -0700
Message-ID: 
 <CA+R1ZMMcDM_1L9e8r2NU7vLoLbogTo_pBLUYQ2VXgj1N3kz3Vw@mail.gmail.com>
Subject: Re: Hadoop failing jobs non zero exit status 7
From: Aaron Eng <aeng@maprtech.com>
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=20cf307f31e2f6c0f904c998c941

--20cf307f31e2f6c0f904c998c941
Content-Type: text/plain; charset=ISO-8859-1

Hi Robin,

"Task process exit with nonzero status of 7." is printed by the TaskTracker
to indicate that the child JVM spawned to run the task attempt in question
exited unexpectedly. It also means the task was not killed administratively
(either by the TaskTracker or by you, the admin).  So basically, the
TaskTracker tried to launch a JVM and it exited.

You didn't post all the details for the attempt from the TaskTracker log, so
it's hard to say exactly when or how this happened.  I'm not familiar with
exit code 7 being returned by a JVM, but it would have been generated by the
JVM process itself, not by any user code you tried to run in the attempt.
It could be that the JVM has some internal issue or bug; what Java version
are you using?  Or it could be that the JVM needs something from the
environment that is not available or not permitted in the context in which
it is being executed.  For instance, you could have some limit in place in
the execution environment of the TaskTracker which is being hit.
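
As a rough illustration of checking for such a limit, here is a small Python
sketch that prints a few resource limits for whatever process runs it.  The
choice of limits is my assumption about what commonly trips up a JVM; nothing
in the logs confirms that any of them is the cause.  It only tells you
something if you run it as the same user, and from the same kind of session,
that starts the TaskTracker.

```python
import resource

# Limits that commonly stop a JVM from starting. Which one (if any) matters
# here is an assumption, not something the logs in this thread confirm.
LIMITS = {
    "open files (nofile)": resource.RLIMIT_NOFILE,
    "processes/threads (nproc)": resource.RLIMIT_NPROC,
    "address space (as)": resource.RLIMIT_AS,
    "stack size (stack)": resource.RLIMIT_STACK,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for the process running this script."""
    return {label: resource.getrlimit(res) for label, res in LIMITS.items()}


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label:28s} soft={fmt(soft):>12s} hard={fmt(hard):>12s}")
```

On Linux you can also read /proc/<pid>/limits for the running TaskTracker
process, which shows the limits it actually inherited rather than those of
your login shell.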

If nothing else, you can note down the exact command with which the JVM is
being spawned and try to spawn it manually.  If the failure is immediately
reproducible, knowing whether it comes up when you spawn the JVM directly
from the shell vs. via the TaskTracker is a useful bit of info.
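
A minimal sketch of that manual check in Python, in case it helps.  The
helper name run_and_report and the STAND_IN command are made up for
illustration: STAND_IN is just a Python one-liner that exits with status 7,
not the real child JVM command line, which you would need to capture from
your own node and substitute.

```python
import subprocess
import sys

# Hypothetical stand-in for the real child JVM command line: a one-liner
# that exits with status 7. Replace it with the command captured on the node.
STAND_IN = [sys.executable, "-c", "import sys; sys.exit(7)"]


def run_and_report(cmd, env=None):
    """Run cmd to completion and return (exit_status, stderr_text).

    Pass env to imitate the TaskTracker's environment; None inherits
    the environment of the calling shell.
    """
    proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    status, err = run_and_report(STAND_IN)
    print("exit status:", status)
    if err:
        print("stderr:", err)
```

If the real command also exits with 7 when run this way, the problem is
reproducible outside the TaskTracker and whatever the JVM writes to stderr is
the next thing to read.  If it runs cleanly, the difference between your
shell's environment and the TaskTracker's is the place to look.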

If you can't identify the cause, feel free to post in answers.mapr.com or
send an email to support@mapr.com for some more assistance.

Best Regards,
Aaron Eng

On Thu, Sep 13, 2012 at 5:38 AM, Robin Verlangen <robin@us2.nl> wrote:

> Hi there,
>
> Today we started deploying MapR M3 into production. However, we're having
> problems completing jobs. During a typical job, the job returns this:
>
> 12/09/11 16:33:20 INFO mapred.JobClient: Task Id : attempt_201209111629_0002_r_000001_2, Status : FAILED on node cl004.flxviz.com
> java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254)
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stdout
> 12/09/11 16:33:20 WARN mapred.JobClient: Error reading task output http://cl004.flxviz.com:50060/tasklog?plaintext=true&attemptid=attempt_201209111629_0002_r_000001_2&filter=stderr
>
> When I get the logs of the tasktracker, I see things like:
>
> 2012-09-11 16:32:43,204 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_1: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl004.flxviz.com:localhost/127.0.0.1:53126
> 2012-09-11 16:32:46,234 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_r_000002_1'
> 2012-09-11 16:32:46,512 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_m_000011_2' to tip task_201209111629_0002_m_000011, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:48,027 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_m_000011_2: java.lang.Throwable: Child Error
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:267)
> Caused by: java.io.IOException: Task process exit with nonzero status of 7.
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:254) on tasktracker tracker_cl003.flxviz.com:localhost/127.0.0.1:42339
> 2012-09-11 16:32:51,055 INFO org.apache.hadoop.mapred.JobTracker: Adding task (JOB_SETUP) 'attempt_201209111629_0002_r_000002_2' to tip task_201209111629_0002_r_000002, for tracker 'tracker_cl003.flxviz.com:localhost/127.0.0.1:42339'
> 2012-09-11 16:32:51,056 INFO org.apache.hadoop.mapred.JobTracker: Removing task 'attempt_201209111629_0002_m_000011_2'
> 2012-09-11 16:32:51,359 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201209111629_0002_r_000002_2: java.lang.Throwable: Child Error
>
> Does anyone have a clue where to start? It doesn't seem to be a
> MapR-specific problem, which is why I'm posting this to the Hadoop
> mailing list.
>
> Some additional information:
> OS: CentOS 6.3 x64
> 16 GB RAM
> 2x quad-core processor
> 12x 1 TB hard drive
>
> Best regards,
>
> Robin Verlangen
> *Software engineer*
>
> W http://www.robinverlangen.nl
> E robin@us2.nl
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
>
>

--20cf307f31e2f6c0f904c998c941--