hadoop-mapreduce-user mailing list archives

From Henning Blohm <henning.bl...@zfabrik.de>
Subject Re: hanging map reduce processes
Date Thu, 27 Oct 2011 15:44:14 GMT
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hi Harsh,<br>
    <br>
    Here's the simplest example I could come up with. Add<br>
    <br>
    &nbsp;&nbsp;&nbsp; protected void setup(Context context) throws IOException, InterruptedException {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // start a non-daemon thread that never terminates<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Thread t = new Thread(new Runnable() {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; public void run() {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; while (true) {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; try {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Thread.sleep(1000);<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; } catch (InterruptedException e) {<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; // log and keep looping: the thread deliberately never ends<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; e.printStackTrace();<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; });<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; t.setDaemon(false); // explicitly non-daemon<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; t.start();<br>
    &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; System.err.println("Started thread in reduce setup");<br>
    &nbsp;&nbsp;&nbsp; }<br>
    <br>
    <br>
    to the Reduce inner class in the WordCount sample (source code
    attached). <br>
    <br>
    Assuming it's in wordcount.jar and some input files have been
    uploaded for counting (their content does not matter), running<br>
    <br>
    <div class="moz-text-flowed" style="font-family: -moz-fixed;
      font-size: 12px;" lang="x-western">hadoop jar wordcount.jar
      org.myorg.WordCount wordcount/input wordcount/result
      <br>
    </div>
    <br>
    reproducibly gives me a hanging "Child" process. Interestingly,
    this does not happen when the same kind of thread is started in
    Map.setup instead.<br>
    <br>
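    The hang described above is consistent with standard JVM behavior:
    the JVM does not exit on its own while any non-daemon thread is
    still running. The following is a minimal, Hadoop-independent
    sketch of that difference. The class name DaemonThreadSketch and
    the helper newSleeper are made up for illustration and are not part
    of the attached sample.<br>
    <br>

```java
class DaemonThreadSketch {

    /** Creates (but does not start) a sleeper thread similar to the one in setup(). */
    static Thread newSleeper(boolean daemon) {
        Thread t = new Thread(new Runnable() {
            public void run() {
                // Unlike the reproduction above, this loop ends when interrupted.
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        // sleep() cleared the flag; restore it so the loop exits.
                        Thread.currentThread().interrupt();
                    }
                }
            }
        });
        t.setDaemon(daemon);
        return t;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread daemon = newSleeper(true);
        daemon.start();     // a daemon thread does not block JVM exit

        Thread worker = newSleeper(false);
        worker.start();     // a non-daemon thread would keep the JVM alive ...
        worker.interrupt(); // ... unless it is told to stop
        worker.join();

        System.err.println("main done; only the daemon thread is left");
    }
}
```

    <br>
    If the thread in setup() were created as a daemon, or interrupted
    and joined once the task is finished, it would presumably no longer
    keep the Child JVM alive. That does not help with threads started by
    third-party code, of course.<br>
    <br>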
    One more note: in our case, some non-trivial infrastructure is
    started and used in map, combine, and reduce. I believe it could be
    shut down and started again between map and reduce when they run in
    the same JVM. That is expensive, however, and brings no other
    benefit. If there were a way to know that the JVM will definitely
    not be used anymore, that would be a good time to really clean up.
    Unfortunately, shutdown hooks don't work here, as they are not run
    until all non-daemon threads have stopped.<br>
    <br>
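    One way to make such infrastructure stoppable at all is to give its
    worker thread an explicit stop method, which could then be called
    from wherever the end of the task becomes known, for example from
    Reducer.cleanup(Context), at the restart cost mentioned above. The
    following is only a sketch under that assumption; StoppableWorker is
    a hypothetical stand-in for the real infrastructure and does not
    depend on Hadoop.<br>
    <br>

```java
class StoppableWorker {
    private volatile boolean running = true;
    private final Thread thread;

    StoppableWorker() {
        thread = new Thread(new Runnable() {
            public void run() {
                while (running) {
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        // stop() interrupts the sleep; the loop re-checks the flag.
                    }
                }
            }
        });
        thread.setDaemon(false); // non-daemon, as in the reproduction
    }

    void start() {
        thread.start();
    }

    /** Asks the worker to stop; returns true if it ended within timeoutMillis. */
    boolean stop(long timeoutMillis) {
        running = false;
        thread.interrupt();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        StoppableWorker worker = new StoppableWorker();
        worker.start();
        System.err.println("stopped=" + worker.stop(5000));
    }
}
```

    <br>
    This only covers threads whose owner exposes a way to stop them. It
    does not answer the question of when the JVM is really no longer
    needed.<br>
    <br>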
    Thanks,<br>
    &nbsp; Henning<br>
    <br>
    <br>
    On 10/27/2011 01:18 PM, Henning Blohm wrote:
    <blockquote cite="mid:4EA93E06.2070808@zfabrik.de" type="cite">
      <pre wrap="">Hi Harsh,

that would be 0.20.3. Will try to prepare a stripped down sample later today or 
tomorrow.

Thanks,
Henning

On 10/27/2011 12:55 PM, Harsh J wrote:
&gt; Hey Henning,
&gt;
&gt; What version of Hadoop are you running, and can we have a dumbed down
&gt; sample to reproduce?
&gt;
&gt; On Thu, Oct 27, 2011 at 3:28 PM, Henning Blohm <a class="moz-txt-link-rfc2396E" href="mailto:henning.blohm@zfabrik.de">&lt;henning.blohm@zfabrik.de&gt;</a> wrote:
&gt;&gt; Hi,
&gt;&gt;
&gt;&gt; found that several people have run into this issue, but I was not able to
&gt;&gt; find a solution yet.
&gt;&gt;
&gt;&gt; We have reduce tasks that leave a hanging "child" process. The
&gt;&gt; implementation uses a lot of third-party code and leaves Timer threads
&gt;&gt; running (as you can readily see in thread dumps). That is bad style, no
&gt;&gt; doubt, but ultimately we don't really care: when the reduce is done, it's
&gt;&gt; done, and the process should simply be killed rather than hang around
&gt;&gt; and eventually impact the cluster.
&gt;&gt;
&gt;&gt; Is there a way to force killing of child processes, e.g. based on job
&gt;&gt; configuration?
&gt;&gt;
&gt;&gt; Thanks,
&gt;&gt;   Henning
&gt;&gt;
&gt;


</pre>
    </blockquote>
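    <br>
    Regarding the Timer threads mentioned in the original question:
    java.util.Timer has a constructor taking an isDaemon flag, and
    cancel() terminates the timer thread. Where the code creating the
    Timer is under one's own control, either is a possible way to avoid
    blocking JVM exit. A small sketch follows; the class name
    DaemonTimerSketch and the method timerThreadIsDaemon are
    illustrative only.<br>
    <br>

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class DaemonTimerSketch {

    /** Runs one task on a Timer and reports whether the timer thread was a daemon. */
    static boolean timerThreadIsDaemon(boolean daemon) {
        final boolean[] observed = new boolean[1];
        final CountDownLatch done = new CountDownLatch(1);
        Timer timer = new Timer(daemon);
        timer.schedule(new TimerTask() {
            public void run() {
                observed[0] = Thread.currentThread().isDaemon();
                done.countDown();
            }
        }, 0);
        try {
            done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // cancel() ends the timer thread, so even a non-daemon Timer
            // stops blocking JVM exit once it has been cancelled.
            timer.cancel();
        }
        return observed[0];
    }

    public static void main(String[] args) {
        System.err.println("new Timer(true)  daemon=" + timerThreadIsDaemon(true));
        System.err.println("new Timer(false) daemon=" + timerThreadIsDaemon(false));
    }
}
```

    <br>
    Timers created inside third-party libraries are out of reach of
    this approach unless the library exposes its own shutdown call.<br>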
    <br>
    <br>
  </body>
</html>
