hadoop-mapreduce-user mailing list archives

From: David Parks <davidpark...@yahoo.com>
Subject: Re: Unexpected Hadoop behavior: map task re-running after reducer has been running
Date: Mon, 11 Mar 2013 08:30:43 GMT
I should have included the relevant log output in my original message; here it is:


Task attempt_201303080219_0002_r_005846_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005857_0 failed to report status for 7203 seconds. Killing!
Task attempt_201303080219_0002_r_005858_0 failed to report status for 7200 seconds. Killing!
Map output lost, rescheduling: getMapOutput(attempt_201303080219_0002_m_006433_0,6169) failed
:
java.io.IOException: Error Reading IndexFile
        at org.apache.hadoop.mapred.IndexCache.readIndexFileToCache(IndexCache.java:113)
        at org.apache.hadoop.mapred.IndexCache.getIndexInformation(IndexCache.java:66)
        at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:3998)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
        at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
        at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:842)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
        at org.mortbay.jetty.Server.handle(Server.java:326)
        at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
        at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
        at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
        at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
        at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
        at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
Caused by: java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:155)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:73)
        at org.apache.hadoop.mapred.SpillRecord.<init>(SpillRecord.java:54)
        at org.apache.hadoop.mapred.IndexCache.readIndexFileToCache(IndexCache.java:109)
        ... 23 more

Task attempt_201303080219_0002_r_005908_0 failed to report status for 7200 seconds. Killing!
Task attempt_201303080219_0002_r_005962_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005971_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005980_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005983_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005988_0 failed to report status for 7200 seconds. Killing!
attempt_201303080219_0002_r_005988_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_005986_0 failed to report status for 7202 seconds. Killing!
Task attempt_201303080219_0002_r_005992_0 failed to report status for 7202 seconds. Killing!
attempt_201303080219_0002_r_005992_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_005993_0 failed to report status for 7200 seconds. Killing!
Task attempt_201303080219_0002_r_006004_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_005999_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_005999_0: [Fatal Error] :1:1: Premature end of file.
Too many fetch-failures
Task attempt_201303080219_0002_r_006005_0 failed to report status for 7202 seconds. Killing!
attempt_201303080219_0002_r_006005_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_006007_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_006007_0: [Fatal Error] :1:1: Premature end of file.
Task attempt_201303080219_0002_r_006010_0 failed to report status for 7201 seconds. Killing!
Task attempt_201303080219_0002_r_006013_0 failed to report status for 7201 seconds. Killing!
Too many fetch-failures
Too many fetch-failures
Task attempt_201303080219_0002_r_006026_0 failed to report status for 7201 seconds. Killing!
attempt_201303080219_0002_r_006026_0: [Fatal Error] :1:1: Premature end of file.
Too many fetch-failures
Too many fetch-failures
Too many fetch-failures
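
As an aside, the “failed to report status for 7200 seconds” lines above come from the task-timeout mechanism (mapred.task.timeout), which looks to be set to roughly 7200 seconds on this cluster. Below is a minimal sketch, assuming the newer org.apache.hadoop.mapreduce API, of how a reducer doing long per-key work keeps its heartbeat alive; the class name and key/value types are hypothetical.

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer for illustration only; the point is the
// context.progress() call inside the slow loop.
public class LongRunningReducer
        extends Reducer<Text, LongWritable, Text, LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        for (LongWritable value : values) {
            sum += value.get();    // stand-in for slow per-record work
            context.progress();    // heartbeat: resets the task-timeout clock
        }
        context.write(key, new LongWritable(sum));
    }
}

Calling context.progress() from inside long-running work is what keeps the TaskTracker from killing an attempt for failing to report status.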

________________________________
 From: David Parks <davidparks21@yahoo.com>
To: user@hadoop.apache.org 
Sent: Monday, March 11, 2013 3:23 PM
Subject: Unexpected Hadoop behavior: map task re-running after reducer has been running
 

I can’t explain this behavior; can someone help me here:
 
  Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
  map         100.00%       23547         0         1      23546        0   247 / 0
  reduce       62.40%       10000      3738        30       6232        0   336 / 0
 
This job has been running for 48 hours and the reducers are a good way through their processing,
but we’ve hit a snag (I’m not sure what snag, exactly).
 
The map tasks were 100% complete, with none running, but now I see 1 map task running. In a few
minutes that map task will finish, “Running map tasks” will drop back to 0, the number of failed
map tasks will increment by 1, and a short time later the map task will run again. This seems
perpetual: we had 1 failed map task right at the end of the map phase, when the reducers started
running, and we’re now up to 247 failed map tasks. That’s the pattern I’ve been watching for
hours now.
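
For what it’s worth, here is a minimal sketch (assuming Hadoop 1.x property names) of where the
knobs that touch this shuffle/fetch path would sit in a job driver. It is illustrative only, not
a known fix; ShuffleTuningDriver and the specific values are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative driver only; the values below are placeholders, not tuning advice.
public class ShuffleTuningDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Task heartbeat timeout, in milliseconds (the log above implies ~7200 s in effect).
        conf.setLong("mapred.task.timeout", 1800000L);
        // Parallel copier threads each reducer uses to fetch map output.
        conf.setInt("mapred.reduce.parallel.copies", 10);
        // TaskTracker HTTP threads that serve map output to reducers; normally
        // a cluster-wide mapred-site.xml setting, shown here only for reference.
        conf.setInt("tasktracker.http.threads", 60);

        Job job = new Job(conf, "shuffle-tuning-sketch");
        job.setJarByClass(ShuffleTuningDriver.class);
        // Identity map/reduce over text input; paths come from the command line.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

My understanding is that “Too many fetch-failures” makes the JobTracker declare a completed map’s
output lost and reschedule that map, which would explain the rerun loop; the getMapOutput/IndexFile
error in the logs above is the TaskTracker failing to serve that map output to the reducers.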
 
Anyone want to venture some guesses here?
 
Thanks,
David