hadoop-common-dev mailing list archives

From "Yoram Arnon (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-244) very long cleanup after a job fails
Date Tue, 23 May 2006 19:25:29 GMT
very long cleanup after a job fails
-----------------------------------

         Key: HADOOP-244
         URL: http://issues.apache.org/jira/browse/HADOOP-244
     Project: Hadoop
        Type: Bug

  Components: mapred  
    Reporter: Yoram Arnon
 Assigned to: Sameer Paranjpye 


Eight hours after a job failed (it had run for about 14 hours before failing), many task
trackers are still throwing the exceptions below:

060523 121732 Server handler 0 on 50040 caught: java.io.FileNotFoundException: LocalFS
java.io.FileNotFoundException: LocalFS
        at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
        at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
        at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
        at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
        at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040
java.net.SocketTimeoutException: timed out waiting for rpc response
        at org.apache.hadoop.ipc.Client.call(Client.java:305)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
        at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
        at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
060523 121814 task_0006_r_000123_0 0.13023989% reduce > copy > task_0006_m_046105_0@node5:50040
060523 121814 task_0006_r_000123_0 Copying task_0006_m_048815_0 output from node6
060523 121817 SEVERE Can't open map output:/hadoop/mapred/local/task_0006_m_031921_0/part-152.out
java.io.FileNotFoundException: LocalFS
        at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
        at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
        at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
        at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
        at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121817 Unknown child with bad map output: task_0006_m_031921_0. Ignored.
060523 121817 Server handler 1 on 50040 caught: java.io.FileNotFoundException: LocalFS
java.io.FileNotFoundException: LocalFS
        at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
        at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
        at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
        at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
        at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
        at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040
java.net.SocketTimeoutException: timed out waiting for rpc response
        at org.apache.hadoop.ipc.Client.call(Client.java:305)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
        at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
        at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
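
To gauge how widespread the leftover activity is, a log excerpt like the one above can be tallied by failure kind. The following is a minimal sketch, not part of Hadoop: it assumes each log entry starts with a `YYMMDD HHMMSS` timestamp (inferred from the lines above), and the function name `summarize` and the regular expressions are hypothetical, chosen only to match the three message shapes seen in this excerpt.

```python
import re
from collections import Counter

# Assumed entry shape, inferred from the excerpt: "YYMMDD HHMMSS message".
ENTRY = re.compile(r"^(\d{6}) (\d{6}) (.*)$")
# Matches e.g. "task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040".
COPY_FAILED = re.compile(r"^(\S+) copy failed: (\S+) from (\S+?):(\d+)$")
MISSING_PREFIX = "SEVERE Can't open map output:"


def summarize(lines):
    """Tally copy failures per source host, missing map outputs, and handler errors."""
    failed_by_host = Counter()
    missing_outputs = []
    handler_errors = 0
    for line in lines:
        entry = ENTRY.match(line)
        if not entry:
            continue  # stack frames and bare exception lines carry no timestamp
        message = entry.group(3)
        copy = COPY_FAILED.match(message)
        if copy:
            failed_by_host[copy.group(3)] += 1
        elif message.startswith(MISSING_PREFIX):
            missing_outputs.append(message[len(MISSING_PREFIX):])
        elif "caught: java.io.FileNotFoundException" in message:
            handler_errors += 1
    return failed_by_host, missing_outputs, handler_errors
```

Feeding it the task tracker log line by line (for example `summarize(open(path))`, with `path` pointing at whichever log file is being inspected) gives a rough count of how many reduce-side copies are still timing out and against which hosts, alongside the map outputs that no longer exist on disk.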



