hadoop-common-dev mailing list archives

From "Sameer Paranjpye (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-244) very long cleanup after a job fails
Date Wed, 31 May 2006 00:19:31 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-244?page=all ]

Sameer Paranjpye updated HADOOP-244:
------------------------------------

    Fix Version: 0.4
        Version: 0.2

> very long cleanup after a job fails
> -----------------------------------
>
>          Key: HADOOP-244
>          URL: http://issues.apache.org/jira/browse/HADOOP-244
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Versions: 0.2
>     Reporter: Yoram Arnon
>     Assignee: Sameer Paranjpye
>      Fix For: 0.4
>
> Eight hours after a job failed (it executed for about 14 hours prior to failing), many task trackers keep throwing the exceptions below:
> 060523 121732 Server handler 0 on 50040 caught: java.io.FileNotFoundException: LocalFS
> java.io.FileNotFoundException: LocalFS
>         at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
>         at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
>         at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
>         at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
>         at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
> 060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040
> java.net.SocketTimeoutException: timed out waiting for rpc response
>         at org.apache.hadoop.ipc.Client.call(Client.java:305)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
>         at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
>         at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
> 060523 121814 task_0006_r_000123_0 0.13023989% reduce > copy > task_0006_m_046105_0@node5:50040
> 060523 121814 task_0006_r_000123_0 Copying task_0006_m_048815_0 output from node6
> 060523 121817 SEVERE Can't open map output:/hadoop/mapred/local/task_0006_m_031921_0/part-152.out
> java.io.FileNotFoundException: LocalFS
>         at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
>         at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
>         at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
>         at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
>         at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
> 060523 121817 Unknown child with bad map output: task_0006_m_031921_0. Ignored.
> 060523 121817 Server handler 1 on 50040 caught: java.io.FileNotFoundException: LocalFS
> java.io.FileNotFoundException: LocalFS
>         at org.apache.hadoop.fs.LocalFileSystem.openRaw(LocalFileSystem.java:123)
>         at org.apache.hadoop.fs.FSDataInputStream$Checker.<init>(FSDataInputStream.java:46)
>         at org.apache.hadoop.fs.FSDataInputStream.<init>(FSDataInputStream.java:228)
>         at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:157)
>         at org.apache.hadoop.mapred.MapOutputFile.write(MapOutputFile.java:116)
>         at org.apache.hadoop.io.ObjectWritable.writeObject(ObjectWritable.java:151)
>         at org.apache.hadoop.io.ObjectWritable.write(ObjectWritable.java:64)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:230)
> 060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040
> java.net.SocketTimeoutException: timed out waiting for rpc response
>         at org.apache.hadoop.ipc.Client.call(Client.java:305)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:150)
>         at org.apache.hadoop.mapred.$Proxy2.getFile(Unknown Source)
>         at org.apache.hadoop.mapred.ReduceTaskRunner.prepare(ReduceTaskRunner.java:112)
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:67)
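
When triaging a log like the one quoted above, it can help to count how many "copy failed" lines point at each source node. The following is a minimal sketch of that, not part of Hadoop: the class name CopyFailureSummary and the method failuresByNode are hypothetical, and the regular expression assumes the line layout seen in this excerpt (two six-digit timestamp fields, the reduce task id, "copy failed:", the map task id, "from", then host:port).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for log triage; not a Hadoop class.
public class CopyFailureSummary {
    // Assumed line layout, taken from the excerpt above, e.g.
    // "060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040"
    // An optional leading "> " (mail quoting) is tolerated.
    private static final Pattern COPY_FAILED = Pattern.compile(
        "^(?:>\\s*)?\\d{6} \\d{6} (\\S+) copy failed: (\\S+) from (\\S+)\\s*$");

    // Returns the number of failed copies per source node, sorted by node name.
    public static Map<String, Integer> failuresByNode(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = COPY_FAILED.matcher(line);
            if (m.matches()) {
                // group(3) is the "host:port" the reduce task tried to copy from
                counts.merge(m.group(3), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
            "060523 121814 task_0006_r_000123_0 copy failed: task_0006_m_046105_0 from node5:50040",
            "java.net.SocketTimeoutException: timed out waiting for rpc response",
            "060523 121914 task_0006_r_000123_0 copy failed: task_0006_m_048815_0 from node6:50040");
        // Stack-trace and exception lines do not match the pattern and are skipped.
        failuresByNode(sample).forEach((node, n) -> System.out.println(node + " " + n));
    }
}
```

Run against a full task tracker log, a summary like this may show whether the failures cluster on a few nodes or are spread across the cluster; it says nothing about the root cause.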

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

