hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3657) HDFS writes get stuck trying to recoverBlock
Date Tue, 01 Jul 2008 20:50:46 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609703#action_12609703 ]

Raghu Angadi commented on HADOOP-3657:
--------------------------------------

> The "java.io.IOException: Connection reset by peer" is very easy to reproduce. Promote this to a 0.18 blocker.
These messages can be safely ignored; filed HADOOP-3678 to suppress them.

> HDFS writes get stuck trying to recoverBlock
> --------------------------------------------
>
>                 Key: HADOOP-3657
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3657
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Arun C Murthy
>            Assignee: Raghu Angadi
>            Priority: Blocker
>             Fix For: 0.18.0
>
>
> A few reduces got stuck in a sort500 job with the following thread dump:
> {noformat}
> "main" prio=10 tid=0x0805b800 nid=0x1951 waiting for monitor entry [0xf7e6d000..0xf7e6e1f8]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2485)
>   - waiting to lock <0xe905e8f8> (a java.util.LinkedList)
>   - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
>   at org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:155)
>   at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>   - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
>   at org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
>   - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
>   at org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:58)
>   - locked <0xe905e928> (a org.apache.hadoop.dfs.DFSClient$DFSOutputStream)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:39)
>   at java.io.DataOutputStream.writeInt(DataOutputStream.java:181)
>   at org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1014)
>   - locked <0xe90889e8> (a org.apache.hadoop.io.SequenceFile$Writer)
>   at org.apache.hadoop.mapred.SequenceFileOutputFormat$1.write(SequenceFileOutputFormat.java:70)
>   at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:298)
>   at org.apache.hadoop.mapred.lib.IdentityReducer.reduce(IdentityReducer.java:39)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:316)
>   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2157)
> "DataStreamer for file /rw/out/_temporary/_attempt_200806261801_0006_r_000712_0/part-00712 block blk_-3923696991063961587_9628" daemon prio=10 tid=0x08413c00 nid=0x367a in Object.wait() [0xd00e4000..0xd00e4f20]
>    java.lang.Thread.State: WAITING (on object monitor)
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Object.wait(Object.java:485)
>   at org.apache.hadoop.ipc.Client.call(Client.java:701)
>   - locked <0xf167d540> (a org.apache.hadoop.ipc.Client$Call)
>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>   at org.apache.hadoop.dfs.$Proxy2.recoverBlock(Unknown Source)
>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2186)
>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1737)
>   at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1891)
>   - locked <0xe905e8f8> (a java.util.LinkedList)
> {noformat}
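> The dump above shows the pattern behind the hang: the DataStreamer thread holds the packet-queue monitor (the LinkedList at 0xe905e8f8) while blocked in a synchronous recoverBlock RPC, so writeChunk, which needs the same monitor, stays BLOCKED until the RPC returns. A minimal standalone sketch of that pattern (hypothetical names; the lock, sleep, and thread names only stand in for the real DFSClient internals):
> {noformat}
> public class RecoverBlockHang {
>     private static final Object dataQueue = new Object(); // stands in for the LinkedList at 0xe905e8f8
>
>     // Runs the scenario and returns the writer's state observed while the
>     // simulated RPC is still in flight.
>     static Thread.State observeWriterState() throws InterruptedException {
>         Thread streamer = new Thread(() -> {
>             synchronized (dataQueue) {              // DataStreamer takes the queue monitor...
>                 try { Thread.sleep(500); }          // ...then blocks in the recoverBlock "RPC"
>                 catch (InterruptedException ignored) {}
>             }
>         }, "DataStreamer");
>         streamer.start();
>         Thread.sleep(100);                          // let the streamer grab the lock first
>
>         Thread writer = new Thread(() -> {
>             synchronized (dataQueue) { }            // writeChunk needs the same monitor
>         }, "writer");
>         writer.start();
>         Thread.sleep(100);                          // give the writer time to block
>
>         Thread.State state = writer.getState();     // observed mid-"RPC": BLOCKED
>         streamer.join();
>         writer.join();
>         return state;
>     }
>
>     public static void main(String[] args) throws InterruptedException {
>         System.out.println("writer state while RPC in flight: " + observeWriterState());
>     }
> }
> {noformat}
> The fix direction this suggests is to avoid holding the queue monitor across the blocking ipc.Client.call, so application writes can keep queueing packets while block recovery is in progress.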

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

