hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4600) HDFS file append failing in multinode cluster
Date Thu, 14 Mar 2013 02:16:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13601968#comment-13601968 ]

Colin Patrick McCabe commented on HDFS-4600:
--------------------------------------------

Just a guess, but do you have a 2-node cluster with replication set to 3? (I checked the
XML conf files you attached and didn't see replication set to anything other than the
default of 3.)

Throwing an exception when we can't get up to the full replication factor was a deliberate
design decision, as I understand it, though it has led to some (in my opinion) counter-intuitive
behavior.
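
For a 2-node cluster, the usual workaround is to set dfs.replication to match the number of
datanodes and/or relax the replacement policy named in the exception. A minimal client-side
sketch, assuming a 2-node setup (the class name AppendConfSketch is hypothetical; the two
property keys are the standard HDFS ones):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch of a client-side workaround for a 2-node cluster; the values
// here are assumptions for that setup, not general recommendations.
public class AppendConfSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Match replication to the number of datanodes actually present.
    conf.set("dfs.replication", "2");
    // With NEVER, the client keeps writing on the surviving pipeline
    // instead of failing when no replacement datanode is available.
    conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");
    FileSystem fs = FileSystem.get(conf);
    System.out.println("default replication: " + fs.getDefaultReplication(new Path("/")));
  }
}
{noformat}

Note that NEVER trades durability for convenience: the client will finish the write on fewer
replicas than requested, so it only makes sense on clusters too small to supply a replacement
datanode.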
                
> HDFS file append failing in multinode cluster
> ---------------------------------------------
>
>                 Key: HDFS-4600
>                 URL: https://issues.apache.org/jira/browse/HDFS-4600
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.3-alpha
>            Reporter: Roman Shaposhnik
>            Priority: Blocker
>             Fix For: 2.0.4-alpha
>
>         Attachments: core-site.xml, hdfs-site.xml, X.java
>
>
> NOTE: the following only happens in a fully distributed setup (core-site.xml and hdfs-site.xml are attached)
> Steps to reproduce:
> {noformat}
> $ javac -cp /usr/lib/hadoop/client/\* X.java
> $ echo aaaaa > a.txt
> $ hadoop fs -ls /tmp/a.txt
> ls: `/tmp/a.txt': No such file or directory
> $ HADOOP_CLASSPATH=`pwd` hadoop X /tmp/a.txt
> 13/03/13 16:05:14 WARN hdfs.DFSClient: DataStreamer Exception
> java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> Exception in thread "main" java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> 13/03/13 16:05:14 ERROR hdfs.DFSClient: Failed to close file /tmp/a.txt
> java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[10.10.37.16:50010, 10.80.134.126:50010], original=[10.10.37.16:50010, 10.80.134.126:50010]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:793)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:858)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:964)
> 	at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:470)
> {noformat}
> Given that the file actually does get created:
> {noformat}
> $ hadoop fs -ls /tmp/a.txt
> Found 1 items
> -rw-r--r--   3 root hadoop          6 2013-03-13 16:05 /tmp/a.txt
> {noformat}
> this feels like a regression in APPEND's functionality.
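
The attached X.java is not reproduced in the archive, but a minimal append client along the
following lines would be enough to exercise the failing code path. This is a hypothetical
reconstruction (only the class name comes from the attachment), not the actual file:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical sketch of an append client like the attached X.java.
public class X {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    Path path = new Path(args[0]); // e.g. /tmp/a.txt
    // Create the file on the first run, append on subsequent runs; the
    // append/pipeline-recovery path is where the exception surfaces.
    FSDataOutputStream out = fs.exists(path) ? fs.append(path) : fs.create(path);
    out.writeBytes("aaaaa\n");
    out.close(); // close() rethrows the DataStreamer failure, as in the trace above
    fs.close();
  }
}
{noformat}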

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
