hadoop-hdfs-issues mailing list archives

From "Harsh J (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2569) DN decommissioning quirks
Date Sun, 20 Nov 2011 04:44:51 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13153698#comment-13153698 ]

Harsh J commented on HDFS-2569:

The write error is pretty misleading for this case as well:

11/11/20 10:12:41 WARN hdfs.DFSClient: DataStreamer Exception
java.io.IOException: File /user/harshchouraria/gridmix._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.

One DN is running and none are excluded, yet the write could be replicated to 0 nodes. Perhaps the decommissioning behavior has changed, and we no longer disconnect nodes immediately?

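For anyone triaging similar reports, the numbers in that message can be pulled out and compared mechanically. Below is a minimal Python sketch, assuming the exact message wording quoted above. The helper names and the regex are illustrative only and are not part of HDFS.

```python
import re

# Matches the wording of the DataStreamer error quoted above.
# The pattern and helper names are assumptions for illustration, not HDFS APIs.
PATTERN = re.compile(
    r"could only be replicated\s+to (\d+) nodes instead of "
    r"minReplication \(=(\d+)\)\.\s+There are (\d+) datanode\(s\) running "
    r"and (no|\d+) node\(s\)\s+are excluded"
)


def parse_replication_error(message):
    """Return the counts from the error text, or None if the wording differs."""
    match = PATTERN.search(message)
    if match is None:
        return None
    replicated, min_repl, running, excluded = match.groups()
    return {
        "replicated": int(replicated),
        "min_replication": int(min_repl),
        "running": int(running),
        "excluded": 0 if excluded == "no" else int(excluded),
    }


def is_misleading(counts):
    """True when enough nodes appear usable, yet replication still fell short."""
    usable = counts["running"] - counts["excluded"]
    fell_short = counts["replicated"] < counts["min_replication"]
    return usable >= counts["min_replication"] and fell_short
```

Fed the message above, this reports 1 running, 0 excluded, 0 replicated, which is exactly the combination that makes the error confusing: nothing in the text says the only DN is decommissioned.
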
> DN decommissioning quirks
> -------------------------
>                 Key: HDFS-2569
>                 URL: https://issues.apache.org/jira/browse/HDFS-2569
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.0
>            Reporter: Harsh J
>            Assignee: Harsh J
> Decommissioning a node behaves slightly oddly in 0.23+.
> The steps I did:
> - Start HDFS via {{hdfs namenode}} and {{hdfs datanode}}. 1-node cluster.
> - With zero files/blocks, I add my DN to the excludes file and run {{hdfs dfsadmin -refreshNodes}}.
> - I see the following log in NN tails, which is fine:
> {code}
> 11/11/20 09:28:10 INFO util.HostsFileReader: Setting the includes file to 
> 11/11/20 09:28:10 INFO util.HostsFileReader: Setting the excludes file to build/test/excludes
> 11/11/20 09:28:10 INFO util.HostsFileReader: Refreshing hosts (include/exclude) list
> 11/11/20 09:28:10 INFO util.HostsFileReader: Adding to the list of hosts from build/test/excludes
> {code}
> - However, DN log tail gets no new messages. DN still runs.
> - The dfshealth.jsp page shows this table, which makes no sense -- why is there 1 live and 1 dead?:
> |Live Nodes|1 (Decommissioned: 1)|
> |Dead Nodes|1 (Decommissioned: 0)|
> |Decommissioning Nodes|0|
> - The live nodes page shows this, meaning DN is still up and heartbeating but is decommissioned:
> |Node|Last Contact|Admin State|
> ||0|Decommissioned|
> - The dead nodes page shows this, and the link to the DN is broken because the port is linked as -1. Also, showing 'false' for decommissioned makes no sense when the live nodes page shows that the DN is already decommissioned:
> |Node|Decommissioned|
> ||false|
> Investigating whether this quirk is only observed when the DN has 0 blocks on it in total.
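
The inconsistency in the dfshealth.jsp table above can be stated as a simple check. This is a rough Python sketch, assuming that every registered DN should be counted as either live or dead but never both. The function and parameter names are hypothetical and do not correspond to any NameNode code.

```python
def check_node_report(registered, live, dead, live_decommissioned, dead_decommissioned):
    """Return a list of inconsistencies found in the summary counts.

    Assumption (not taken from HDFS source): each registered DN is
    counted exactly once, as either live or dead.
    """
    problems = []
    if live + dead != registered:
        problems.append(
            f"live ({live}) + dead ({dead}) != registered ({registered})"
        )
    if live_decommissioned > live:
        problems.append("more decommissioned live nodes than live nodes")
    if dead_decommissioned > dead:
        problems.append("more decommissioned dead nodes than dead nodes")
    return problems


# Values from the table in this report, on a 1-node cluster.
for problem in check_node_report(
    registered=1, live=1, dead=1, live_decommissioned=1, dead_decommissioned=0
):
    print(problem)
```

With the reported values, only the first check fires: the single DN is being counted on both the live and the dead side.
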
