hadoop-hdfs-issues mailing list archives

From "Tsz Wo (Nicholas), SZE (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1332) When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
Date Tue, 17 May 2011 00:00:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13034436#comment-13034436 ]

Tsz Wo (Nicholas), SZE commented on HDFS-1332:

- Just found one problem: in the two {{chooseRandom(..)}} methods (lines 333-363 and lines 378-422),
if the first {{FSNamesystem.LOG.isDebugEnabled()}} returns false and the second {{FSNamesystem.LOG.isDebugEnabled()}}
returns true, we will get a {{NullPointerException}} since {{builder}} is null.  We should
also check {{builder}} for null in the second if-statement.
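
To illustrate the failure mode, here is a minimal, self-contained sketch. It is not the patch code: {{DebugBuilderSketch}}, {{describeExclusions}} and the {{debugEnabled}} supplier (a stand-in for {{FSNamesystem.LOG.isDebugEnabled()}}) are hypothetical names, and the message text is only an approximation.

```java
import java.util.function.BooleanSupplier;

public class DebugBuilderSketch {

    // Hypothetical stand-in for chooseRandom(..). The debugEnabled supplier
    // plays the role of FSNamesystem.LOG.isDebugEnabled(), which may return
    // a different value each time it is called.
    static String describeExclusions(BooleanSupplier debugEnabled, String[] excludedNodes) {
        // First check: the builder is allocated only when debug logging is on.
        StringBuilder builder = debugEnabled.getAsBoolean() ? new StringBuilder("[") : null;

        for (String node : excludedNodes) {
            if (builder != null) {
                builder.append(" Node ").append(node).append(" is not chosen");
            }
        }

        // Second check: the log level may have changed since the first check,
        // so also test builder for null before dereferencing it.
        if (debugEnabled.getAsBoolean() && builder != null) {
            return builder.append(" ]").toString();
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulate debug logging being switched on between the two checks.
        boolean[] answers = {false, true};
        int[] call = {0};
        String detail = describeExclusions(() -> answers[call[0]++], new String[] {"n1"});
        System.out.println(detail); // completes without a NullPointerException
    }
}
```

Without the {{builder != null}} test in the second if-statement, the simulated run in {{main}} would dereference a null {{builder}}.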

- "Not able to place enough replicas" is repeated in the log message.  I think we could
create {{new NotEnoughReplicasException(detail)}} and then use {{"\n" + e}} instead of {{"\n"
+ e.getMessage()}} in {{LOG.warn(..)}}.  Then the log would look something like this:
2011-05-16 16:37:04,230 WARN  namenode.FSNamesystem (BlockPlacementPolicyDefault.java:chooseTarget(212))
- Not able to place enough replicas, still in need of 1 to reach 2
NotEnoughReplicasException: [ Node /default-rack/ is not chosen
because the node is (being) decommissioned ]
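
A rough sketch of that suggestion follows. It relies on {{Throwable.toString()}} returning the class name followed by ": " and the message, whereas {{getMessage()}} returns only the message. The exception class here is a simplified stand-in declared in the default package (the real one in HDFS may be declared differently), and {{WarnMessageSketch}}, {{warnLine}} and the node name are hypothetical.

```java
// Simplified stand-in for the real exception class; declared at top level in
// the default package so that its class name has no package prefix.
class NotEnoughReplicasException extends Exception {
    NotEnoughReplicasException(String detail) {
        super(detail); // carries only the detail, not the "Not able to place..." text
    }
}

public class WarnMessageSketch {

    // Builds the text that would be passed to LOG.warn(..). Concatenating the
    // exception itself calls e.toString(), which prefixes the class name.
    static String warnLine(int needed, int target, Exception e) {
        return "Not able to place enough replicas, still in need of "
                + needed + " to reach " + target + "\n" + e;
    }

    public static void main(String[] args) {
        // The node name below is made up for illustration.
        Exception e = new NotEnoughReplicasException(
                "[ Node /default-rack/example-node is not chosen"
                + " because the node is (being) decommissioned ]");
        System.out.println(warnLine(1, 2, e));
    }
}
```

Because the exception message holds only the detail, "Not able to place enough replicas" appears once, and the class name on the second line identifies where the detail came from.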

> When unable to place replicas, BlockPlacementPolicy should log reasons nodes were excluded
> ------------------------------------------------------------------------------------------
>                 Key: HDFS-1332
>                 URL: https://issues.apache.org/jira/browse/HDFS-1332
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Todd Lipcon
>            Assignee: Ted Yu
>            Priority: Minor
>              Labels: newbie
>             Fix For: 0.23.0
>         Attachments: HDFS-1332-concise.patch
> Whenever the block placement policy determines that a node is not a "good target", it
> could add the reason for exclusion to a list, and then when we log "Not able to place enough
> replicas" we could say why each node was refused. This would help new users who are having
> issues on pseudo-distributed setups (e.g. because their data dir is on /tmp and /tmp is full). Right
> now it's very difficult to figure out the issue.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
