hadoop-hdfs-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3645) Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
Date Thu, 12 Jul 2012 18:21:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413034#comment-13413034 ]

Harsh J commented on HDFS-3645:
-------------------------------

Hi Suresh,

Thank you, your question got me thinking some more. I filed this JIRA as a dump of some
thoughts I had while going through the current policy implementation.

Sorry for the lack of clarity. Let me explain the case I imagine may exist with this specific
check:

# node.getXceiverCount() is a total 'socket' count. It includes writes _and_ reads.
# Consider a cluster state such as this when computing the average (it may sound a little
hypothetical, but a near-enough case is possible in some situations): 100 DNs are present.
The average is about 250, but a very few nodes have much higher xceiver counts, at about
600-800. A likely cause of such a state is that these nodes are serving a very hot,
local-block region (a bad HBase case, but quite plausible).
# Now consider that one of these hot DNs is a candidate for a block allocation. We compute
the xceiver average and find it to be 250, then check the node's count, which is 700. Since
700 > 2.0 * 250 = 500, the node is not selected, because we ignore the fact that most of the
"700" are actually reads and not writes. Perhaps it would have been OK to do a write in this
case, if we knew the ratio of reads:writes in addition to count(reads+writes) on the DN?
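
The scenario above can be sketched in Java. This is only an illustration under stated
assumptions: the class and method names (BusyNodeCheck, isBusyByTotal, isBusyByWrites) are
hypothetical and are not part of HDFS, and the 650/50 read/write split and the average write
load of 100 are made-up numbers. Only the "count > 2.0 * avgLoad" comparison is taken from the
issue description.

```java
// Hypothetical sketch; not the actual HDFS block placement code.
class BusyNodeCheck {

    // Mirrors the comparison quoted in the issue description:
    // node.getXceiverCount() > (2.0 * avgLoad)
    static boolean isBusyByTotal(int xceiverCount, double avgLoad) {
        return xceiverCount > 2.0 * avgLoad;
    }

    // Assumed alternative: apply the same threshold to write xceivers only.
    // This presumes the DN reports reads and writes separately, which the
    // aggregated count discussed in this thread does not do.
    static boolean isBusyByWrites(int writeXceivers, double avgWriteLoad) {
        return writeXceivers > 2.0 * avgWriteLoad;
    }

    public static void main(String[] args) {
        double avgLoad = 250.0;      // cluster-wide average from the example
        int hotNodeTotal = 700;      // hot DN: reads + writes
        int hotNodeWrites = 50;      // assumed: most of the 700 are reads
        double avgWriteLoad = 100.0; // assumed average write load

        // 700 > 500, so the aggregated check rejects the node.
        System.out.println("total-based busy: " + isBusyByTotal(hotNodeTotal, avgLoad));
        // 50 > 200 is false, so a write-only check would accept it.
        System.out.println("write-based busy: " + isBusyByWrites(hotNodeWrites, avgWriteLoad));
    }
}
```

With these numbers the aggregated check excludes the node while the write-only check would
not, which is the edge case in question.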

I've not seen any major issues with this way of write selection at all, but it does seem to
expose a certain edge case. Do you think we should account for such a scenario, or let it
be as-is and continue to keep the load count aggregated? If not, let us close this out.
                
> Improve the way we do detection of a busy DN in the cluster, when choosing it for a block write
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3645
>                 URL: https://issues.apache.org/jira/browse/HDFS-3645
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Harsh J
>            Priority: Minor
>
> Right now, I think we do too naive a computation for detecting whether a chosen DN target
> is busy by itself. We currently do {{node.getXceiverCount() > (2.0 * avgLoad)}}.
> We should improve on this computation with a more realistic measure of whether a DN is
> really busy by itself or not (rather than checking against the cluster average, which in
> some cases has a good chance of being the wrong value to compare with).

