hadoop-hdfs-issues mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2576) Namenode should have a favored nodes hint to enable clients to have control over block placement.
Date Wed, 27 Feb 2013 22:47:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13588870#comment-13588870
] 

Aaron T. Myers commented on HDFS-2576:
--------------------------------------

bq. From a design point of view would it be possible to just add an attribute or flag on HDFS
files or directories that specifies a block affinity group? This would seem cheaper than an
alternative that specifies lists of favored DNs for block replicas. This would seem
more robust than something specified only at file creation time, and more manageable in the
long term if datanode membership changes over time.

+1, all of this makes sense to me.

bq. in the use case of HBase, region files are rewritten as part of compaction, and that would
again create the blocks in the favored nodes...

True, though I can imagine other uses besides just HBase region files that could benefit from
a feature like this, e.g. some file formats which will be used in joins could benefit from
HDFS trying to place the replicas of a few separate files on the same set of DNs. In that
case we shouldn't assume that the files will be short-lived/rewritten during a compaction
process.
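
To make the co-location idea concrete, here is a minimal sketch of how files tagged with the same affinity group could be mapped to the same set of DNs. Everything in it is an assumption for illustration: the class AffinityGroupSketch, the method targetsFor, and the use of rendezvous-style hashing are not part of any patch on this JIRA or of any HDFS API. The hashing scheme is simply one way to get a stable mapping when datanode membership changes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: not HDFS code and not the approach in the attached patch.
class AffinityGroupSketch {

    // Score a (group, datanode) pair. Any stable hash would do; String.hashCode
    // is used here only to keep the example self-contained.
    static int score(String group, String datanode) {
        return (group + "/" + datanode).hashCode();
    }

    // Rank the live datanodes by their score for this group and take the top
    // `replication` entries. Files sharing a group therefore share targets.
    static List<String> targetsFor(String group, List<String> liveDatanodes, int replication) {
        List<String> ranked = new ArrayList<>(liveDatanodes);
        ranked.sort(
            Comparator.comparingInt((String dn) -> score(group, dn))
                .reversed()
                .thenComparing(Comparator.<String>naturalOrder())); // deterministic tie-break
        return new ArrayList<>(ranked.subList(0, Math.min(replication, ranked.size())));
    }
}
```

Because each datanode is scored independently, removing a DN that was not among the chosen targets leaves the targets for that group unchanged, which is the kind of robustness to membership changes discussed above. A real implementation would still have to treat the result as a hint and check disk usage and load.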

Of course persistence of these hints could be done as a separate JIRA, but we might consider
as part of this JIRA whether the API could be made appropriate for both use cases - long-lived
and short-lived files. For that matter, we might deliberately make these hints non-persistent
in the branch-1 implementation so as to avoid having to bump the edit log version number,
but persistent in the trunk/branch-2 implementation.

bq. but let me get to the next level of detail on that.

Sorry, I don't understand. What do you mean by this?

Also, I realize that the current patch is intended to be WIP, but it also appears to be targeted
at branch-1. Before we can commit this to branch-1, we'll need to have a trunk/branch-2 patch.
                
> Namenode should have a favored nodes hint to enable clients to have control over block
placement.
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2576
>                 URL: https://issues.apache.org/jira/browse/HDFS-2576
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Pritam Damania
>         Attachments: hdfs-2576-1.txt
>
>
> Sometimes clients like HBase need to dynamically compute the datanodes on which they wish
to place the blocks of a file, in order to get a higher level of locality. For this purpose
there needs to be a way to give the NameNode a hint, in the form of a favoredNodes parameter,
about the locations where the client wants to put each block. The proposed solution is a
favored nodes parameter in the addBlock() method and in the create() file method, so that
clients can give the NameNode hints about the location of each replica of the block. Note that
this would be just a hint: in the end the NameNode would look at disk usage, datanode load, etc.
and decide whether it can respect the hints or not.
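
The "hint, not a guarantee" behaviour described above can be sketched as follows. This is a toy model under stated assumptions, not the NameNode's block placement code: the class FavoredNodesSketch, the DataNode fields, the chooseTargets method, and the two thresholds are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: a toy model of honoring favored nodes as hints only.
class FavoredNodesSketch {

    // Simplified, assumed view of a datanode's state.
    static final class DataNode {
        final String name;
        final double diskUsedFraction; // 0.0 (empty) to 1.0 (full)
        final int activeTransfers;     // stand-in for "datanode load"

        DataNode(String name, double diskUsedFraction, int activeTransfers) {
            this.name = name;
            this.diskUsedFraction = diskUsedFraction;
            this.activeTransfers = activeTransfers;
        }
    }

    // Assumed limits; real placement policies use different criteria.
    static final double MAX_DISK_USED = 0.90;
    static final int MAX_TRANSFERS = 8;

    static boolean isAcceptable(DataNode dn) {
        return dn.diskUsedFraction < MAX_DISK_USED && dn.activeTransfers < MAX_TRANSFERS;
    }

    static List<String> chooseTargets(List<DataNode> cluster, List<String> favoredNodes, int replication) {
        Set<String> chosen = new LinkedHashSet<>();
        // First pass: respect each hint, but only if the node passes the checks.
        for (String favored : favoredNodes) {
            if (chosen.size() == replication) break;
            for (DataNode dn : cluster) {
                if (dn.name.equals(favored) && isAcceptable(dn)) chosen.add(dn.name);
            }
        }
        // Second pass: fill any remaining replicas from the rest of the cluster.
        for (DataNode dn : cluster) {
            if (chosen.size() == replication) break;
            if (isAcceptable(dn)) chosen.add(dn.name);
        }
        return new ArrayList<>(chosen);
    }

    public static void main(String[] args) {
        List<DataNode> cluster = Arrays.asList(
            new DataNode("dn1", 0.50, 1),
            new DataNode("dn2", 0.95, 0), // too full, so a hint for it is ignored
            new DataNode("dn3", 0.20, 2),
            new DataNode("dn4", 0.30, 0));
        System.out.println(chooseTargets(cluster, Arrays.asList("dn3", "dn2", "dn4"), 3));
    }
}
```

In this model a favored node that is too full or too busy is silently skipped and another acceptable node takes its place, so the client always gets a full set of targets when enough healthy nodes exist.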

