hadoop-hdfs-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3498) Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
Date Wed, 20 Jun 2012 03:22:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13397230#comment-13397230 ]

Junping Du commented on HDFS-3498:
----------------------------------

Hey Nicholas,
   I just updated the patch in HADOOP-8472, which is the implementation of ReplicaPlacementPolicy
for the VM case. Sorry for the confusion caused by posting it without the VM implementation part.
   Those are great questions; my replies are below:
   *  How are you going to use LocalityGroup in the VM case?
   The current replica removal policy tries to keep replicas robust (after removal) at the rack level,
so it splits the replica nodes into two categories according to their racks. I think this algorithm is
general enough for other cases, so I separated the rack-specific part into getLocalityGroupForSplit(),
which can easily be overridden for cases where some locality group other than the rack should
serve as the failure group.
In the VM case, I think the rack should still serve as the robustness group, so the VM implementation
overrides only getRack(), not getLocalityGroupForSplit(). getLocalityGroupForSplit() just makes the
code extensible for future requirements. If you think it is unnecessary, I am OK with deleting it
and overriding only getRack().
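
   To illustrate the idea, here is a small self-contained sketch (not the actual HDFS API;
the class name, the location format "/rack/nodegroup", and the helper are all hypothetical):
a VM-aware subclass keeps the rack as the failure group simply by stripping the nodegroup
component from a nodegroup-aware network location, leaving getLocalityGroupForSplit() untouched.

```java
// Hypothetical sketch, not the real HDFS classes: with a nodegroup-aware
// topology, a node's network location gains an extra level, e.g.
// "/rack1/nodegroup2" instead of "/rack1". Overriding only the rack lookup
// is enough to keep rack-level robustness in the VM case.
public class RackFromNodeGroup {
    // Returns the rack portion of a nodegroup-aware location string.
    static String getRack(String networkLocation) {
        int lastSep = networkLocation.lastIndexOf('/');
        // "/rack1/nodegroup2" -> "/rack1"; a plain "/rack1" is returned as-is.
        return lastSep > 0 ? networkLocation.substring(0, lastSep) : networkLocation;
    }

    public static void main(String[] args) {
        System.out.println(getRack("/rack1/nodegroup2")); // /rack1
        System.out.println(getRack("/rack1"));            // /rack1
    }
}
```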
 
   * Are you going to override pickupReplicaSet(..) in the vm implementation? If yes, how?
   Yes. It will divide the first set (nodes that have another replica living in the same rack) into
two sub-categories: nodes that also have another replica in the same nodegroup, and the
remaining nodes. This adds a little overhead to the whole algorithm, but it stays linear
as before.
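
   A simplified self-contained model of that split (the Node record, names, and the
three-way result are my own illustration, not the real pickupReplicaSet() signature):
two counting passes keep the whole classification linear in the number of replica nodes.

```java
import java.util.*;

// Hypothetical simplified model, not actual HDFS classes: each replica node
// carries its rack and nodegroup. The first set from the default policy
// (nodes sharing a rack with another replica) is sub-divided into nodes that
// also share a nodegroup with another replica, and the remaining nodes.
public class ReplicaSplit {
    record Node(String name, String rack, String nodeGroup) {}

    static List<List<Node>> split(List<Node> replicas) {
        Map<String, Integer> rackCount = new HashMap<>();
        Map<String, Integer> groupCount = new HashMap<>();
        for (Node n : replicas) {               // first linear pass: count
            rackCount.merge(n.rack(), 1, Integer::sum);
            groupCount.merge(n.nodeGroup(), 1, Integer::sum);
        }
        List<Node> sameNodeGroup = new ArrayList<>(); // rack AND nodegroup shared
        List<Node> sameRackOnly  = new ArrayList<>(); // rack shared, nodegroup not
        List<Node> rackUnique    = new ArrayList<>(); // sole replica on its rack
        for (Node n : replicas) {               // second linear pass: classify
            if (rackCount.get(n.rack()) > 1) {
                if (groupCount.get(n.nodeGroup()) > 1) sameNodeGroup.add(n);
                else sameRackOnly.add(n);
            } else {
                rackUnique.add(n);
            }
        }
        return List.of(sameNodeGroup, sameRackOnly, rackUnique);
    }

    public static void main(String[] args) {
        List<Node> replicas = List.of(
            new Node("d1", "/r1", "/r1/ng1"),
            new Node("d2", "/r1", "/r1/ng1"),  // shares rack and nodegroup with d1
            new Node("d3", "/r1", "/r1/ng2"),  // shares rack only
            new Node("d4", "/r2", "/r2/ng1")); // alone on its rack
        List<List<Node>> sets = split(replicas);
        System.out.println(sets.get(0).size() + " "
            + sets.get(1).size() + " " + sets.get(2).size()); // 2 1 1
    }
}
```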

   * Do you know what "priSet" stands for? It does not seem a good name to me.
   Sorry, I just reused the terminology from the previous code. It stands for the first category
of nodes: those that have other replicas living in the same rack, so removing one of these nodes
will not reduce the number of racks holding a replica. Maybe we could go with something like
rackReplicatedNodes? Any suggestions here?
    
Thanks,

Junping
                
> Make ReplicaPlacementPolicyDefault extensible for reuse code in subclass
> ------------------------------------------------------------------------
>
>                 Key: HDFS-3498
>                 URL: https://issues.apache.org/jira/browse/HDFS-3498
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 1.0.0, 2.0.0-alpha
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: HDFS-3498.patch, Hadoop-8471-BlockPlacementDefault-extensible.patch
>
>
> ReplicaPlacementPolicy is already a pluggable component in Hadoop. A user-specified ReplicaPlacementPolicy
> can be configured in hdfs-site.xml under the key "dfs.block.replicator.classname".
> However, to make it possible to reuse code in ReplicaPlacementPolicyDefault, a few of its methods
> were changed from private to protected. ReplicaPlacementPolicy and BlockPlacementPolicyDefault
> are currently annotated with @InterfaceAudience.Private.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
