Message-ID: <645171881.1247178495352.JavaMail.jira@brutus>
Date: Thu, 9 Jul 2009 15:28:15 -0700 (PDT)
From: "Sanjay Radia (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] Commented: (HDFS-385) Design a pluggable interface to place replicas of blocks in HDFS
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HDFS-385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729431#action_12729431 ]

Sanjay Radia commented on HDFS-385:
-----------------------------------

> The usage model is to define a policy for the entire cluster when you create the cluster. This is especially useful when you have an HDFS instance on an Amazon EC2 instance, for example. This is not intended to be dynamic in any shape or form for a given cluster.

Given the above, should the system record the policy in the fsImage to prevent it from being changed? Similarly, should the Balancer check that it has the same policy as the NameNode? In the past, folks have complained that Hadoop is too easy to misconfigure.

>> I'm a little concerned that the Balancer and Fsck will contradict a policy based on filename because BlockPlacementPolicy.isValidMove() and BlockPlacementPolicy.verifyBlockPlacement()

> I agree that there isn't an elegant way of materializing a filename during a rebalance operation. One workaround is to run fsck to find the mapping from blocks to a file; you can then use this information in your modified Balancer to do what is appropriate for you. You can also use the tool in HADOOP-5019 for this purpose.

These are hard problems, which indicates that this work is experimental and that it will be a while before we figure out the right APIs. However, the experimentation is useful, and as long as it does not impact the base code in a negative way, we should be able to add such features to Hadoop after careful review. We should mark such new experimental APIs as "unstable" so that we are free to change them down the road.
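To make the discussion concrete, the kind of pluggable policy being debated can be sketched roughly as follows. This is a hypothetical, heavily simplified interface for illustration only; the interface and method names here (ReplicaPlacementPolicy, chooseTargets) are assumptions and do not match the actual BlockPlacementPolicy API in the attached patches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical simplified sketch of a pluggable replica placement policy.
// The real Hadoop API differs; this only illustrates the shape of the idea.
interface ReplicaPlacementPolicy {
    // Choose target racks for numReplicas replicas of a block of the given file.
    List<String> chooseTargets(String path, int numReplicas,
                               String localRack, List<String> allRacks);
}

// A default policy mirroring the behavior described in the issue: the first
// replica goes on the writer's local rack, the remaining replicas on a single
// randomly chosen remote rack.
class DefaultPlacementPolicy implements ReplicaPlacementPolicy {
    private final Random rand = new Random();

    public List<String> chooseTargets(String path, int numReplicas,
                                      String localRack, List<String> allRacks) {
        List<String> targets = new ArrayList<>();
        if (numReplicas <= 0) {
            return targets;
        }
        targets.add(localRack);                       // replica 1: local rack
        List<String> remoteRacks = new ArrayList<>(allRacks);
        remoteRacks.remove(localRack);
        if (remoteRacks.isEmpty()) {                  // single-rack cluster: no choice
            while (targets.size() < numReplicas) {
                targets.add(localRack);
            }
            return targets;
        }
        String remoteRack = remoteRacks.get(rand.nextInt(remoteRacks.size()));
        while (targets.size() < numReplicas) {        // replicas 2..n: one remote rack
            targets.add(remoteRack);
        }
        return targets;
    }
}
```

Note that `path` is unused by the default policy; a filename-based policy of the kind discussed above would branch on it, which is exactly why the Balancer (which sees only blocks, not filenames) has trouble honoring such a policy.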
> Design a pluggable interface to place replicas of blocks in HDFS
> ----------------------------------------------------------------
>
>                 Key: HDFS-385
>                 URL: https://issues.apache.org/jira/browse/HDFS-385
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>             Fix For: 0.21.0
>
>         Attachments: BlockPlacementPluggable.txt, BlockPlacementPluggable2.txt, BlockPlacementPluggable3.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable4.txt, BlockPlacementPluggable5.txt
>
>
> The current HDFS code typically places one replica on the local rack, the second replica on a random remote rack, and the third replica on a random node of that remote rack. This algorithm is baked into the NameNode's code. It would be nice to make the block placement algorithm a pluggable interface. This would allow experimentation with different placement algorithms based on workloads, availability guarantees, and failure models.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.