Return-Path:
X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id C9FC5D3B5
    for ; Tue, 25 Jun 2013 13:56:20 +0000 (UTC)
Received: (qmail 348 invoked by uid 500); 25 Jun 2013 13:56:20 -0000
Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org
Received: (qmail 315 invoked by uid 500); 25 Jun 2013 13:56:20 -0000
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: hdfs-issues@hadoop.apache.org
Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Received: (qmail 171 invoked by uid 99); 25 Jun 2013 13:56:20 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 25 Jun 2013 13:56:20 +0000
Date: Tue, 25 Jun 2013 13:56:20 +0000 (UTC)
From: "Jihoon Son (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (HDFS-4931) Extend the block placement policy interface to utilize the location information of previously stored files
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HDFS-4931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693047#comment-13693047 ]

Jihoon Son commented on HDFS-4931:
----------------------------------

Thanks for your comments. I'll think more about this idea.

> Extend the block placement policy interface to utilize the location information of previously stored files
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4931
>                 URL: https://issues.apache.org/jira/browse/HDFS-4931
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Jihoon Son
>         Attachments: HDFS-4931.patch
>
>
> I am currently implementing a locality-preserving block placement policy that stores the files of a directory on the same datanode. That is, given a root directory, the files under it are grouped by the paths of their parent directories, and the files of each group are then stored on the same datanode.
> When a new file is stored in HDFS, the block placement policy chooses the target datanode by considering the locations of previously stored files.
> The current block placement policy interface has two problems. The first is that it offers no way to retain the locations of previously stored files when HDFS is restarted. The location information of all files has to be restored while the namenode is in safe mode.
> To solve the first problem, I modified the block placement policy interface and FSNamesystem. Before the namenode leaves safe mode, all necessary location information is sent to the block placement policy.
> However, my implementation changes too many access modifiers from private to public, which may violate the design of the interface.
> The second problem occurs when blocks are moved by the balancer or because of node failures. In this case, the block placement policy should recognize the current state and return a new datanode to which the blocks are moved. However, the current interface does not support this.
> The attached patch addresses the first problem but, as mentioned above, it may violate the design of the interface.
> Do you have any good ideas?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
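
The grouping idea in the quoted description can be illustrated with a minimal, standalone Java sketch. It is not the attached patch and does not use Hadoop's block placement policy interface. The class DirectoryGroupPlacement, its methods, and the "least-loaded datanode" fallback for a new group are all assumptions made for illustration only. The restore method stands in for the step in which previously stored locations are handed to the policy before the namenode leaves safe mode.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from Hadoop's API.
class DirectoryGroupPlacement {
    // Parent directory (the "group") -> datanode holding that group's files.
    private final Map<String, String> groupToNode = new LinkedHashMap<>();
    // Datanode -> number of groups assigned to it (used only for the fallback).
    private final Map<String, Integer> groupsPerNode = new LinkedHashMap<>();

    DirectoryGroupPlacement(List<String> datanodes) {
        for (String dn : datanodes) {
            groupsPerNode.put(dn, 0);
        }
    }

    // Files are grouped by the path of their parent directory.
    static String parentOf(String path) {
        int i = path.lastIndexOf('/');
        return i <= 0 ? "/" : path.substring(0, i);
    }

    // Choose the datanode already used by the file's group; for a new group,
    // fall back to the datanode with the fewest groups (an assumed heuristic).
    String chooseTarget(String path) {
        String group = parentOf(path);
        String node = groupToNode.get(group);
        if (node == null) {
            node = leastLoadedNode();
            assign(group, node);
        }
        return node;
    }

    // Rebuild the group mapping from known file locations, e.g. after a restart.
    // The first location seen for a group wins.
    void restore(Map<String, String> fileToNode) {
        for (Map.Entry<String, String> e : fileToNode.entrySet()) {
            String group = parentOf(e.getKey());
            if (!groupToNode.containsKey(group)) {
                assign(group, e.getValue());
            }
        }
    }

    private void assign(String group, String node) {
        groupToNode.put(group, node);
        groupsPerNode.merge(node, 1, Integer::sum);
    }

    private String leastLoadedNode() {
        String best = null;
        for (Map.Entry<String, Integer> e : groupsPerNode.entrySet()) {
            if (best == null || e.getValue() < groupsPerNode.get(best)) {
                best = e.getKey();
            }
        }
        if (best == null) {
            throw new IllegalStateException("no datanodes available");
        }
        return best;
    }
}
```

In this sketch, two files under the same parent directory always receive the same target, and a policy instance that has been given earlier locations through restore keeps placing new files of a group next to the old ones. It does not cover the second problem from the description (blocks moved by the balancer or by node failures).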