Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Mon, 6 Jan 2014 23:15:51 +0000 (UTC)
From: "Suresh Srinivas (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12686660.1388302530246.50715.1389050151707@arcas>
In-Reply-To: <JIRA.12686660.1388302530246@arcas>
References: <JIRA.12686660.1388302530246@arcas>
Subject: [jira] [Commented] (HDFS-5704) Change OP_UPDATE_BLOCKS  with a new
 OP_ADD_BLOCK
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HDFS-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863610#comment-13863610 ] 

Suresh Srinivas commented on HDFS-5704:
---------------------------------------

Comments on the early patch:
# The changes in DFSTestUtil and TestOfflineEditsViewer seem unnecessary. If you are removing an unused import, can you please make only that change in DFSTestUtil instead of the large set of changes it currently has?
# Change the log message " current persisted block size is" to " total block count ". Also move the debug log out of writeLock.
# AddBlockOp class members should be private. Why implement BlockListUpdatingOp, especially given getBlocks() returns only the last two blocks?
# Is blockOffset a better name than oldBlkOffset? Adding a javadoc for the parameter would help. Also consider renaming the local variables from offset to blockOffset where the updateBlocks() method is called. I also feel that having a separate addBlock() variant of updateBlocks() would make the code easier to understand, even if it duplicates a bit of code.


> Change OP_UPDATE_BLOCKS  with a new OP_ADD_BLOCK
> ------------------------------------------------
>
>                 Key: HDFS-5704
>                 URL: https://issues.apache.org/jira/browse/HDFS-5704
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Suresh Srinivas
>            Assignee: Jing Zhao
>         Attachments: HDFS-5704.000.patch
>
>
> Currently, every time a block is allocated, the entire list of blocks is written to the editlog in an OP_UPDATE_BLOCKS operation. This causes n^2 growth: the total size of editlog records for a file with a large number of blocks could be huge.
> The goal of this jira is to discuss adding a different editlog record that, on every block allocation, records only the allocated block and not the entire block list.
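
To illustrate the n^2 growth described above, here is a rough Java sketch. It is not HDFS code: the class and method names are hypothetical, and it assumes the new record logs exactly one block per allocation. It only counts how many block entries end up in the editlog for a file with n blocks under each scheme.

```java
// Hypothetical model, not part of HDFS: counts block entries written to the
// editlog while a single file grows to n blocks.
class EditLogGrowth {

    // Scheme resembling OP_UPDATE_BLOCKS: the i-th allocation re-logs all
    // i blocks of the file, so the total is 1 + 2 + ... + n = n(n+1)/2.
    static long fullListEntries(long n) {
        long total = 0;
        for (long i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    // Scheme resembling the proposed OP_ADD_BLOCK, under the assumption that
    // each allocation logs only the newly allocated block: the total is n.
    static long perBlockEntries(long n) {
        return n;
    }

    public static void main(String[] args) {
        for (long n : new long[] {10, 1_000, 100_000}) {
            System.out.println("blocks=" + n
                + " fullList=" + fullListEntries(n)
                + " perBlock=" + perBlockEntries(n));
        }
    }
}
```

In this model, a file with 1,000 blocks produces 500,500 logged block entries when the full list is rewritten each time, versus 1,000 when only the new block is recorded.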

