Date: Mon, 6 Jan 2014 23:15:51 +0000 (UTC)
From: "Suresh Srinivas (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-5704) Change OP_UPDATE_BLOCKS with a new OP_ADD_BLOCK

    [ https://issues.apache.org/jira/browse/HDFS-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13863610#comment-13863610 ]

Suresh Srinivas commented on HDFS-5704:
---------------------------------------

Comments on the early patch:
# The changes in DFSTestUtil and TestOfflineEditsViewer seem unnecessary. If you are removing an unused import, can you please make only that change in DFSTestUtil instead of the large set of changes it currently has?
# " current persisted block size is" -> " total block count ". Also move the debug log out of writeLock.
# AddBlockOp class members should be private.
Why implement BlockListUpdatingOp, especially given that getBlocks() returns only the last two blocks?
# Is blockOffset a better name than oldBlkOffset? Adding a javadoc for the parameter would help. Also rename the local variables from offset to blockOffset where the updateBlocks() method is called. I also feel that just having an addBlock() variant of updateBlocks() would make the code easier to understand, even if it duplicates a bit of code.

> Change OP_UPDATE_BLOCKS with a new OP_ADD_BLOCK
> ------------------------------------------------
>
>                 Key: HDFS-5704
>                 URL: https://issues.apache.org/jira/browse/HDFS-5704
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Suresh Srinivas
>            Assignee: Jing Zhao
>         Attachments: HDFS-5704.000.patch
>
>
> Currently, every time a block is allocated, the entire list of blocks is written to the editlog in an OP_UPDATE_BLOCKS operation. This has an n^2 growth issue: the total size of the editlog records for a file with a large number of blocks could be huge.
> The goal of this jira is to discuss adding a different editlog record that records only the allocation of a block, and not the entire block list, on every block allocation.
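
To make the n^2 growth mentioned in the issue description concrete, here is a minimal Python sketch. It is not Hadoop code: the function names and the cost model (one editlog entry per block listed in a record) are assumptions chosen for illustration. It counts the total block entries logged when every allocation rewrites the full block list, compared with logging only the newly allocated block.

```python
def entries_full_list(num_blocks: int) -> int:
    """Total block entries logged if each allocation rewrites the whole list.

    Hypothetical cost model: the k-th allocation logs all k blocks known
    so far, so the total is 1 + 2 + ... + n = n * (n + 1) / 2.
    """
    return sum(range(1, num_blocks + 1))


def entries_per_block(num_blocks: int) -> int:
    """Total block entries logged if each allocation records only the new block.

    Hypothetical cost model: one entry per allocation, so the total is n.
    """
    return num_blocks


if __name__ == "__main__":
    # Compare the two schemes for files of increasing block count.
    for n in (10, 1_000, 100_000):
        print(n, entries_full_list(n), entries_per_block(n))
```

Under these assumptions, a file with 1,000 blocks costs 500,500 logged block entries with the full-list scheme and 1,000 with the per-block scheme. Any scheme that logs a fixed number of blocks per allocation stays linear in the same way.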