Date: Sun, 16 Mar 2014 02:43:43 +0000 (UTC)
From: "Guo Ruijing (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate

    [ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936383#comment-13936383 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

Writing that is not on a block boundary will trigger block copying in the DN:

1) It won't lead to a lot of small blocks.
2) As in most file systems, hflush/hsync/truncate may degrade performance.

If we can design zero copy for block copy, there is little performance degradation.

1) A block is defined as (block data file, block length).
2) The source block is already committed to the NN and is immutable.
3) A block file can be created or appended to, but cannot be overwritten or truncated.
4) The block length may differ from the block data file length.
5) Create a hardlink to the block data file if copy block length = file length.
6) Copy the block data file if copy block length < file length.

Example:

1) Block 1: (blockfile1, 32M)  blockfile1 (length: 32M)
2) Copy Block 1 to Block 2 with 32M
   a) Hardlink blockfile1 to blockfile2.
   b) Block 2: (blockfile2, 32M)  blockfile2 (length: 32M)
3) Write a 16M buffer to Block 2 (blockfile1 and blockfile2 are hardlinks to the same data, so both file lengths grow)
   a) Block 1: (blockfile1, 32M)  blockfile1 (length: 48M)
   b) Block 2: (blockfile2, 48M)  blockfile2 (length: 48M)
4) Copy Block 2 to Block 3 with 16M
   a) Copy blockfile2 to blockfile3 with 16M.
   b) Block 1: (blockfile1, 32M)  blockfile1 (length: 48M)
   c) Block 2: (blockfile2, 48M)  blockfile2 (length: 48M)
   d) Block 3: (blockfile3, 16M)  blockfile3 (length: 16M)

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended to and an HDFS block can be reopened for append. This design introduces complexity, including lease recovery. If we design HDFS blocks as immutable, append and truncate become very simple. The idea is that an HDFS block is immutable once it is committed to the namenode. If the block is not committed to the namenode, it is the HDFS client’s responsibility to re-add it with a new block ID.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
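
The hardlink-or-copy rule from the comment (rules 5 and 6, and the Block 1/2/3 example) can be sketched in a few lines of Python. This is only an illustration of the proposed rule, not DataNode code: the names `copy_block` and `run_example` are hypothetical, 1 KiB stands in for 1M to keep the demo small, and it assumes a local file system that supports hard links.

```python
import os
import tempfile


def copy_block(src_file, dst_file, copy_len):
    """Create dst_file for a block holding the first copy_len bytes of src_file.

    Hypothetical helper: hardlink when the whole file is wanted (rule 5),
    otherwise do a real copy of the prefix (rule 6).
    """
    file_len = os.path.getsize(src_file)
    if copy_len > file_len:
        raise ValueError("copy length exceeds block data file length")
    if copy_len == file_len:
        os.link(src_file, dst_file)  # zero-copy: both names share the same data
        return "hardlink"
    with open(src_file, "rb") as src, open(dst_file, "wb") as dst:
        remaining = copy_len
        while remaining > 0:
            chunk = src.read(min(remaining, 64 * 1024))
            if not chunk:
                break
            dst.write(chunk)
            remaining -= len(chunk)
    return "copy"


def run_example(unit=1024):
    """Replay the Block 1/2/3 example; `unit` bytes stand in for 1M."""
    with tempfile.TemporaryDirectory() as d:
        f1, f2, f3 = (os.path.join(d, "blockfile%d" % i) for i in (1, 2, 3))

        # 1) Block 1: (blockfile1, 32M)
        with open(f1, "wb") as f:
            f.write(b"a" * (32 * unit))
        blocks = {1: (f1, 32 * unit)}

        # 2) Copy Block 1 to Block 2 with 32M: lengths match, so hardlink.
        how2 = copy_block(f1, f2, 32 * unit)
        blocks[2] = (f2, 32 * unit)

        # 3) Append 16M to Block 2; only Block 2's recorded length changes.
        with open(f2, "ab") as f:
            f.write(b"b" * (16 * unit))
        blocks[2] = (f2, 48 * unit)

        # 4) Copy Block 2 to Block 3 with 16M: shorter than the file, so copy.
        how3 = copy_block(f2, f3, 16 * unit)
        blocks[3] = (f3, 16 * unit)

        return {
            "how": (how2, how3),
            "block_len": {k: v[1] // unit for k, v in blocks.items()},
            "file_len": {k: os.path.getsize(v[0]) // unit
                         for k, v in blocks.items()},
        }
```

Calling `run_example()` reports the recorded block length and the on-disk file length separately, which is the distinction rule 4 relies on: readers of Block 1 must stop at its recorded block length even though the shared data file has grown.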