Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Sun, 16 Mar 2014 02:43:43 +0000 (UTC)
From: "Guo Ruijing (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12700701.1394547374264.77409.1394937823924@arcas>
In-Reply-To: <JIRA.12700701.1394547374264@arcas>
References: <JIRA.12700701.1394547374264@arcas>
Subject: [jira] [Commented] (HDFS-6087) Unify HDFS write/append/truncate
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HDFS-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936383#comment-13936383 ]

Guo Ruijing commented on HDFS-6087:
-----------------------------------

A write that does not start on a block boundary will trigger a block copy in the DN:

1) It won't lead to a lot of small blocks.
2) As in most file systems, hflush/hsync/truncate may degrade performance.

If we can design a zero-copy block copy, the performance degradation will be small.

1) A block is defined as (block data file, block length).
2) The source block is already committed to the NN and is immutable.
3) A block data file can be created or appended to, but cannot be overwritten or truncated.
4) The block length may not be equal to the block data file length.
5) Create a hardlink to the block data file if the copy block length = file length.
6) Copy the block data file if the copy block length < file length.
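
Rules 5 and 6 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the proposed DataNode implementation: the `Block` record and the `copy_block` helper are hypothetical names, and the block data file is assumed to be an ordinary file on a local file system that supports hardlinks.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical block record: (block data file, block length)."""
    data_file: str
    length: int


def copy_block(src: Block, dst_file: str, copy_length: int) -> Block:
    """Create a new block holding the first copy_length bytes of src."""
    if copy_length > src.length:
        raise ValueError("copy length exceeds the committed block length")
    file_length = os.path.getsize(src.data_file)
    if copy_length == file_length:
        # Rule 5: lengths match, so share the data file through a hardlink.
        os.link(src.data_file, dst_file)
    else:
        # Rule 6: copy only the requested prefix into a new data file.
        remaining = copy_length
        with open(src.data_file, "rb") as fin, open(dst_file, "wb") as fout:
            while remaining > 0:
                chunk = fin.read(min(1 << 20, remaining))
                if not chunk:
                    break
                fout.write(chunk)
                remaining -= len(chunk)
    return Block(dst_file, copy_length)
```

The hardlink branch moves no data, which is where the "zero copy" saving would come from; the copy branch is only taken when the data file holds more bytes than the new block should expose.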

Example:

1) Block 1: (blockfile1, 32M), blockfile1 (length: 32M)
2) Copy Block 1 to Block 2 with 32M

a) Hardlink blockfile1 to blockfile2.
b) Block 2: (blockfile2, 32M), blockfile2 (length: 32M)

3) Write a 16M buffer to Block 2

a) Block 1: (blockfile1, 32M), blockfile1 (length: 48M)

b) Block 2: (blockfile2, 48M), blockfile2 (length: 48M)

4) Copy Block 2 to Block 3 with 16M

a) Copy blockfile2 to blockfile3 with 16M

b) Block 1: (blockfile1, 32M), blockfile1 (length: 48M)

c) Block 2: (blockfile2, 48M), blockfile2 (length: 48M)

d) Block 3: (blockfile3, 16M), blockfile3 (length: 16M)
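
The walk-through above can be replayed on a local file system to check the lengths. The sketch below is an assumption-laden illustration: `run_example` is a hypothetical helper, `unit` bytes stand in for 1M, and ordinary files stand in for block data files. Because blockfile1 and blockfile2 are hardlinks to the same file, the append in step 3 grows both, while the recorded length of Block 1 stays at 32.

```python
import os
import tempfile


def run_example(workdir: str, unit: int = 1024) -> dict:
    """Replay the example; returns {block: (block length, file length)} in units."""
    f1 = os.path.join(workdir, "blockfile1")
    f2 = os.path.join(workdir, "blockfile2")
    f3 = os.path.join(workdir, "blockfile3")
    blocks = {}

    # 1) Block 1 holds 32 units.
    with open(f1, "wb") as f:
        f.write(b"a" * (32 * unit))
    blocks["block1"] = (f1, 32 * unit)

    # 2) Copy Block 1 to Block 2 with 32 units: lengths match, so hardlink.
    os.link(f1, f2)
    blocks["block2"] = (f2, 32 * unit)

    # 3) Write 16 units to Block 2: append only, never overwrite.
    with open(f2, "ab") as f:
        f.write(b"b" * (16 * unit))
    blocks["block2"] = (f2, 48 * unit)

    # 4) Copy Block 2 to Block 3 with 16 units: shorter than the file, so copy.
    with open(f2, "rb") as src, open(f3, "wb") as dst:
        dst.write(src.read(16 * unit))
    blocks["block3"] = (f3, 16 * unit)

    return {
        name: (length // unit, os.path.getsize(path) // unit)
        for name, (path, length) in blocks.items()
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(run_example(d))
        # {'block1': (32, 48), 'block2': (48, 48), 'block3': (16, 16)}
```

A reader of Block 1 would have to stop at the recorded block length (32) rather than at the end of the file, which is why rule 4 allows the two lengths to differ.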

> Unify HDFS write/append/truncate
> --------------------------------
>
>                 Key: HDFS-6087
>                 URL: https://issues.apache.org/jira/browse/HDFS-6087
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>            Reporter: Guo Ruijing
>         Attachments: HDFS Design Proposal.pdf, HDFS Design Proposal_3_14.pdf
>
>
> In the existing implementation, an HDFS file can be appended to and an HDFS block can be reopened for append. This design introduces complexity, including lease recovery. If we design the HDFS block as immutable, append and truncate become very simple. The idea is that an HDFS block is immutable once it is committed to the namenode. If the block is not committed to the namenode, it is the HDFS client’s responsibility to re-add it with a new block ID.


--
This message was sent by Atlassian JIRA
(v6.2#6252)