Date: Mon, 22 Oct 2012 17:42:13 +0000 (UTC)
From: "Chuan Liu (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <1520193950.10671.1350927733731.JavaMail.jiratomcat@arcas>
In-Reply-To: <626811464.10293.1341514234905.JavaMail.jiratomcat@issues-vm>
Subject: [jira] [Updated] (HADOOP-8564) Port and extend Hadoop native libraries for Windows to address datanode concurrent reading and writing issue

     [ https://issues.apache.org/jira/browse/HADOOP-8564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chuan Liu updated HADOOP-8564:
------------------------------

    Attachment: HADOOP-8564-branch-1-win-newfiles.patch

I forgot to include two new Windows build files. Attaching a new patch with the two missing files.

> Port and extend Hadoop native libraries for Windows to address datanode concurrent reading and writing issue
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-8564
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8564
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 1-win
>            Reporter: Chuan Liu
>            Assignee: Chuan Liu
>             Fix For: 1-win
>
>         Attachments: HADOOP-8564-branch-1-win-newfiles.patch, HADOOP-8564-branch-1-win.patch, HADOOP-8564-branch-1-win.patch
>
>
> HDFS files are made up of blocks. First, let's look at writing. When data is written to a datanode, an active (temporary) file is created to receive packets. After the last packet for the block is received, the block is finalized. One step of finalization is to rename the block file into a new directory. The relevant code can be found via the call sequence FSDataSet.finalizeBlockInternal -> FSDir.addBlock:
> {code}
> if ( ! metaData.renameTo( newmeta ) ||
>      ! src.renameTo( dest ) ) {
>   throw new IOException( "could not move files for " + b +
>                          " from tmp to " +
>                          dest.getAbsolutePath() );
> }
> {code}
> Now let's switch to reading. On HDFS, clients are expected to be able to read these unfinished blocks as well. So when a client's read calls reach the datanode, the datanode opens an input stream on the unfinished block file.
> The problem comes in when the block file is still open for reading while the datanode receives the last packet from the client and tries to rename the finished block file.
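> A minimal, hypothetical illustration of the conflict (the paths and file names below are made up; this is not the actual datanode code):
> {code}
> import java.io.File;
> import java.io.FileInputStream;
> import java.io.IOException;
>
> public class RenameWhileReading {
>   public static void main(String[] args) throws IOException {
>     File src = new File("tmp/blk_1001");        // unfinished block file (made-up path)
>     File dest = new File("current/blk_1001");   // finalized location (made-up path)
>     FileInputStream reader = new FileInputStream(src);  // a client read is in progress
>     boolean renamed = src.renameTo(dest);
>     // true on Linux; false on Windows, because FileInputStream does not open
>     // the file with the FILE_SHARE_DELETE sharing mode.
>     System.out.println("renamed = " + renamed);
>     reader.close();
>   }
> }
> {code}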
> This rename will succeed on Linux, but not on Windows. The behavior can be changed on Windows by opening the file with the FILE_SHARE_DELETE flag, i.e. sharing the delete (including rename) permission with other processes while the file is open. There is also a Java bug ([id 6357433|http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6357433]) reported a while back on this. However, since Java on Windows has behaved this way since JDK 1.0, the Java developers do not want to break backward compatibility. Instead, a new file system API is proposed in JDK 7.
> As outlined in the [Java forum|http://www.java.net/node/645421] by the Java developer (kbr), there are three ways to fix the problem:
> # Use a different mechanism in the application for dealing with files.
> # Create a new implementation of the InputStream abstract class using Windows native code.
> # Patch the JDK with a private patch that alters FileInputStream behavior.
> The third option cannot fix the problem for users running an unpatched Oracle JDK.
> We discussed some options for the first approach. For example, one option is two-phase renaming, i.e. first create a hard link, then remove the old link once the read is finished. This option was thought to require rather pervasive changes. Another option discussed was to change HDFS behavior on Windows by not allowing clients to read unfinished blocks. However, this behavior change is thought to be problematic and may affect other applications built on top of HDFS.
> For all the reasons discussed above, we will use the second approach to address the problem; a rough sketch of what such a native-backed stream could look like is given at the end of this description.
> If there are better options to fix the problem, we would also like to hear about them.
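> A minimal, hypothetical sketch of the second approach (the class, method, and library names below are illustrative only, not taken from the attached patch): an InputStream whose file handle is opened by JNI code that would pass FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE to CreateFile, so the datanode can still rename a block file while a client is reading it.
> {code}
> import java.io.IOException;
> import java.io.InputStream;
>
> public class SharedDeleteFileInputStream extends InputStream {
>   static {
>     System.loadLibrary("nativeio");   // hypothetical Windows JNI library name
>   }
>
>   private final long handle;          // native file handle (hypothetical field)
>
>   public SharedDeleteFileInputStream(String path) throws IOException {
>     handle = open0(path);             // native CreateFile with FILE_SHARE_DELETE
>   }
>
>   @Override
>   public int read() throws IOException {
>     return read0(handle);             // native ReadFile; returns -1 at end of file
>   }
>
>   @Override
>   public void close() throws IOException {
>     close0(handle);                   // native CloseHandle
>   }
>
>   // Hypothetical native methods implemented in the Windows JNI library.
>   private static native long open0(String path) throws IOException;
>   private static native int read0(long handle) throws IOException;
>   private static native void close0(long handle) throws IOException;
> }
> {code}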