Mailing-List: contact hdfs-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-dev@hadoop.apache.org
Date: Thu, 7 Aug 2014 19:16:13 +0000 (UTC)
From: "Chris Nauroth (JIRA)" <jira@apache.org>
To: hdfs-dev@hadoop.apache.org
Message-ID: <JIRA.12732082.1407260565572.36790.1407438973490@arcas>
In-Reply-To: <JIRA.12732082.1407260565572@arcas>
References: <JIRA.12732082.1407260565572@arcas>
Subject: [jira] [Resolved] (HDFS-6821) Atomicity of multi file operations
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


     [ https://issues.apache.org/jira/browse/HDFS-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved HDFS-6821.
---------------------------------

    Resolution: Won't Fix

Hi, [~samera].

Ideas similar to this have been proposed several times.  The consensus has
always been that pushing a recursive operation all the way to the NameNode
for atomicity would impact throughput too severely.  The implementation
would require holding the write lock while updating every inode in a
subtree.  During that time, all other RPC caller threads would block
waiting for release of the write lock.  A finer-grained locking
implementation would help mitigate this, but it wouldn't eliminate the
problem completely.
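
The locking trade-off can be illustrated with a toy model.  This is only a
sketch, not NameNode code: ToyNamespace, chown_atomic, chown_per_inode and
longest_hold are hypothetical names, and a single threading.Lock stands in
for the namespace write lock.  It records how many inode updates happen
under one lock acquisition in each approach.

```python
import threading


class ToyNamespace:
    """Hypothetical stand-in for a namespace guarded by one write lock."""

    def __init__(self, inode_count):
        self.owner = {i: "old" for i in range(inode_count)}
        self.lock = threading.Lock()
        # Largest number of inode updates done under a single acquisition.
        self.longest_hold = 0

    def _record_hold(self, updates):
        self.longest_hold = max(self.longest_hold, updates)

    def chown_atomic(self, new_owner):
        # One acquisition covers the whole subtree, so every other caller
        # waits until all inodes have been updated.
        with self.lock:
            for inode in self.owner:
                self.owner[inode] = new_owner
            self._record_hold(len(self.owner))

    def chown_per_inode(self, new_owner):
        # One acquisition per inode, so other callers can get the lock
        # between updates (at the cost of atomicity).
        for inode in list(self.owner):
            with self.lock:
                self.owner[inode] = new_owner
                self._record_hold(1)


atomic = ToyNamespace(10_000)
atomic.chown_atomic("alice")

per_inode = ToyNamespace(10_000)
per_inode.chown_per_inode("alice")

print(atomic.longest_hold, per_inode.longest_hold)  # prints "10000 1"
```

In the atomic variant the lock hold time grows with the size of the
subtree, which is the throughput concern described above.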

It's typical behavior in many file systems that recursive operations are
driven from user space, and the syscalls modify a single inode at a time.
HDFS isn't different in this respect.
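
As a rough local-file-system analogy (not HDFS client code), a recursive
chmod driven from user space might look like the sketch below.  The helper
name chmod_recursive is hypothetical; os.walk and os.chmod are standard
Python library calls, and each os.chmod call changes exactly one inode.
Symlink handling is left out for brevity.

```python
import os


def chmod_recursive(root, mode):
    """Apply mode to root and everything below it, one chmod per inode.

    Returns the number of chmod calls made.  A crash part-way through
    leaves the earlier entries changed and the later ones untouched.
    """
    calls = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk yields every directory (including root) as dirpath,
        # so changing dirpath plus its files covers the whole tree.
        targets = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for target in targets:
            os.chmod(target, mode)  # modifies a single inode
            calls += 1
    return calls
```

The recursion lives entirely in the caller; the file system only ever sees
independent single-inode requests.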

I'm going to resolve this as won't fix.

> Atomicity of multi file operations
> ----------------------------------
>
>                 Key: HDFS-6821
>                 URL: https://issues.apache.org/jira/browse/HDFS-6821
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Samer Al-Kiswany
>            Priority: Minor
>
> Looking at how HDFS updates the edit log during chmod -r or chown -r
> operations: the HDFS NameNode seems to update each file separately, so
> the strace of the operation looks as follows.
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> -----------------------
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> If a crash happens in the middle of this operation (e.g. at the dashed
> line in the trace), the system will end up with some of the files updated
> with the new owner or permissions and the rest still with the old ones.
> Isn’t it better to log the whole operation (chown -r) as one entry in
> the edit file?
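
The partial-update scenario in the description above can be reproduced with
a small simulation.  This is a sketch under assumptions: the edit log is
modeled as a plain list of (path, owner) records, and chown_recursive,
replay and crash_after are hypothetical names, not part of HDFS.

```python
def chown_recursive(paths, new_owner, crash_after=None):
    """Log one record per file; stop early to simulate a crash."""
    edit_log = []
    for count, path in enumerate(paths):
        if crash_after is not None and count == crash_after:
            break  # crash: later records never reach the log
        # Corresponds to one append(edits) + fsync(edits) pair in the trace.
        edit_log.append((path, new_owner))
    return edit_log


def replay(edit_log, owners):
    """Apply every durable edit-log record to a copy of the namespace."""
    state = dict(owners)
    for path, new_owner in edit_log:
        state[path] = new_owner
    return state


owners = {"/d/a": "old", "/d/b": "old", "/d/c": "old", "/d/d": "old"}
log = chown_recursive(list(owners), "new", crash_after=2)
print(replay(log, owners))
# {'/d/a': 'new', '/d/b': 'new', '/d/c': 'old', '/d/d': 'old'}
```

After replay, the first two files carry the new owner and the last two the
old one, matching the crash at the dashed line in the trace.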


--
This message was sent by Atlassian JIRA
(v6.2#6252)