hadoop-hdfs-dev mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HDFS-6821) Atomicity of multi file operations
Date Thu, 07 Aug 2014 19:16:13 GMT

     [ https://issues.apache.org/jira/browse/HDFS-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved HDFS-6821.
---------------------------------

    Resolution: Won't Fix

Hi, [~samera].

Ideas similar to this have been proposed several times.  The consensus has always been that
pushing a recursive operation all the way to the NameNode for atomicity would impact throughput
too severely.  The implementation would require holding the write lock while updating every
inode in a subtree.  During that time, all other RPC caller threads would block waiting for
release of the write lock.  A finer-grained locking implementation would help mitigate this,
but it wouldn't eliminate the problem completely.
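
To illustrate the locking concern, here is a minimal Java sketch. It assumes a single namespace-wide read/write lock; the class NamespaceLockSketch, its Node type, and all method names are hypothetical and are not the NameNode's actual classes. The atomic variant holds the write lock for the whole subtree walk, so every other caller waits for the entire walk. The per-inode variant releases the lock between inodes, so other callers can get in, at the cost of atomicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class NamespaceLockSketch {
    /** Hypothetical stand-in for an inode; not an HDFS class. */
    static final class Node {
        final String name;
        String owner;
        final List<Node> children = new ArrayList<>();

        Node(String name, String owner) {
            this.name = name;
            this.owner = owner;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Assumption: one lock guards the entire namespace.
    private final ReentrantReadWriteLock nsLock = new ReentrantReadWriteLock();

    /** Atomic variant: the write lock is held across the whole subtree. */
    int chownSubtreeAtomically(Node root, String newOwner) {
        nsLock.writeLock().lock();
        try {
            return chownRecursive(root, newOwner);
        } finally {
            nsLock.writeLock().unlock();
        }
    }

    private int chownRecursive(Node node, String newOwner) {
        node.owner = newOwner;
        int updated = 1;
        for (Node child : node.children) {
            updated += chownRecursive(child, newOwner);
        }
        return updated;
    }

    /**
     * Non-atomic variant: the write lock is taken and released once per inode.
     * For brevity the child list is read outside the lock; a real
     * implementation would need to protect that as well.
     */
    int chownOneAtATime(Node node, String newOwner) {
        nsLock.writeLock().lock();
        try {
            node.owner = newOwner;
        } finally {
            nsLock.writeLock().unlock();
        }
        int updated = 1;
        for (Node child : node.children) {
            updated += chownOneAtATime(child, newOwner);
        }
        return updated;
    }

    /** Readers block for as long as any writer holds the lock. */
    String readOwner(Node node) {
        nsLock.readLock().lock();
        try {
            return node.owner;
        } finally {
            nsLock.readLock().unlock();
        }
    }
}
```

With a large subtree, a thread calling readOwner during chownSubtreeAtomically waits for the full walk, whereas during chownOneAtATime it waits for at most one inode update.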

It's typical behavior in many file systems that recursive operations are driven from user
space, and the syscalls modify a single inode at a time.  HDFS isn't different in this respect.
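
As a rough illustration of a recursive operation driven from user space, the following Java sketch walks a local directory tree with java.nio.file and changes permissions one path at a time. It targets a local POSIX file system, not HDFS, and the class and method names (RecursiveChmodSketch, chmodRecursive) are made up for this example. If the process dies partway through the loop, some paths have the new permissions and the rest keep the old ones.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class RecursiveChmodSketch {
    /**
     * Applies perms (for example "rwxr-x---") to root and everything below it.
     * Returns the number of paths updated. Requires a POSIX file system.
     */
    static int chmodRecursive(Path root, String perms) throws IOException {
        Set<PosixFilePermission> permissions = PosixFilePermissions.fromString(perms);

        // The traversal happens in user space: collect every path first.
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.collect(Collectors.toList());
        }
        // Files.walk lists a directory before its entries; reverse the list so
        // children are changed before their parent directory.
        Collections.reverse(paths);

        int updated = 0;
        for (Path p : paths) {
            // Each call changes exactly one inode. Nothing ties the calls
            // together, so the overall operation is not atomic.
            Files.setPosixFilePermissions(p, permissions);
            updated++;
        }
        return updated;
    }
}
```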

I'm going to resolve this as won't fix.

> Atomicity of multi file operations
> ----------------------------------
>
>                 Key: HDFS-6821
>                 URL: https://issues.apache.org/jira/browse/HDFS-6821
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Samer Al-Kiswany
>            Priority: Minor
>
> Looking at how HDFS updates the log files for chmod -r or chown -r operations, the HDFS name node seems to update each file separately; consequently, the strace of the operation looks as follows.
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> -----------------------
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> If a crash happens in the middle of this operation (e.g. at the dashed line in the trace), the system will end up with some of the files updated with the new owner or permissions and the rest still with the old ones.
> Isn’t it better to log the whole operation (chown -r) as one entry in the edit file?
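
The difference between the two logging patterns in the quoted report can be sketched in Java as follows. This is only an illustration under stated assumptions: the text record format, the CHOWN and CHOWN_RECURSIVE record names, and the EditLogSketch class are invented here and do not reflect the real HDFS edit log format. logPerFile produces one append/fsync pair per file, as in the trace; logAsOneEntry produces a single pair for the whole operation, as the report proposes.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

class EditLogSketch {
    /** Appends one record and forces it to disk: one append/fsync pair. */
    private static void appendAndSync(FileChannel log, String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((record + "\n").getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining()) {
            log.write(buf);
        }
        log.force(true);
    }

    private static FileChannel openForAppend(Path edits) throws IOException {
        return FileChannel.open(edits,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
    }

    /** One record per file; a crash between iterations leaves a partial update. */
    static int logPerFile(Path edits, List<String> files, String owner) throws IOException {
        try (FileChannel log = openForAppend(edits)) {
            int syncs = 0;
            for (String file : files) {
                appendAndSync(log, "CHOWN " + owner + " " + file);
                syncs++;
            }
            return syncs;
        }
    }

    /** One record for the whole subtree, as the report proposes. */
    static int logAsOneEntry(Path edits, String root, String owner) throws IOException {
        try (FileChannel log = openForAppend(edits)) {
            appendAndSync(log, "CHOWN_RECURSIVE " + owner + " " + root);
            return 1;
        }
    }
}
```

The single-entry form makes the log record all-or-nothing, but applying that one record still means updating every inode in the subtree in one step.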



--
This message was sent by Atlassian JIRA
(v6.2#6252)
