Date: Thu, 7 Aug 2014 19:16:13 +0000 (UTC)
From: "Chris Nauroth (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Resolved] (HDFS-6821) Atomicity of multi file operations


     [ https://issues.apache.org/jira/browse/HDFS-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Nauroth resolved HDFS-6821.
---------------------------------
    Resolution: Won't Fix

Hi, [~samera].

Ideas similar to this have been proposed several times.  The consensus
has always been that pushing a recursive operation all the way to the
NameNode for atomicity would impact throughput too severely.  The
implementation would require holding the write lock while updating every
inode in a subtree.  During that time, all other RPC caller threads would
block waiting for release of the write lock.
A finer-grained locking implementation would help mitigate this, but it
wouldn't eliminate the problem completely.

It's typical behavior in many file systems that recursive operations are
driven from user space, and the syscalls modify a single inode at a time.
HDFS isn't different in this respect.

I'm going to resolve this as won't fix.

> Atomicity of multi file operations
> ----------------------------------
>
>                 Key: HDFS-6821
>                 URL: https://issues.apache.org/jira/browse/HDFS-6821
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Samer Al-Kiswany
>            Priority: Minor
>
> Looking at how HDFS updates the log files in case of chmod -r or
> chown -r operations. In these operations, HDFS name node seems to
> update each file separately; consequently the strace of the operation
> looks as follows.
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> -----------------------
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> If a crash happens in the middle of this operation (e.g. at the dashed
> line in the trace), the system will end up with part of the files
> updated with the new owner or permissions and part still with the old
> owner.
> Isn't it better to log the whole operation (chown -r) as one entry in
> the edit file?



--
This message was sent by Atlassian JIRA
(v6.2#6252)
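
The partial-update scenario from the quoted report can be reproduced in a
toy model. The sketch below is not HDFS code: recursive_chown, replay, and
the crash_after parameter are hypothetical names, and a Python list stands
in for the edit log. It assumes one durable log entry per path, as the
trace suggests, and shows that replaying the log after a crash in the
middle of the operation leaves some paths with the new owner and some
with the old one.

```python
# Toy model only: not HDFS code. All names here are hypothetical.

def recursive_chown(paths, new_owner, edit_log, crash_after=None):
    """Append one edit-log entry per path, as in the reported trace.

    crash_after simulates a crash once that many entries are durable.
    Returns True if every path was logged, False if the "crash" hit first.
    """
    for i, path in enumerate(paths):
        if crash_after is not None and i == crash_after:
            return False  # simulated crash: remaining paths are never logged
        # Stands in for append(edits) followed by fsync(edits).
        edit_log.append(("SET_OWNER", path, new_owner))
    return True


def replay(edit_log, owners):
    """Rebuild ownership state by replaying the durable log entries."""
    state = dict(owners)
    for op, path, owner in edit_log:
        if op == "SET_OWNER":
            state[path] = owner
    return state


owners = {"/d": "alice", "/d/a": "alice", "/d/b": "alice", "/d/c": "alice"}
log = []

# Crash after two entries have been written (the dashed line in the trace).
completed = recursive_chown(sorted(owners), "bob", log, crash_after=2)
recovered = replay(log, owners)

for path in sorted(recovered):
    print(path, recovered[path])
```

In this model, writing the whole subtree as a single log entry would avoid
the mixed state, but as the resolution comment explains, doing so in HDFS
would mean holding the NameNode write lock for the entire traversal.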