Return-Path: Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: (qmail 85929 invoked from network); 15 Apr 2010 22:01:19 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 15 Apr 2010 22:01:19 -0000 Received: (qmail 54863 invoked by uid 500); 15 Apr 2010 22:01:19 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 54779 invoked by uid 500); 15 Apr 2010 22:01:19 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 54771 invoked by uid 99); 15 Apr 2010 22:01:19 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 15 Apr 2010 22:01:19 +0000 X-ASF-Spam-Status: No, hits=-1296.8 required=10.0 tests=ALL_TRUSTED,AWL X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 15 Apr 2010 22:01:18 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id o3FM0v3c027415 for ; Thu, 15 Apr 2010 18:00:58 -0400 (EDT) Message-ID: <20857860.152931271368857867.JavaMail.jira@thor> Date: Thu, 15 Apr 2010 18:00:57 -0400 (EDT) From: "Konstantin Shvachko (JIRA)" To: hdfs-issues@hadoop.apache.org Subject: [jira] Commented: (HDFS-909) Race condition between rollEditLog or rollFSImage ant FSEditsLog.write operations corrupts edits log In-Reply-To: <594234706.373621264010694855.JavaMail.jira@brutus.apache.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/HDFS-909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12857563#action_12857563 ] Konstantin Shvachko commented on HDFS-909: ------------------------------------------ I see it passed Hudson http://hudson.zones.apache.org/hudson/view/Hdfs/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/153/ with one failed test TestDiskError.testLocalDirs. When I run it on my box it does not fail, and it failed on Hudson in some other builds. I don't think this is related to the patch. Todd, this marked for committing to 0.21 and 0.20. Could you please compile the respective patches, and make sure they pass tests on their branches. I will commit it then. > Race condition between rollEditLog or rollFSImage ant FSEditsLog.write operations corrupts edits log > ----------------------------------------------------------------------------------------------------- > > Key: HDFS-909 > URL: https://issues.apache.org/jira/browse/HDFS-909 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node > Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 > Environment: CentOS > Reporter: Cosmin Lehene > Assignee: Todd Lipcon > Priority: Blocker > Fix For: 0.21.0, 0.22.0 > > Attachments: hdfs-909-unittest.txt, hdfs-909.txt, hdfs-909.txt, hdfs-909.txt, hdfs-909.txt, hdfs-909.txt, hdfs-909.txt > > > closing the edits log file can race with write to edits log file operation resulting in OP_INVALID end-of-file marker being initially overwritten by the concurrent (in setReadyToFlush) threads and then removed twice from the buffer, losing a good byte from edits log. > Example: > {code} > FSNameSystem.rollEditLog() -> FSEditLog.divertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.setReadyToFlush() > FSNameSystem.rollEditLog() -> FSEditLog.divertFileStreams() -> FSEditLog.closeStream() -> EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync() > OR > FSNameSystem.rollFSImage() -> FSIMage.rollFSImage() -> FSEditLog.purgeEditLog() -> FSEditLog.revertFileStreams() -> FSEditLog.closeStream() ->EditLogOutputStream.setReadyToFlush() > FSNameSystem.rollFSImage() -> FSIMage.rollFSImage() -> FSEditLog.purgeEditLog() -> FSEditLog.revertFileStreams() -> FSEditLog.closeStream() ->EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync() > VERSUS > FSNameSystem.completeFile -> FSEditLog.logSync() -> EditLogOutputStream.setReadyToFlush() > FSNameSystem.completeFile -> FSEditLog.logSync() -> EditLogOutputStream.flush() -> EditLogFileOutputStream.flushAndSync() > OR > Any FSEditLog.write > {code} > Access on the edits flush operations is synchronized only in the FSEdits.logSync() method level. However at a lower level access to EditsLogOutputStream setReadyToFlush(), flush() or flushAndSync() is NOT synchronized. These can be called from concurrent threads like in the example above > So if a rollEditLog or rollFSIMage is happening at the same time with a write operation it can race for EditLogFileOutputStream.setReadyToFlush that will overwrite the the last byte (normally the FSEditsLog.OP_INVALID which is the "end-of-file marker") and then remove it twice (from each thread) in flushAndSync()! Hence there will be a valid byte missing from the edits log that leads to a SecondaryNameNode silent failure and a full HDFS failure upon cluster restart. > We got to this point after investigating a corrupted edits file that made HDFS unable to start with > {code:title=namenode.log} > java.io.IOException: Incorrect data format. logVersion is -20 but writables.length is 768. > at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:450 > {code} > EDIT: moved the logs to a comment to make this readable -- This message is automatically generated by JIRA. - If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa - For more information on JIRA, see: http://www.atlassian.com/software/jira