From hdfs-issues-return-32850-apmail-hadoop-hdfs-issues-archive=hadoop.apache.org@hadoop.apache.org Thu Jan 26 00:23:01 2012
Date: Thu, 26 Jan 2012 00:22:39 +0000 (UTC)
From: "Aaron T. Myers (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1569527989.78900.1327537359620.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <709322931.12914.1325814339308.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-2759) Pre-allocate HDFS edit log files after writing version number

    [ https://issues.apache.org/jira/browse/HDFS-2759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13193466#comment-13193466 ]

Aaron T. Myers commented on HDFS-2759:
--------------------------------------

Sorry, I should have mentioned what I did for testing.

No tests are included since writing a test seems like it would require
adding a test hook in the EditLogFileOutputStream#flushAndSync method,
which seems undesirable since that method is highly performance-sensitive.

I verified that the bug exists by adding a "System.exit(0)" after
pre-allocation, but before calling FileChannel#force. This indeed resulted
in a file full of FF, which, if an NN were to try to read it, would be
interpreted as an invalid header version number. By moving the
pre-allocation after the call to FileChannel#force, we guarantee that the
valid data will hit the edit log before writing FFs for the purpose of
pre-allocation.

In the course of preparing and testing this patch, I think I've discovered
another potential issue, though. Note that in the call to
FileChannel#force, we pass "false", which indicates that FileChannel#force
does not need to wait for metadata of the file to be synced to disk, only
the data. I'm not sure of the precise semantics of not syncing metadata.
In particular, is the file length included? If not, I think this has the
potential to cause some edits to not actually be read back from disk after
a crash.
The explanation, per the comment, is that syncing metadata is unnecessary
because of pre-allocation. I don't think that's reasonable, though, since
EditLogFileOutputStream#preallocate doesn't call sync itself, which means
that the file length might never get updated upon returning from
EditLogFileOutputStream#flushAndSync.

Do people agree this is a potential problem? If so, I can either fix it
here, or file a new JIRA.

> Pre-allocate HDFS edit log files after writing version number
> -------------------------------------------------------------
>
>                 Key: HDFS-2759
>                 URL: https://issues.apache.org/jira/browse/HDFS-2759
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, name-node
>    Affects Versions: 0.24.0
>            Reporter: Aaron T. Myers
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-2759.patch
>
>
> In HDFS-2709 it was discovered that there's a potential race wherein
> edit log files are pre-allocated before the version number is written
> into the header of the file. This can cause the NameNode to read an
> invalid HDFS layout version, and hence fail to read the edit log file.
> We should write the header, then pre-allocate the rest of the file
> after this point.
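
To illustrate the ordering proposed in the issue summary (write the header,
force it to disk, then pre-allocate), here is a minimal Java sketch. It is
not the HDFS-2759 patch and does not reproduce EditLogFileOutputStream: the
class name, the layout version value -40, and the 1024-byte fill size are
all hypothetical. It passes true to FileChannel#force as a conservative
choice, whereas the comment above notes that the real code passes false.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class EditLogOrderingSketch {
    static final int LAYOUT_VERSION = -40;  // hypothetical value, for illustration only
    static final int PREALLOC_BYTES = 1024; // hypothetical fill size

    // Positional write that loops until the whole buffer has been written.
    static void writeFully(FileChannel fc, ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            pos += fc.write(buf, pos);
        }
    }

    // Writes the 4-byte header, forces it, and only then appends the 0xFF fill.
    static void writeHeaderThenPreallocate(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            ByteBuffer header = ByteBuffer.allocate(Integer.BYTES);
            header.putInt(LAYOUT_VERSION);
            header.flip();
            writeFully(fc, header, 0);
            // Assumption of this sketch: force(true) asks for content and
            // metadata to be written before any fill bytes are added.
            fc.force(true);

            byte[] fill = new byte[PREALLOC_BYTES];
            Arrays.fill(fill, (byte) 0xFF);
            writeFully(fc, ByteBuffer.wrap(fill), Integer.BYTES);
            fc.force(true);
        }
    }

    // Reads the first four bytes of the file back as a big-endian int.
    static int readLayoutVersion(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES);
            long pos = 0;
            while (buf.hasRemaining()) {
                int n = fc.read(buf, pos);
                if (n < 0) {
                    throw new EOFException("file shorter than header");
                }
                pos += n;
            }
            buf.flip();
            return buf.getInt();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("editlog-sketch", ".bin");
        try {
            writeHeaderThenPreallocate(file);
            System.out.println("version read back: " + readLayoutVersion(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

With this ordering, a crash between the two force calls should leave a file
whose first four bytes already hold the version. By contrast, four FF bytes
read back as an int decode to -1, which is the kind of bogus header the
System.exit(0) experiment above exposed.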