From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Date: Thu, 14 Jul 2011 20:55:01 +0000 (UTC)
Message-ID: <332852820.15061.1310676901745.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-1073) Simpler model for Namenode's fs Image and edit Logs

[
https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065526#comment-13065526 ]

Todd Lipcon commented on HDFS-1073:
-----------------------------------

bq. EditLogFileInputStream doesn't have any change except for an unused import.

Good catch, fixed the import.

{quote}
EditLogOutputStream.java : abstract void write(byte[] data, int i, int length)
All transactions should have a txid, therefore this write method is confusing.
{quote}

Agreed. This method is used by the BackupNode, which currently receives only byte arrays that have to be journaled, rather than logical transaction records. I added a javadoc that explains its purpose, and renamed the offset parameter.

{quote}
What is the reason to persist start and end of log segments? Do we really need OP_START_LOG_SEGMENT and OP_END_LOG_SEGMENT?
{quote}

I remember discussing this at one point on JIRA, but I can't seem to find the comment. I think it was either Sanjay or Rob Chanselor who had suggested that we later extend these opcodes to carry a bit of extra information such as the timestamp, the hostname, the namespace ID, etc. They would serve as extra sanity checks and possibly be useful for debug/audit/etc.

Of course right now they don't do a whole lot, but I think they are still useful during "forensics" -- e.g. when I'm looking at a log file in a hex editor, it would be nice to see one of these transactions at the end to know that the file didn't somehow get truncated. Race condition bugs around rolling, like we've seen before, would also be a lot more obvious.

bq. LogHeader has a read method but not a write. Will it make sense to encapsulate both read and write of the header in the same class?

Agreed - Ivan has opened HDFS-2149 - I'd propose we do that under that JIRA?

bq. writeTransactionIdFileToStorage: The transaction id will be persisted along with the image and log files. For a running namenode, it will be in the in-memory state.
It is not clear to me why do we need to persist a txid marker separately.

This was added in HDFS-1801, with the rationale in [this comment|https://issues.apache.org/jira/browse/HDFS-1801?focusedCommentId=13026872&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13026872]. Basically it adds an extra safeguard: if the last edit logs are somehow lost (or unavailable at startup), the storage directories will have enough info to detect that and prevent the NN from starting.

bq. There are unused imports in a few files.

Yep, thanks. Attached patch fixes most of them.

bq. I have a few concerns related to FSImageTransactionalStorageInspector, FSEditLogLoader, but those parts have been addressed in HDFS-2018. I recommend to commit HDFS-2018 in the branch as it significantly improves some parts of the code.

Let's continue to discuss there. I addressed the unused imports and javadoc fixes on the branch in r1146889.

> Simpler model for Namenode's fs Image and edit Logs
> ----------------------------------------------------
>
>                 Key: HDFS-1073
>                 URL: https://issues.apache.org/jira/browse/HDFS-1073
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sanjay Radia
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1073-editloading-algos.txt, hdfs-1073.txt, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.pdf, hdfs1073.tex
>
>
> The naming and handling of NN's fsImage and edit logs can be significantly improved, resulting in simpler and more robust code.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
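
For readers following the txid-marker discussion in the comment above, here is a minimal sketch of that kind of safeguard. It is not the HDFS-1801 code: the class name SeenTxIdMarker, its methods, and the plain-text file format are all assumptions made for illustration. The idea is only that a storage directory records the highest transaction id it has seen, and startup refuses to proceed if the loadable image and edits end before that id.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of a persisted txid marker; not the HDFS implementation. */
final class SeenTxIdMarker {
    private SeenTxIdMarker() {}

    /** Record the highest transaction id known to have been written (plain text, assumed format). */
    static void write(Path markerFile, long txId) {
        try {
            // A real implementation would likely write atomically; this sketch does not.
            Files.write(markerFile, Long.toString(txId).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Read back the recorded transaction id. */
    static long read(Path markerFile) {
        try {
            String text = new String(Files.readAllBytes(markerFile), StandardCharsets.UTF_8);
            return Long.parseLong(text.trim());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /**
     * Startup check: if the image plus edit logs only cover transactions up to
     * lastLoadableTxId, and the marker says a later txid was seen, some edits
     * are missing and startup should be refused.
     */
    static void checkOnStartup(Path markerFile, long lastLoadableTxId) {
        long seen = read(markerFile);
        if (lastLoadableTxId < seen) {
            throw new IllegalStateException(
                "Edit logs appear to be missing: marker records txid " + seen
                + " but only transactions up to " + lastLoadableTxId + " can be loaded");
        }
    }
}
```

With a marker of 100, a call such as checkOnStartup(marker, 100) passes, while checkOnStartup(marker, 90) throws, which is the "prevent the NN from starting" behaviour in miniature.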