Return-Path: X-Original-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C9F782257 for ; Tue, 3 May 2011 09:03:47 +0000 (UTC) Received: (qmail 14443 invoked by uid 500); 3 May 2011 09:03:46 -0000 Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org Received: (qmail 14246 invoked by uid 500); 3 May 2011 09:03:46 -0000 Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: hdfs-issues@hadoop.apache.org Delivered-To: mailing list hdfs-issues@hadoop.apache.org Received: (qmail 13993 invoked by uid 99); 3 May 2011 09:03:46 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 May 2011 09:03:46 +0000 X-ASF-Spam-Status: No, hits=-2000.0 required=5.0 tests=ALL_TRUSTED,T_RP_MATCHES_RCVD X-Spam-Check-By: apache.org Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 May 2011 09:03:43 +0000 Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id 8FD6CBFE99 for ; Tue, 3 May 2011 09:03:03 +0000 (UTC) Date: Tue, 3 May 2011 09:03:03 +0000 (UTC) From: "Ivan Kelly (JIRA)" To: hdfs-issues@hadoop.apache.org Message-ID: <1243638056.18008.1304413383585.JavaMail.tomcat@hel.zones.apache.org> In-Reply-To: <1148888743.26876.1301636285841.JavaMail.tomcat@hel.zones.apache.org> Subject: [jira] [Commented] (HDFS-1799) Refactor log rolling and filename management out of FSEditLog MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/HDFS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028114#comment-13028114 ] Ivan Kelly commented on HDFS-1799: ---------------------------------- {quote} I agree that your design avoids introducing a new class by collapsing the file management (ie "log lifecycle") code into EditLogOutputStream. However, I don't feel like "fewer code changes" or "fewer classes" are particularly appropriate design goals. {quote} More code changes and more classes indicate that more complexity is being added which I think is what is happening here. Instead of FSEditLog being a manager of log objects, it becomes a manager of managers of log objects. More complexity is fine as long as it adds clarity to the design, or the ability to do things previously not possible. I don't think JournalManager adds either. {quote} The nice thing about this design is that EditLogOutputStream needs no conception of StorageDirectories, for example. In fact, it is barely coupled to HDFS at all with the exception of using a few static constants. In contrast, your design collapses layout management into the output stream, thus making EditLogFileOutputStream depend on StorageDirectory, NNStorage, etc. {quote} I don't see any harm in EditLogFileOutputStream knowing about StorageDirectory and NNStorage as long as this doesn't pollute EditLogOutputStream. It is the implementation of EditLogOutputStream for files after all, and files in HDFS should be dealt with through StorageDirectory. My design only collapses the API into the output stream, not the actual implement. #nextSegment() could make a call to a JournalManager/Journal/JournalFactory which would actually do the file management. Collapsing the API isn't that different to what happens with JournalManager anyhow, as all accesses to the stream go through the manager. {quote} Lastly, I want point out that HDFS-1580 will require a class like this anyway (called Journal in your design doc}}. Though this current patch doesn't address it, it will be a clear extension of JournalManager to add the input-side calls, the purging calls, etc. {quote} The difference between JournalManager and Journal/JournalFactory in HDFS-1580 is one of persistence. For output, JournalManager objects must be created and held by FSEditLog for the duration of it's existence. Journal/JournalFactory objects only exists for the creation of the stream after which the object can go out of scope/be GC'd (unless the creates streams keep a reference). {quote} Clearly we have a difference of opinion on this design, but could you please indicate how strong your objections are? i.e. are you -1ing this design or just proposing another option? Given that I already have a bunch of work lined up (and blocked) behind this, I'd really like to close this out in the next day or two. {quote} I'm -1. I really don't like the call to getCurrentStream() having to be called every time you want an output stream. However if the concensus goes the other way I'll acquiesce. > Refactor log rolling and filename management out of FSEditLog > ------------------------------------------------------------- > > Key: HDFS-1799 > URL: https://issues.apache.org/jira/browse/HDFS-1799 > Project: Hadoop HDFS > Issue Type: Sub-task > Affects Versions: Edit log branch (HDFS-1073) > Reporter: Todd Lipcon > Assignee: Todd Lipcon > Fix For: Edit log branch (HDFS-1073) > > Attachments: 0001-Added-state-management-to-FSEditLog.patch, 0002-Standardised-error-pattern.patch, 0003-Add-JournalFactory-and-move-divert-revert-out-of-FSE.patch, HDFS-1799-all.diff, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt, hdfs-1799.txt > > > This is somewhat similar to HDFS-1580, but less ambitious. While that JIRA focuses on pluggability, this task is simply the minimum needed for HDFS-1073: > - Refactor the filename-specific code for rolling, diverting, and reverting log streams out of FSEditLog into a new class > - Clean up the related code in FSEditLog a bit > Notably, this JIRA is going to temporarily break the BackupNode. I plan to circle back on the BackupNode later on this branch. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira