hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place
Date Wed, 31 Aug 2011 19:54:11 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094843#comment-13094843
] 

jiraposter@reviews.apache.org commented on HDFS-2018:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1688/#review1703
-----------------------------------------------------------


In doing this review, I remembered some reasoning about why I wanted to introduce the EditLogReference
interface:

1) The lifecycle of the open edit streams is now unbalanced -- the streams are all opened
in one place and then closed far away in loadEdits() later. If there's an error in one of
the middle streams, it will leak the open files of the other streams that come after it. Introducing
EditLogReference (a reference to a stream that's not opened yet) means that loadEdits(EditLogReference)
has the whole lifecycle of an open stream.
2) The old implementation had true unit tests for the various gap scenarios, which I found
easy to read. The new tests are much harder to read, with the "AbortSpec" stuff, plus it relies
on actually creating real files on disk, rather than mock objects. Having the FileJournalManager
only deal with references in the "planning" state, and moving the validation to the reference,
means we can test all of this code with only mocks.

Specific comments on the implementation below.


hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
<https://reviews.apache.org/r/1688/#comment3858>

    If you pass "target - 1" above, then why do we have to explicitly remove it again? It
already should be excluded here.



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
<https://reviews.apache.org/r/1688/#comment3859>

    shouldn't there be exactly 1 stream here?
    
    What happens if you don't find any? there should be a check for stream == null



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
<https://reviews.apache.org/r/1688/#comment3860>

    add a comment that this JM is never used for input, hence it can never provide transactions?



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
<https://reviews.apache.org/r/1688/#comment3862>

    adjust javadoc to explain what "best" means. In this case, it's the one who has the most
transactions readable after that txid



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
<https://reviews.apache.org/r/1688/#comment3861>

    should log a warning



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
<https://reviews.apache.org/r/1688/#comment3863>

    this error path leaks streams



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
<https://reviews.apache.org/r/1688/#comment3864>

    while we're at it, we can add a good check here that lastTxId == INVALID_TXID || op.txid
== lastTxid + 1 - log an ERROR and break the loop if that's hit, I think?



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3865>

    I don't think this logic makes sense - in the case of a journal manager going away and
coming back inside the BN or SBN, you may want to finalize old segments while writing to a
new one. That is to say, this should be independent of currentInProgress.



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3866>

    warning message should include the storage dir



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3867>

    don't you also need to break, here? Otherwise you will improperly log all larger logs
as gaps



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3868>

    what's the purpose of this helper function?



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3869>

    is this ever called?



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
<https://reviews.apache.org/r/1688/#comment3870>

    Add javadoc that this mutates the structure - it's not clear from the name of the function.



hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
<https://reviews.apache.org/r/1688/#comment3871>

    please use American spelling to match the rest of Hadoop



hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
<https://reviews.apache.org/r/1688/#comment3872>

    What would happen if the txid file hadn't been updated? Can we add a test for the case
where we have an out-of-date seen_txid file, but we do have new logs? i.e:
    
    fsimage_0
    edits_1-10
    edits_inprogress_11 (which 5 txns)
    seen_txid = 0
    
    Do we properly read all 15 txns? Do we have enough safeguards that we won't overwrite
edits or do anything overlapping?
    
    In the current implementation (ie pre-patch), it would see the newer logs and know that
it should attempt to read them. In this implementation, I think it might overwrite one of
the segments.
    
    Please add a test case for this.
    
    
    


- Todd


On 2011-08-31 19:04:11, Todd Lipcon wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1688/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-31 19:04:11)
bq.  
bq.  
bq.  Review request for Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Posting patch on review-board for easy review
bq.  
bq.  
bq.  This addresses bug HDFS-2018.
bq.      http://issues.apache.org/jira/browse/HDFS-2018
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupImage.java
d72509c 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/BackupJournalManager.java
3cac667 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/Checkpointer.java
f754100 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogBackupInputStream.java
8921bc0 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogFileInputStream.java
532b2f2 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/EditLogInputStream.java
52a3dd4 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
495c42e 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
db98569 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
8b25901 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImagePreTransactionalStorageInspector.java
cec2eef 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageStorageInspector.java
65bfa0a 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java
0814a14 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FileJournalManager.java
991d7f5 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/JournalManager.java
d62aaa7 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSRollback.java
511e9c1 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/FSImageTestUtil.java
4a8edb8 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestCheckPointForSecurityTokens.java
4f698c0 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java
9e55946 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestEditLogFileOutputStream.java
a673c5f 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSEditLogLoader.java
a53a0bf 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFSImageStorageInspector.java
113dcbc 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestFileJournalManager.java
748caf4 
bq.    hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameEditsConfigs.java
31a7777 
bq.  
bq.  Diff: https://reviews.apache.org/r/1688/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Todd
bq.  
bq.



> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, hdfs-2018-otherapi.txt,
hdfs-2018.txt
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager
and the code for input streams is in the inspectors. This change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deals with URIs when referring to edit logs
>   - Recovery of inprogress logs is performed by counting the number of transactions instead
of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message