hadoop-hdfs-issues mailing list archives

From "Jitendra Nath Pandey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2018) 1073: Move all journal stream management code into one place
Date Wed, 10 Aug 2011 02:37:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082105#comment-13082105 ]

Jitendra Nath Pandey commented on HDFS-2018:
--------------------------------------------

A few comments:
  getFirstTxnId and getLastTxnId seem a bit redundant in the EditLogInputStream interface.
The first txnid must be the last one read plus one, and the last txnid can be derived using
getNumberOfTransactions.

  The same applies to the constructor of EditLogFileInputStream. Passing the transaction ids
in might lead to inconsistent use of EditLogFileInputStream, for example if the file contents
don't match the transaction ids being passed.
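
To make the redundancy argument concrete, here is a minimal sketch of the arithmetic. The class and method names are hypothetical and are not part of the patch; it assumes a segment holds a contiguous run of transaction ids.

```java
// Hypothetical helper, not part of the HDFS-2018 patch.
// Assumes transaction ids within a segment are contiguous.
class TxnRangeSketch {
    /** First id a reader expects next, given the last id it has already read. */
    static long expectedFirstTxId(long lastReadTxId) {
        return lastReadTxId + 1;
    }

    /** Last id in a segment that starts at firstTxId and holds count transactions. */
    static long lastTxId(long firstTxId, long count) {
        if (count <= 0) {
            throw new IllegalArgumentException("segment holds no transactions");
        }
        return firstTxId + count - 1;
    }
}
```

If both values can be derived this way, storing them separately (or passing them to a constructor) only adds a way for them to disagree with the file contents.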

> what's the story on exclusive locking of journal managers after this? Do we still assume
> that they'll be locked by FSImage? Or should we push down locking to the journal managers
> as well?
 I think JournalManagers should not be assumed to be locked by FSImage. FSEditLog should manage
the synchronization for JournalManagers where they interact with the namesystem.
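
As a rough illustration of that arrangement, the sketch below has the edit log own the lock so the journals need no locking of their own. All names here are hypothetical stand-ins, not the real FSEditLog or JournalManager APIs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a journal manager; not the real interface.
interface JournalSketch {
    void record(long txId);
}

// Hypothetical stand-in for the edit log, which owns the synchronization.
class EditLogSketch {
    private final Object lock = new Object();
    private final List<JournalSketch> journals = new ArrayList<>();
    private long lastTxId = 0;

    void addJournal(JournalSketch journal) {
        synchronized (lock) {
            journals.add(journal);
        }
    }

    /** Assigns the next transaction id and hands it to every journal under one lock. */
    long logEdit() {
        synchronized (lock) {
            long txId = ++lastTxId;
            for (JournalSketch journal : journals) {
                journal.record(txId);
            }
            return txId;
        }
    }
}
```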

> not sure I follow why getNumberOfTransactions throws a CorruptionException if you've
> asked for a transaction ID in the middle of a gap. Isn't it more consistent to just
> return 0? ie that journal just wasn't being written to for that txid?
 Is it intended to distinguish between the case where there is a gap and the case where the
journal doesn't have any more transactions? It may be useful to distinguish between the two,
but I agree that CorruptionException is not very convincing, because the journal is not really
corrupt; it can still serve valid ranges of transactions.
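
One way to keep the distinction without an exception would be to return a status alongside the count. The sketch below is only an illustration under that assumption; the class, enum, and method names are hypothetical and do not come from the patch.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a journal as a set of contiguous txid ranges.
class JournalRangesSketch {
    enum Status { AVAILABLE, GAP, END_OF_JOURNAL }

    static final class Result {
        final Status status;
        final long count;

        Result(Status status, long count) {
            this.status = status;
            this.count = count;
        }
    }

    // Maps the first txid of each segment to its last txid.
    private final TreeMap<Long, Long> segments = new TreeMap<>();

    void addSegment(long firstTxId, long lastTxId) {
        segments.put(firstTxId, lastTxId);
    }

    /** Reports how many transactions can be served starting at fromTxId, and why if none. */
    Result transactionsFrom(long fromTxId) {
        Map.Entry<Long, Long> segment = segments.floorEntry(fromTxId);
        if (segment != null && fromTxId <= segment.getValue()) {
            return new Result(Status.AVAILABLE, segment.getValue() - fromTxId + 1);
        }
        // Not inside any segment: a later segment means a gap, otherwise the journal has ended.
        if (segments.higherKey(fromTxId) != null) {
            return new Result(Status.GAP, 0);
        }
        return new Result(Status.END_OF_JOURNAL, 0);
    }
}
```

With segments 1-100 and 201-300, a request for txid 150 would report GAP and a request for txid 301 would report END_OF_JOURNAL, both with a count of 0.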

> 1073: Move all journal stream management code into one place
> ------------------------------------------------------------
>
>                 Key: HDFS-2018
>                 URL: https://issues.apache.org/jira/browse/HDFS-2018
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Ivan Kelly
>            Assignee: Ivan Kelly
>             Fix For: 0.23.0
>
>         Attachments: HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff, HDFS-2018.diff,
>                      HDFS-2018.diff, HDFS-2018.diff
>
>
> Currently in the HDFS-1073 branch, the code for creating output streams is in FileJournalManager
> and the code for input streams is in the inspectors. This change does a number of things.
>   - Input and Output streams are now created by the JournalManager.
>   - FSImageStorageInspectors now deal with URIs when referring to edit logs.
>   - Recovery of inprogress logs is performed by counting the number of transactions instead
>     of looking at the length of the file.
> The patch for this applies on top of the HDFS-1073 branch + HDFS-2003 patch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
