hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer
Date Fri, 30 Dec 2011 21:14:30 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177761#comment-13177761 ]

Todd Lipcon commented on HDFS-2709:
-----------------------------------

Aaron and I chatted offline about the above questions. We think the following
is the best route forward:
- Instead of adding a new constructor to ELFIS, add a new "seekToTxnId" method which FileJournalManager
can call after constructing it. (The reasoning is that this mirrors the standard Java file
classes, where positioning is a separate call made after construction, e.g. FileInputStream.skip()
or RandomAccessFile.seek().)
- In FSEditLogLoader, we decided that a custom exception makes the most sense, i.e.
wrap the {{readOp}} call in a {{try/catch}} which rethrows any failure wrapped in a new
{{EditLogInputException}}. The new exception would also have a getter reporting
how many txns were successfully applied prior to the error. This is similar to how InterruptedIOException
in the standard library reports partial progress through its bytesTransferred field.
- Regarding tests, the suggestion was to add some new test cases to {{TestFileJournalManager}}
to exercise the new code in {{selectInputStreams}}.
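
To make the second point concrete, here is a minimal, self-contained sketch of the
wrapping pattern. It is not the patch and does not use the real HDFS classes: the
{{OpReader}} interface, {{loadEdits}}, {{failingReader}} and the getter name
{{getNumEditsLoaded}} are all hypothetical stand-ins chosen for illustration. Only the
exception name {{EditLogInputException}} comes from the discussion above.

```java
import java.io.IOException;

public class EditLogErrorSketch {

    /**
     * Hypothetical exception: wraps a read failure and records how many
     * transactions were applied before it (the field and getter names are
     * assumptions, not the real HDFS API).
     */
    public static class EditLogInputException extends IOException {
        private final long numEditsLoaded;

        public EditLogInputException(String message, Throwable cause, long numEditsLoaded) {
            super(message, cause);
            this.numEditsLoaded = numEditsLoaded;
        }

        public long getNumEditsLoaded() {
            return numEditsLoaded;
        }
    }

    /** Stand-in for an edit log input stream; returns the next txn id, or -1 at end of stream. */
    public interface OpReader {
        long readOp() throws IOException;
    }

    /**
     * Reads ops until end of stream and returns how many were applied.
     * A failure in readOp() is rethrown as EditLogInputException carrying the
     * count of ops applied so far.
     */
    public static long loadEdits(OpReader in) throws EditLogInputException {
        long applied = 0;
        while (true) {
            long txId;
            try {
                txId = in.readOp();
            } catch (IOException e) {
                throw new EditLogInputException(
                    "Error reading op after " + applied + " applied txns", e, applied);
            }
            if (txId < 0) {
                return applied;
            }
            // Applying the op to the namespace would happen here (omitted in this sketch).
            applied++;
        }
    }

    /** Test helper: a reader that yields txn ids 1..goodOps and then fails. */
    public static OpReader failingReader(final int goodOps) {
        return new OpReader() {
            private long next = 1;

            @Override
            public long readOp() throws IOException {
                if (next > goodOps) {
                    throw new IOException("simulated corrupt op");
                }
                return next++;
            }
        };
    }
}
```

A caller such as the tailer could catch the exception, add {{getNumEditsLoaded()}} to the
txn id it started from, and pass the result to the proposed seekToTxnId on the next
iteration rather than replaying the file from the beginning.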
                
> HA: Appropriately handle error conditions in EditLogTailer
> ----------------------------------------------------------
>
>                 Key: HDFS-2709
>                 URL: https://issues.apache.org/jira/browse/HDFS-2709
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>            Priority: Critical
>         Attachments: HDFS-2709-HDFS-1623.patch, HDFS-2709-HDFS-1623.patch, HDFS-2709-HDFS-1623.patch
>
>
> Currently, if the edit log tailer experiences an error replaying edits in the middle of
> a file, it will go back to retrying from the beginning of the file on the next tailing iteration.
> This is incorrect since many of the edits will have already been replayed, and not all edits
> are idempotent.
> Instead, we either need to (a) support reading from the middle of a finalized file (i.e.
> skip those edits already applied), or (b) abort the standby if it hits an error while tailing.
> If (a) isn't simple, let's do (b) for now and come back to (a) later, since this is a rare
> circumstance and it is better to abort than to be incorrect.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
