hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1521) Persist transaction ID on disk between NN restarts
Date Sat, 11 Dec 2010 03:23:01 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970371#action_12970371 ]

Todd Lipcon commented on HDFS-1521:

The reasons I chose to use the txid in FSEditLog rather than co-opt the generation stamp were:
- We already had a txid in FSEditLog, so it seemed strange to increment both it and the
generation stamp.
- Incrementing FSN's generation stamp would add another cyclic dependency from FSEditLog
back to the core of the NN, which we're trying to eliminate in other JIRAs.

As for deciding to put the txid in the header rather than on every record, I could go either
way. I went with just the header because doing it in every record adds 8 bytes per edit, which
would probably be 10% or so extra space overhead (likely causing 10% extra time spent loading
the image too). I didn't benchmark it, but without any particular benefit, it didn't seem
like it was worth the penalty.
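
To make the space estimate above concrete, here is a rough back-of-the-envelope sketch. The
80-byte average edit size is only what the "8 bytes is about 10%" figure implies; it is an
assumption, not a measurement, and the function names and the 100,000-edit file size are
hypothetical.

```python
TXID_BYTES = 8  # a txid stored as a 64-bit integer


def per_record_overhead(avg_edit_bytes, txid_bytes=TXID_BYTES):
    """Fractional size increase when every edit record carries its own txid."""
    return txid_bytes / avg_edit_bytes


def header_only_overhead(num_edits, avg_edit_bytes, txid_bytes=TXID_BYTES):
    """Fractional size increase when only the file header carries a txid."""
    return txid_bytes / (num_edits * avg_edit_bytes)


# Assumed average edit size of 80 bytes (implied by the ~10% estimate, not benchmarked).
print(per_record_overhead(80))            # fraction added by a txid on every record
print(header_only_overhead(100_000, 80))  # fraction added by a single header txid
```

Under these assumptions the per-record cost scales with the number of edits, while the
header-only cost becomes negligible for any log of realistic length.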

One compromise might be to periodically add a "sync" record which includes the current transaction
ID and perhaps some kind of magic number, kind of like what SequenceFile does. This would
be handy for repair processes or even for running MR jobs on edit logs some day. Thoughts?
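
A minimal sketch of the "sync" record idea, to show how a repair tool could resynchronize
on a damaged log. Everything here is hypothetical and is not the HDFS edit log format: the
marker bytes, the 100-edit interval, the length-prefixed record layout, and the use of a
length of -1 as an escape (loosely modeled on SequenceFile) are all assumptions made for
illustration. The txid stored in a sync record is taken to be the txid of the next edit.

```python
import io
import struct

SYNC_MAGIC = b"\xde\xad\xbe\xefEDITSYNC"  # hypothetical 12-byte marker
SYNC_ESCAPE = -1                           # a record length of -1 announces a sync record


def write_log(edits, first_txid, sync_interval=100):
    """Serialize edits as length-prefixed records, adding a sync record periodically."""
    out = io.BytesIO()
    out.write(struct.pack(">q", first_txid))  # header carries the starting txid
    txid = first_txid
    for i, payload in enumerate(edits):
        if i > 0 and i % sync_interval == 0:
            # Sync record: escape length, magic number, txid of the next edit.
            out.write(struct.pack(">i", SYNC_ESCAPE))
            out.write(SYNC_MAGIC)
            out.write(struct.pack(">q", txid))
        out.write(struct.pack(">i", len(payload)))
        out.write(payload)
        txid += 1
    return out.getvalue()


def find_sync_points(data):
    """Scan raw bytes for sync markers and return (marker_offset, txid) pairs."""
    points = []
    pos = data.find(SYNC_MAGIC)
    while pos != -1:
        end = pos + len(SYNC_MAGIC)
        if end + 8 <= len(data):
            (txid,) = struct.unpack_from(">q", data, end)
            points.append((pos, txid))
        pos = data.find(SYNC_MAGIC, end)
    return points


edits = [b"edit%d" % i for i in range(250)]
log = write_log(edits, first_txid=1000)
damaged = b"\x00" * 50 + log[50:]  # simulate a corrupted header and first records
print([txid for _, txid in find_sync_points(damaged)])
```

Because the scanner looks only for the marker bytes, it can recover known txids from the
middle of the file even when the header is unreadable, at a cost of 24 bytes per interval
rather than 8 bytes per edit.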

> Persist transaction ID on disk between NN restarts
> --------------------------------------------------
>                 Key: HDFS-1521
>                 URL: https://issues.apache.org/jira/browse/HDFS-1521
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: 0.22.0
>         Attachments: hdfs-1521.txt, hdfs-1521.txt
> For HDFS-1073 and other future work, we'd like to have the concept of a transaction ID
> that is persisted on disk with the image/edits. We already have this concept in the NameNode
> but it resets to 0 on restart. We can also use this txid to replace the _checkpointTime_ field,
> I believe.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
