hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1801) Remove use of timestamps to identify checkpoints and logs
Date Thu, 07 Apr 2011 03:38:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016671#comment-13016671 ]

Todd Lipcon commented on HDFS-1801:
-----------------------------------

bq. Would a txid ever be available in this backup context?

Yes, the BackupNode will have some concept of lastAppliedTxId, but I am putting off BackupNode
work until later in the branch. It will be easier to come back to it once the primary NN is
in good shape, I think.

bq. Why you do Preconditions sometimes and assert elsewhere? I'd think you'd do one or the
other?

So far I've been trying to use the following policy:
- if the code is performance-sensitive, use assert
- if the assertion is nearly guaranteed to be true by virtue of other logic in the same function,
use assert
- if the condition should be true, but only by virtue of some higher-order construct (e.g. the
open/closed state of the edit log), use Preconditions.
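
As a rough illustration of this policy (not code from the patch), the sketch below uses a made-up class EditLogSketch. The nested Preconditions class is a minimal stand-in that mirrors Guava's Preconditions.checkState so the example has no dependencies; the field and method names are assumptions chosen for the example.

```java
public class EditLogSketch {
    // Stand-in for Guava's Preconditions.checkState (assumption: real code would import Guava).
    static final class Preconditions {
        static void checkState(boolean expression, Object errorMessage) {
            if (!expression) {
                throw new IllegalStateException(String.valueOf(errorMessage));
            }
        }
    }

    private boolean open = false;   // hypothetical open/closed state of the log
    private long lastTxId = 0;      // hypothetical counter of the last transaction ID written

    void open() { open = true; }

    void close() { open = false; }

    long logEdit() {
        // The open/closed state is maintained by callers elsewhere (a higher-order
        // construct), so it is always checked, even when assertions are disabled.
        Preconditions.checkState(open, "edit log must be open to log an edit");

        long txId = lastTxId + 1;
        // True by virtue of the line directly above; a plain assert is enough and
        // costs nothing unless the JVM runs with -ea.
        assert txId > lastTxId : "transaction IDs must increase";

        lastTxId = txId;
        return txId;
    }
}
```

Calling logEdit() before open() throws IllegalStateException regardless of JVM flags, whereas the assert only fires when assertions are enabled.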

I don't know if this makes any good sense, or if I've been very consistent. I will try to
swing through and clean up.

bq. Will we need tests where we check migration, where we prove the new code can open the
old-style fsimages and edit files?

Yes, good idea. I will open another subtask to make sure we cover this.

bq. The txid is a long as was the ts so Serialization doesn't change; the number of bytes
doesn't change, just their interpretation? Is it possible that we'll deserialize an old style
NameNodeRegistration and interpret a ts as a txid?

We're safe since the NNRegistration is only used via RPC, and we'll increment the RPC version
in the branch. That way an old node won't be able to register to a new one, and we don't have
to worry about the changed writable.
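
The idea can be sketched as follows. This is a simplified, self-contained illustration, not Hadoop's RPC code: the class names, the version constants, and the register method are all hypothetical. It only shows why bumping a version number keeps an old-style timestamp from ever being read as a txid.

```java
public class VersionCheckSketch {
    // Hypothetical protocol versions: the old one carried a timestamp, the new one a txid.
    static final long OLD_VERSION = 1L;
    static final long NEW_VERSION = 2L;

    // Hypothetical exception raised when the two sides disagree on the version.
    static final class VersionMismatchException extends RuntimeException {
        VersionMismatchException(long clientVersion, long serverVersion) {
            super("client version " + clientVersion
                    + " does not match server version " + serverVersion);
        }
    }

    static final class Server {
        private final long version;

        Server(long version) {
            this.version = version;
        }

        // The 8-byte payload looks identical on the wire in both versions, so the
        // version check is the only thing that tells the server how to interpret it.
        long register(long clientVersion, long checkpointTxId) {
            if (clientVersion != version) {
                throw new VersionMismatchException(clientVersion, version);
            }
            return checkpointTxId;
        }
    }
}
```

With this check in place, an old node is rejected at registration, before its timestamp field could be deserialized and treated as a transaction ID.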

> Remove use of timestamps to identify checkpoints and logs
> ---------------------------------------------------------
>
>                 Key: HDFS-1801
>                 URL: https://issues.apache.org/jira/browse/HDFS-1801
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-1801.txt
>
>
> Currently, the NameNode validates checkpoint uploads by using timestamps associated with
> checkpoints and edit logs. However, now that we have transaction IDs that uniquely identify
> each point in time in the history of a namespace, it is more robust to simply use transaction
> IDs to identify images and edits.
> This JIRA is to remove the use of editsTime and checkpointTime and replace it with:
> * {{lastCheckpointTxId}} - the highest transaction ID reflected in the most recently
> saved fsimage file
> * {{lastLogRollTxId}} - the highest transaction ID in {{edits}} when {{rollFsImage}}
> was called by the checkpointing node.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
