hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1984) HDFS-1073: Enable multiple checkpointers to run simultaneously
Date Mon, 23 May 2011 22:39:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038288#comment-13038288 ]

Todd Lipcon commented on HDFS-1984:

Currently this test scenario fails after a few seconds with an exception like:

11/05/23 15:25:46 WARN mortbay.log: /getimage: java.io.IOException: GetImage failed. java.io.IOException:
Namenode has an edit log corresponding to txid 1240 but new checkpoint was created using editlog
ending at txid 1238. Checkpoint Aborted.
        at org.apache.hadoop.hdfs.server.namenode.FSImage.validateCheckpointUpload(FSImage.java:894)
        at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:107)
        at org.apache.hadoop.hdfs.server.namenode.GetImageServlet$1.run(GetImageServlet.java:80)

but this validation is bogus. So long as no two checkpointers try to upload a checkpoint at
the same txid, it's OK if they upload "old" fsimages.
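The relaxed rule can be sketched as follows. This is a hypothetical illustration (the class and method names are not actual FSImage code): accept an upload at any txid, old or new, and reject only a second upload at the same txid.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Hypothetical sketch of the relaxed acceptance rule: uploading an "old"
 * fsimage is fine; only two uploads at the *same* txid collide.
 */
public class CheckpointRegistry {
    private final Set<Long> uploadedTxIds = ConcurrentHashMap.newKeySet();

    /** Returns true if an upload at this txid may proceed. */
    public boolean tryAcceptUpload(long imageTxId) {
        // Set.add() is atomic here: the first uploader at a given txid
        // wins, and a concurrent second uploader at that txid is refused.
        return uploadedTxIds.add(imageTxId);
    }

    public static void main(String[] args) {
        CheckpointRegistry r = new CheckpointRegistry();
        System.out.println(r.tryAcceptUpload(1238)); // true: first at 1238
        System.out.println(r.tryAcceptUpload(1240)); // true: different txid is fine
        System.out.println(r.tryAcceptUpload(1238)); // false: duplicate txid
    }
}
```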

To fix this, I think we need to do the following:

a) Repurpose the "checkpointTxId" field of FSImage. This currently tracks the last txid at
which the NN has either saved or uploaded a checkpoint. We use it to advertise which image
file a checkpointer should download, but we also use it to validate the checkpoint upload.
Instead, it should be renamed to "mostRecentImageTxId" and only be used to advertise the image.

b) Remove the "imageDigest" field. Validation is now handled by a ".md5" file stored next
to each image. When a checkpointer downloads an image, the image transfer servlet can read
the .md5 file and include the hash as an HTTP header. The checkpointer can then verify that
the image transferred correctly by comparing the downloaded file against that md5 hash. When
uploading the new checkpoint back to the NN, the same process is used in reverse.
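The receiving side of that check can be sketched as below. This is a hedged illustration, not the actual transfer code; the class and method names are invented, and only the general shape (hash the received file, compare to the digest advertised in the header) is taken from the proposal above.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

/**
 * Hypothetical sketch: verify a transferred fsimage against the MD5 hex
 * digest the serving side sent as an HTTP header.
 */
public class ImageTransferCheck {

    /** Compute the MD5 of a file as a lowercase hex string. */
    public static String md5Hex(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (byte b : md.digest()) {
            sb.append(String.format("%02x", b & 0xff)); // mask for unsigned hex
        }
        return sb.toString();
    }

    /** True if the local digest matches the advertised header value. */
    public static boolean transferIntact(Path downloaded, String headerMd5)
            throws Exception {
        return md5Hex(downloaded).equalsIgnoreCase(headerMd5);
    }
}
```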

The new validation rules for accepting a checkpoint upload should be:
- the namespace/clusterid/etc match up (same as today)
- the transaction ID of the uploaded image is less than the current transaction ID of the
namespace (sanity check)
- the hash of the file received matches the hash that the 2NN communicates in an HTTP header
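Taken together, the three rules above could look roughly like this. The class, its constructor, and the parameter names are hypothetical, purely for illustration; the actual checks would live in the NN's image-upload path.

```java
/**
 * Hypothetical sketch of the proposed acceptance checks for a checkpoint
 * upload; throws on the first rule that fails.
 */
public class UploadValidator {
    private final String namespaceId; // expected namespace/cluster identity
    private final long currentTxId;   // NN's current namespace txid

    public UploadValidator(String namespaceId, long currentTxId) {
        this.namespaceId = namespaceId;
        this.currentTxId = currentTxId;
    }

    public void validate(String uploadNamespaceId, long imageTxId,
                         String localMd5, String advertisedMd5) {
        // Rule 1: namespace/clusterid/etc must match (same as today).
        if (!namespaceId.equals(uploadNamespaceId)) {
            throw new IllegalStateException("namespace/clusterid mismatch");
        }
        // Rule 2: sanity check, the image must be behind the namespace.
        if (imageTxId >= currentTxId) {
            throw new IllegalStateException("image txid " + imageTxId
                + " is not less than namespace txid " + currentTxId);
        }
        // Rule 3: received bytes must match the digest the 2NN advertised.
        if (!localMd5.equalsIgnoreCase(advertisedMd5)) {
            throw new IllegalStateException("md5 mismatch on upload");
        }
    }
}
```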

> HDFS-1073: Enable multiple checkpointers to run simultaneously
> --------------------------------------------------------------
>                 Key: HDFS-1984
>                 URL: https://issues.apache.org/jira/browse/HDFS-1984
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
> One of the motivations of HDFS-1073 is that it decouples the checkpoint process so that
> multiple checkpoints could be taken at the same time and not interfere with each other.
> Currently on the 1073 branch this doesn't quite work right, since we have some state
> and validation in FSImage that's tied to a single fsimage_N -- thus if two 2NNs perform a
> checkpoint at different transaction IDs, only one will succeed.
> As a stress test, we can run two 2NNs, each configured with fs.checkpoint.interval
> set to "0", which causes them to continuously checkpoint as fast as they can.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
