hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1800) Extend image checksumming to function with multiple fsimage files
Date Fri, 01 Apr 2011 21:11:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014824#comment-13014824 ]

Todd Lipcon commented on HDFS-1800:
-----------------------------------

I can think of three different ways to do this:

1) Next to each fsimage_N file, we also write a file fsimage_N.md5sum which contains the checksum.

2) In the image directory, we write out a single file (e.g. "md5sums") that has the format:
{code}
<md5sum> fsimage_A
<md5sum> fsimage_B
...
{code}

This would be usable with {{md5sum -c}}, for example, to verify all the files in the directory.
Whenever a new entry needs to be added, the file would have to be read and rewritten.

3) We incorporate the md5sum into the fsimage format itself as a tag at the end of the file.
I don't like this option much, because there's no way for ops to verify it, but I'm including
it for completeness.
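
To make options 1 and 2 concrete, here is a rough sketch in Python (not the NameNode's Java, and not code from the HDFS-1073 branch). The function names, the temp-file-plus-rename step, and the choice to write the sidecar in the same "<md5sum>  <name>" line format as the manifest are all assumptions made for illustration.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sidecar(image_path):
    """Option 1 (sketch): write <image>.md5sum next to the image file."""
    sidecar = image_path + ".md5sum"
    name = os.path.basename(image_path)
    with open(sidecar, "w") as f:
        # Assumed format: the two-space layout that `md5sum -c` reads.
        f.write("%s  %s\n" % (md5_of(image_path), name))
    return sidecar


def read_manifest(manifest):
    """Parse '<md5sum>  <name>' lines into a {name: checksum} dict."""
    entries = {}
    if os.path.exists(manifest):
        with open(manifest) as f:
            for line in f:
                checksum, name = line.rstrip("\n").split("  ", 1)
                entries[name] = checksum
    return entries


def add_to_manifest(image_path, manifest_name="md5sums"):
    """Option 2 (sketch): read the shared manifest, add one entry, rewrite it."""
    manifest = os.path.join(os.path.dirname(image_path), manifest_name)
    entries = read_manifest(manifest)
    entries[os.path.basename(image_path)] = md5_of(image_path)
    # Write to a temp file and rename so readers never see a partial file
    # (an assumption; the comment above only says "read and re-written").
    tmp = manifest + ".tmp"
    with open(tmp, "w") as f:
        for name in sorted(entries):
            f.write("%s  %s\n" % (entries[name], name))
    os.replace(tmp, manifest)
    return manifest


def verify_manifest(manifest):
    """Return the names of files whose current MD5 differs from the manifest."""
    directory = os.path.dirname(manifest)
    return [
        name
        for name, checksum in sorted(read_manifest(manifest).items())
        if md5_of(os.path.join(directory, name)) != checksum
    ]
```

The tradeoff shows up in the sketch: option 1 touches only the new image's own sidecar, while option 2 has to rewrite a file shared by every image in the directory each time a checkpoint is added.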


Any preference between option #1 and #2? Or a 4th option that I didn't consider?

> Extend image checksumming to function with multiple fsimage files
> -----------------------------------------------------------------
>
>                 Key: HDFS-1800
>                 URL: https://issues.apache.org/jira/browse/HDFS-1800
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>
> HDFS-903 added the MD5 checksum of the fsimage to the VERSION file in each image directory. This allows it to verify that the FSImage didn't get corrupted or accidentally replaced on disk.
> With HDFS-1073, there may be multiple fsimage_N files in a storage directory corresponding to different checkpoints. So having a single MD5 in the VERSION file won't suffice. Instead we need to store an MD5 per image file.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
