hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3954) OEV can generate a corrupt edits log
Date Wed, 19 Sep 2012 14:20:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13458713#comment-13458713 ]

Daryn Sharp commented on HDFS-3954:

bq. I don't think it makes sense to write out the edit log in the latest format, and slap
the version number of an older format on it-- which is what you seem to be suggesting in your
JIRA description.

Apologies if I was unclear.  Converting from binary -> xml preserves the older version
in the xml, but converting from xml -> binary will ignore the xml's version and always
write the latest layout version in the binary.  The two directions behave inconsistently,
and I thought they should match.
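
To make the asymmetry concrete, here is a minimal toy model, not the actual OEV code.  It assumes the binary log starts with a 4-byte layout version followed by opaque op bytes; the class name, the method names and the value -40 are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class VersionAsymmetrySketch {
    // Hypothetical value standing in for "the latest layout version".
    static final int LATEST_LAYOUT_VERSION = -40;

    /** Toy stand-in for the xml form: it remembers the version it was read with. */
    static final class XmlEdits {
        final int version;
        final byte[] ops;

        XmlEdits(int version, byte[] ops) {
            this.version = version;
            this.ops = ops;
        }
    }

    /** Toy binary log: a 4-byte version header followed by opaque op bytes. */
    static byte[] writeBinary(int headerVersion, byte[] ops) {
        return ByteBuffer.allocate(4 + ops.length).putInt(headerVersion).put(ops).array();
    }

    /** Reads the version back out of the first four bytes. */
    static int headerVersion(byte[] log) {
        return ByteBuffer.wrap(log).getInt();
    }

    /** binary -> xml: the input's version is carried over. */
    static XmlEdits binaryToXml(byte[] log) {
        return new XmlEdits(headerVersion(log), Arrays.copyOfRange(log, 4, log.length));
    }

    /** xml -> binary as described above: xml.version is ignored. */
    static byte[] xmlToBinary(XmlEdits xml) {
        return writeBinary(LATEST_LAYOUT_VERSION, xml.ops);
    }
}
```

In this model, a log written with an older version keeps that version through binaryToXml, but comes back from xmlToBinary with the latest version in its header and the old op bytes behind it.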

bq. If you want to edit log in a particular format, the easiest thing to do is probably to
keep around a version of Hadoop that corresponds to that format and run oev from there [...]
We could try to support writing old-style edit logs in the new code, but I think that this
would bitrot extremely quickly

Agreed.  I didn't think the process looked very compatible (although {{OfflineEditsVisitor#start(version)}}
implies the tools were designed to be compatible) but I was just going for consistency w/o
introducing an "incompatibility".  I'm perfectly fine with the tool throwing an exception
if the input log does not have the current layout version.  Are you ok with that?
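
A rough sketch of the check I have in mind, assuming the binary writer is told the input's layout version before it writes anything.  The class, the method and the value -40 are hypothetical, not existing OEV code.

```java
final class LayoutVersionGuard {
    // Hypothetical value; the real tool would use its own current layout version.
    static final int CURRENT_LAYOUT_VERSION = -40;

    /** Rejects any input whose layout version is not the current one. */
    static void requireCurrentLayout(int inputVersion) {
        if (inputVersion != CURRENT_LAYOUT_VERSION) {
            throw new IllegalArgumentException(
                "Input edits log has layout version " + inputVersion
                + " but only " + CURRENT_LAYOUT_VERSION + " can be written");
        }
    }
}
```

Failing up front like this means the tool never emits a file whose header and contents disagree.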

bq. What problem in particular did you encounter?

I found this while tinkering with a layout change.  A test failed in {{TestOfflineEditsViewer}}
that verifies a binary -> xml -> binary conversion produces an identical binary edits
log.  The files were identical sans the version, but the new file was "corrupt".
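
For illustration, this is roughly the kind of comparison that shows "identical sans the version".  It is a sketch that assumes the header is a single 4-byte int at the start of the file; the class and method names are made up and are not part of {{TestOfflineEditsViewer}}.

```java
import java.util.Arrays;

final class RoundTripCompare {
    // Assumption: the version header is one 4-byte int at the start of the log.
    static final int HEADER_BYTES = 4;

    /** Index of the first differing byte, or -1 if the arrays are identical. */
    static int firstDifference(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : n;
    }

    /** True when the two logs match everywhere after the version header. */
    static boolean sameAfterHeader(byte[] a, byte[] b) {
        if (a.length < HEADER_BYTES || b.length < HEADER_BYTES) {
            return false;
        }
        return Arrays.equals(
            Arrays.copyOfRange(a, HEADER_BYTES, a.length),
            Arrays.copyOfRange(b, HEADER_BYTES, b.length));
    }
}
```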

> OEV can generate a corrupt edits log
> ------------------------------------
>                 Key: HDFS-3954
>                 URL: https://issues.apache.org/jira/browse/HDFS-3954
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node, tools
>    Affects Versions: 2.0.0-alpha, 3.0.0
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>
> The offline edits viewer can output in various formats such as xml and binary.  Xml output
> correctly preserves the input's layout version.  Binary output always writes the latest layout
> version into the header.  Converting an older layout to xml and back may result in a corrupt
> edits log.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
