hadoop-common-dev mailing list archives

From Luke Weber <l...@tuenti.com>
Subject Questions about LayoutVersions.java
Date Fri, 26 Aug 2011 15:39:07 GMT
Hey,

I was curious whether anyone had proposed a good solution, or a future
JIRA, for the LayoutVersion/FSEditLog and opcode differences between
versions, so that upgrades and downgrades are better supported in the
future.

I noticed in https://issues.apache.org/jira/browse/HDFS-1842 that this
seems to have already caused problems in at least a few cases. I also
saw other people complaining about the security branch vs. later
releases, and then the really big discussion in
https://issues.apache.org/jira/browse/HDFS-1822. That discussion seemed
to focus on the validity and truth of trunk, which I'll try to leave
alone. As far as I can tell, different versions will be rolled.
Ideally it's all in trunk, but what if it isn't? Yes, burning them is
one option, but what if some normal person just wants to downgrade for
some reason and doesn't have their rollback copy?

Has someone thought of something like a mapper/reader that reads a log
in one defined format and then maps the relevant fields to the current
version? It would probably involve defining the schema (op_code +
fields) for both the version you are converting from and the version
you are converting to, then mapping the relevant fields and discarding
the others, with a warning where that matters, e.g. dropping security
settings due to a downgrade. Obviously LayoutVersion.supports comes
closer to this, but things are starting to look fragile.
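
To make the mapper/reader idea a bit more concrete, here is a minimal
sketch of what I have in mind. Everything in it is hypothetical: the
EditLogSchemaMapper, OpSchema and Converted classes do not exist in
Hadoop, and the field names listed for OP_MKDIR are only illustrative.
It assumes a record has already been read into a name -> value map by
some version-specific reader.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types exist in Hadoop.
final class EditLogSchemaMapper {

    /** Ordered field names per opcode for one layout version. */
    static final class OpSchema {
        final int layoutVersion;
        final Map<String, List<String>> fieldsByOp;

        OpSchema(int layoutVersion, Map<String, List<String>> fieldsByOp) {
            this.layoutVersion = layoutVersion;
            this.fieldsByOp = fieldsByOp;
        }
    }

    /** One converted record: the mapped fields plus any warnings. */
    static final class Converted {
        final Map<String, Object> fields = new LinkedHashMap<>();
        final List<String> warnings = new ArrayList<>();
    }

    /** Maps a record read under schema 'from' onto the fields of schema 'to'. */
    static Converted convert(String op, Map<String, Object> record,
                             OpSchema from, OpSchema to) {
        List<String> targetFields = to.fieldsByOp.get(op);
        if (targetFields == null) {
            throw new IllegalArgumentException("op " + op
                + " is not defined in layout version " + to.layoutVersion);
        }
        Converted out = new Converted();
        for (String name : targetFields) {
            // Fields the source version never wrote come through as null.
            out.fields.put(name, record.get(name));
        }
        for (String name : record.keySet()) {
            if (!targetFields.contains(name)) {
                // The target version has no place for this field: warn and drop.
                out.warnings.add("dropping '" + name + "' of " + op
                    + " when converting " + from.layoutVersion
                    + " -> " + to.layoutVersion);
            }
        }
        return out;
    }
}
```

With a -11 schema that lists path, timestamp and permissions for
OP_MKDIR, and a -10 schema that lists only path and timestamp,
converting a -11 record to -10 would keep the first two fields and emit
one warning about the dropped permissions. Going the other way would
fill permissions with null and emit no warning.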

Then checks like this one might go away from FSEditLogOp.java?
      if (logVersion <= -11) {
        this.permissions = PermissionStatus.read(in);
      } else {
        this.permissions = null;
      }
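
As a rough illustration of how a check like that could move into one
declared table, here is a small sketch. The VersionedFields class, the
ALWAYS marker and the field table are all made up for this example and
are not Hadoop code; the only thing taken from the snippet above is
that permissions are read when logVersion <= -11 (layout versions are
negative and decrease as features are added).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: this class and its table are not part of Hadoop.
final class VersionedFields {

    // Assumed marker for fields present in every layout version. Layout
    // versions are negative, so "logVersion <= 0" always holds.
    static final int ALWAYS = 0;

    /** Illustrative table: field name -> layout version that introduced it. */
    static Map<String, Integer> mkdirFields() {
        Map<String, Integer> table = new LinkedHashMap<>();
        table.put("path", ALWAYS);
        table.put("timestamp", ALWAYS);
        table.put("permissions", -11); // mirrors the "logVersion <= -11" check
        return table;
    }

    /** Returns, in declaration order, the fields a log of this version contains. */
    static List<String> fieldsFor(Map<String, Integer> introducedIn, int logVersion) {
        List<String> present = new ArrayList<>();
        for (Map.Entry<String, Integer> e : introducedIn.entrySet()) {
            if (logVersion <= e.getValue()) {
                present.add(e.getKey());
            }
        }
        return present;
    }
}
```

A reader could then loop over fieldsFor(...) instead of carrying one
if/else per field, and the same table could double as the documented
schema for that release.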

I'm not an expert on the ins and outs, and it's probably a big
refactor, but I was curious whether anyone had thought about this. It
might ease the conversion process in general, and specific upgrades
would be supported because they are defined in one place, as opposed
to rejected by additional patches that force them to be disallowed. If
the schema were documented in all releases, then conversion would be
simplified, or so the idea goes.

Any takers, or has this idea already been kicked around? If for
nothing else, people learning about Hadoop have to reformat their
data when they're just testing different versions, which is a shame
for me :)

Luke
