hadoop-hdfs-user mailing list archives

From "David Parks" <davidpark...@yahoo.com>
Subject RE: Tricks to upgrading Sequence Files?
Date Wed, 30 Jan 2013 02:17:18 GMT
I'll consider a patch to SequenceFile. If we could manually override the
key and value classes that are read from the sequence file headers, we'd
have a clean solution.

I don't like versioning my Model object because it's used by tens of other
classes, and I don't want to risk less-maintained classes continuing to use
an old version.

For the time being I just used two jobs. In the first, I gave the old Model
object the original class name, read it in, upgraded it, and wrote the new
version out under a different class name.

Then I renamed the classes again so that the new Model object used the
original name, read in the records written under the altered name, and
cloned them into the original name.

All in all it was only an hour's work, but having a cleaner process would be better.
I'll add the request to JIRA at a minimum.


-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com] 
Sent: Wednesday, January 30, 2013 2:32 AM
To: <user@hadoop.apache.org>
Subject: Re: Tricks to upgrading Sequence Files?

This is a pretty interesting question, but unfortunately there isn't a
built-in way in SequenceFile itself to handle this. However, your key/value
classes could perhaps be made to handle versioning: detecting whether what
they've read is in an older encoding and decoding it appropriately (while
handling the newer encoding separately, in the normal fashion).
This would be much better than going down the classloader hack paths I
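
As a rough illustration of the versioning idea, here is a minimal sketch
using only java.io, so it runs without Hadoop on the classpath. Everything
in it is an assumption made for illustration: the class name VersionedModel,
its fields, and in particular the premise that the old encoding began with a
non-negative int id, which lets a negative first int act as a version
marker. In a real job the class would implement
org.apache.hadoop.io.Writable, whose write and readFields methods have the
same signatures as the ones below.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical model class, not taken from the thread.
class VersionedModel {
    // Assumption: old records start with an id >= 0, so a negative
    // first int can only be a version marker written by the new code.
    static final int VERSION_2 = -2;

    int id;
    String name;
    long updatedAt; // hypothetical field added in the new format

    // Always writes the new, versioned encoding.
    public void write(DataOutput out) throws IOException {
        out.writeInt(VERSION_2);
        out.writeInt(id);
        out.writeUTF(name);
        out.writeLong(updatedAt);
    }

    // Reads either encoding, deciding by the first int in the record.
    public void readFields(DataInput in) throws IOException {
        int first = in.readInt();
        if (first >= 0) {
            // Old, unversioned record: the first int is the id itself.
            id = first;
            name = in.readUTF();
            updatedAt = 0L; // default for the field the old format lacks
        } else if (first == VERSION_2) {
            id = in.readInt();
            name = in.readUTF();
            updatedAt = in.readLong();
        } else {
            throw new IOException("Unknown model version marker: " + first);
        }
    }

    // Test helper: produces bytes in the assumed old layout (id, name).
    static byte[] encodeOldFormat(int id, String name) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(id);
            out.writeUTF(name);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Serializes a model with the new encoding.
    static byte[] encode(VersionedModel m) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            m.write(out);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deserializes a model from either encoding.
    static VersionedModel decode(byte[] bytes) {
        try {
            VersionedModel m = new VersionedModel();
            m.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return m;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Read an old-format record, then re-encode it in the new format.
        VersionedModel upgraded = decode(encodeOldFormat(7, "widget"));
        VersionedModel roundTrip = decode(encode(upgraded));
        System.out.println(roundTrip.id + " " + roundTrip.name + " " + roundTrip.updatedAt);
    }
}
```

With something along these lines, the class name stored in the sequence
file header would not need to change: an upgrade job could read old records
and write them back through the same class, and files holding either
encoding would stay readable. It only works if old and new records can be
told apart from the bytes themselves, which is the assumption flagged above.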

On Tue, Jan 29, 2013 at 1:11 PM, David Parks <davidparks21@yahoo.com> wrote:
> Anyone have any good tricks for upgrading a sequence file?
> We maintain a sequence file like a flat file DB and the primary object 
> in there changed in recent development.
> It's trivial to write a job to read in the sequence file, update the 
> object, and write it back out in the new format.
> But since sequence files read and write the key/value class I would 
> either need to rename the model object with a version number, or 
> change the header of each sequence file.
> Just wondering if there are any nice tricks to this.

Harsh J
