hadoop-common-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: What's the best way to get to a single key?
Date Tue, 04 Mar 2008 20:53:17 GMT
Xavier Stevens wrote:
> Is there a way to do this when your input data is using SequenceFile
> compression?

Yes.  A MapFile is simply a directory containing two SequenceFiles named 
"data" and "index".  MapFileOutputFormat uses the same compression 
parameters as SequenceFileOutputFormat.  SequenceFileInputFormat 
recognizes MapFiles and reads the "data" file.  So you should be able to 
just switch from specifying SequenceFileOutputFormat to 
MapFileOutputFormat in your jobs and everything should work the same 
except you'll have index files that permit random access.
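
To illustrate why the extra "index" file permits random access, here is a
minimal sketch in plain Java. It is a toy in-memory model, not Hadoop's
implementation: the class ToyMapFile, its methods, and the interval
parameter are all hypothetical names invented for this example. It assumes
keys are appended in strictly increasing order (a simplification) and
records every Nth key in a sparse index, so a lookup can jump near the
right position and scan only a few entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the "data" plus "index" layout; hypothetical, not the Hadoop API. */
final class ToyMapFile {
    // Stands in for the "data" file: keys and values stored in sorted key order.
    private final List<String> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();
    // Stands in for the "index" file: every interval-th key mapped to its position.
    private final TreeMap<String, Integer> index = new TreeMap<>();
    private final int interval;

    ToyMapFile(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be at least 1");
        }
        this.interval = interval;
    }

    /** Appends an entry; keys must arrive in strictly increasing order. */
    void append(String key, String value) {
        if (!keys.isEmpty() && key.compareTo(keys.get(keys.size() - 1)) <= 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        if (keys.size() % interval == 0) {
            index.put(key, keys.size()); // sparse index entry
        }
        keys.add(key);
        values.add(value);
    }

    /** Returns the value for key, or null if the key is absent. */
    String get(String key) {
        // Find the last indexed key that is <= the requested key.
        Map.Entry<String, Integer> start = index.floorEntry(key);
        if (start == null) {
            return null; // smaller than every stored key
        }
        // Scan forward from the indexed position until we reach or pass the key.
        for (int i = start.getValue(); i < keys.size(); i++) {
            int cmp = keys.get(i).compareTo(key);
            if (cmp == 0) {
                return values.get(i);
            }
            if (cmp > 0) {
                break; // passed the spot where the key would be
            }
        }
        return null;
    }

    /** Number of entries in the sparse index. */
    int indexSize() {
        return index.size();
    }

    public static void main(String[] args) {
        ToyMapFile file = new ToyMapFile(4);
        for (int i = 0; i < 10; i++) {
            file.append(String.format("key%02d", i), "value" + i);
        }
        System.out.println(file.get("key07"));
        System.out.println(file.get("missing"));
    }
}
```

In a real job the equivalent lookup would go through Hadoop's own classes
(for example MapFile.Reader.get(key, value)) rather than anything like the
toy class above.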

