hadoop-common-user mailing list archives

From "Bryan A. Pendleton" ...@geekdom.net>
Subject Re: Indexed SequenceFile
Date Fri, 28 Jul 2006 21:56:21 GMT
It's in the code, called "MapFile".

You can generate them automatically as output from a mapreduce job by using
"MapFileOutputFormat" instead of "SequenceFileOutputFormat".

Make sure you only create a MapFile when the data being written is already
sorted by key, of course. MapFile assumes that its entries are appended in
sorted key order.
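
To illustrate why the sorted-order assumption matters, here is a minimal
sketch of the idea described in the quoted message below: a data file plus a
small index holding the positions of a subset of the keys. This is not
Hadoop's MapFile code. The class name SparseIndexSketch, the in-memory lists
standing in for the data and index files, and the "interval" parameter are all
assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not Hadoop's MapFile implementation. */
final class SparseIndexSketch {
    // These two lists stand in for the sorted data file.
    private final List<String> keys = new ArrayList<>();
    private final List<String> values = new ArrayList<>();
    // These two lists stand in for the small index file:
    // every interval-th key and its position in the data.
    private final List<String> indexKeys = new ArrayList<>();
    private final List<Integer> indexPositions = new ArrayList<>();
    private final int interval; // hypothetical name: index one key out of this many

    SparseIndexSketch(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.interval = interval;
    }

    /** Appends an entry; in this sketch keys must be strictly increasing. */
    void append(String key, String value) {
        if (!keys.isEmpty() && key.compareTo(keys.get(keys.size() - 1)) <= 0) {
            throw new IllegalArgumentException("key out of order: " + key);
        }
        if (keys.size() % interval == 0) {
            // The index is built as the sorted data is written.
            indexKeys.add(key);
            indexPositions.add(keys.size());
        }
        keys.add(key);
        values.add(value);
    }

    /** Returns the value for key, or null if the key is absent. */
    String get(String key) {
        // Binary search the index for the last indexed key <= key.
        int lo = 0;
        int hi = indexKeys.size() - 1;
        int slot = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys.get(mid).compareTo(key) <= 0) {
                slot = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (slot < 0) {
            return null; // key sorts before everything in the data
        }
        // Scan forward from the indexed position. Stopping at the first
        // larger key is only correct because the data is sorted.
        for (int i = indexPositions.get(slot); i < keys.size(); i++) {
            int cmp = keys.get(i).compareTo(key);
            if (cmp == 0) {
                return values.get(i);
            }
            if (cmp > 0) {
                break;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SparseIndexSketch file = new SparseIndexSketch(2);
        file.append("apple", "1");
        file.append("banana", "2");
        file.append("cherry", "3");
        System.out.println(file.get("banana"));
        System.out.println(file.get("blueberry"));
    }
}
```

Both the binary search over the index and the early exit from the scan rely
on the keys being in order, which is why the sketch rejects an out-of-order
append instead of silently producing a file that lookups cannot trust.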

On 7/28/06, Benjamin Reed <breed@yahoo-inc.com> wrote:
> I have heard a rumor about the existence of an indexed SequenceFile that is
> basically a normal SequenceFile with an associated small index file with
> list of offsets to a subset of the keys in the SequenceFile. The index is
> created as a sorted SequenceFile is written.
> We have need of an IndexedSequenceFile so any pointers would be helpful.
> ben

Bryan A. P. Pendleton
Ph: (877) geek-1-bp
