hadoop-common-user mailing list archives

From Andy Sautins <andy.saut...@returnpath.net>
Subject Map/Reduce and sequence file metadata...
Date Thu, 01 Oct 2009 16:10:53 GMT

   Hi all. I'm struggling a bit to figure this out and am wondering if anyone has any pointers.

   I'm using SequenceFiles as output from a MapReduce job (using SequenceFileOutputFormat)
and then reading the results in a follow-up MapReduce job using SequenceFileInputFormat.
That all works fine.  What I haven't figured out is how to write the SequenceFile.Metadata
in the SequenceFileOutputFormat and then read that metadata in SequenceFileInputFormat.  Is
that possible using the new mapreduce.* API?

   I have two types of files I want to process in the Mapper.  Currently I'm calling context.getInputSplit()
and parsing the resulting fileSplit.getPath() to determine which file I'm processing.  It seems
cleaner to use the SequenceFile.Metadata if I can.  Does that make sense or am I off in the
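
For reference, the path-parsing workaround described above might look roughly like the sketch below. It is plain Java with no Hadoop dependency, so it covers only the classification step. The class name SplitKind, the enum values, and the "type-a" / "type-b" directory names are hypothetical; the post does not say how the two kinds of files are actually named or laid out.

```java
public class SplitKind {
    /** The two hypothetical input types; the original post does not name them. */
    public enum Kind { TYPE_A, TYPE_B, UNKNOWN }

    /**
     * Classifies a split by looking for a marker directory in its path.
     * Assumed convention (not from the post): the first job writes its two
     * outputs under directories named "type-a" and "type-b".
     */
    public static Kind classify(String splitPath) {
        if (splitPath == null) {
            return Kind.UNKNOWN;
        }
        // Compare whole path components so that a name such as
        // "type-a-backup" is not mistaken for "type-a".
        for (String part : splitPath.split("/")) {
            if (part.equals("type-a")) {
                return Kind.TYPE_A;
            }
            if (part.equals("type-b")) {
                return Kind.TYPE_B;
            }
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        // Example paths are made up for illustration.
        System.out.println(classify("hdfs://namenode/jobs/out/type-a/part-r-00000"));
        System.out.println(classify("hdfs://namenode/jobs/out/type-b/part-r-00003"));
        System.out.println(classify("hdfs://namenode/jobs/other/part-r-00000"));
    }
}
```

In a Mapper this would presumably be called once, in setup(), with the string form of the Path returned by fileSplit.getPath(), and the result kept in a field so that map() can branch on it.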


