hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-hadoop Wiki] Update of "HadoopMapReduce" by TeppoKurki
Date Wed, 19 Apr 2006 05:03:54 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by TeppoKurki:

  When an individual !MapTask task starts it will open a new output
  writer per configured Reduce task. It will then proceed to read
  its !FileSplit using the !RecordReader it gets from the specified
- InputFormat. !InputFormat parses the input and generates
+ !InputFormat. !InputFormat parses the input and generates
  key-value pairs. It is not necessary for the !InputFormat to
  generate both "meaningful" keys and values. For example the
  default !TextInputFormat's output consists of input lines as
@@ -31, +31 @@

  passed to the configured Mapper. The user supplied Mapper does
  whatever it wants with the input pair and calls [http://lucene.apache.org/hadoop/docs/api/org/apache/hadoop/mapred/OutputCollector.html#collect(org.apache.hadoop.io.WritableComparable,%20org.apache.hadoop.io.Writable) OutputCollector.collect] with key-value pairs of its own choosing. The output it
  generates must use one key class and one value class, because
- the Map output will be eventually written into a SequenceFile,
+ the Map output will be eventually written into a !SequenceFile,
  which has per file type information and all the records must
  have the same type (use subclassing if you want to output
  different data structures). The Map input and output key-value
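
The single-type rule described in the hunk above can be illustrated with a minimal sketch. Everything in it is hypothetical and is not the Hadoop API: SingleTypeCollector is an invented stand-in for an output sink, and the exact-class comparison is an assumption about how such a per-file type check might behave.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the Hadoop API: a collector that accepts
// only one key class and one value class, fixed when it is created.
final class SingleTypeCollector {
    private final Class<?> keyClass;
    private final Class<?> valueClass;
    private final List<Object[]> records = new ArrayList<>();

    SingleTypeCollector(Class<?> keyClass, Class<?> valueClass) {
        this.keyClass = keyClass;
        this.valueClass = valueClass;
    }

    // Assumption: the check is an exact class match on both key and value.
    void collect(Object key, Object value) {
        if (key.getClass() != keyClass) {
            throw new IllegalArgumentException(
                "wrong key class: " + key.getClass().getName());
        }
        if (value.getClass() != valueClass) {
            throw new IllegalArgumentException(
                "wrong value class: " + value.getClass().getName());
        }
        records.add(new Object[] {key, value});
    }

    // Number of records accepted so far.
    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        SingleTypeCollector out =
            new SingleTypeCollector(String.class, Integer.class);
        out.collect("apple", 1);
        out.collect("pear", 2);
        try {
            out.collect("plum", "three"); // value is a String, not an Integer
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("records kept: " + out.size());
    }
}
```

Under these assumptions the mismatched third record is rejected and the first two are kept, which is the behaviour the text implies when it says all records must have the same type.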
