hadoop-mapreduce-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Re: Big split file to Partitioner
Date Sun, 22 Aug 2010 17:47:15 GMT
It is a good idea to ask what the meaning of the split is. Typically a split
is one per line, but I have written splits
which return the entire file for a small file - say an XML document

Combiners are special and add the outputs of splits - these are used only
occasionally, when the output is combinable - usually summable
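
To make the combine-then-partition idea concrete, here is a rough sketch in
plain Java (no Hadoop classes are used). The class name WordCountSketch and
both method names are made up for illustration. The partition formula is the
one I understand Hadoop's default HashPartitioner to use; treat that as an
assumption and check it against your version.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: simulates a combiner and a hash
// partitioner with plain collections instead of the Hadoop API.
public class WordCountSketch {

    // Combiner step: sum the per-word counts emitted for one split,
    // so fewer (word, count) pairs have to travel to the reducers.
    public static Map<String, Integer> combine(List<String> wordsFromOneSplit) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : wordsFromOneSplit) {
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    // Partition step: pick a reducer from the key's hash (assumed to match
    // the default HashPartitioner). The mask keeps the value non-negative.
    public static int partition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        List<String> split = List.of("the", "cat", "the", "hat");
        Map<String, Integer> combined = combine(split);
        for (Map.Entry<String, Integer> e : combined.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue()
                    + " -> reducer " + partition(e.getKey(), 2));
        }
    }
}
```

With 2 reduces, every occurrence of a given word lands on the same reducer,
which is why each output file holds a disjoint set of words rather than a
piece of the input file.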

On Sat, Aug 21, 2010 at 10:29 AM, Pedro Costa <psdc1978@gmail.com> wrote:

> Hi,
> 1 - I'm running the wordcount example with one input file of size
> 50 MB and with 2 reduces defined. At the end of the execution of the
> wordcount, each of the 2 reduces deals with one part of the input file. For
> example, one reduce gets one part of the split file, and the other
> reduce gets the other part, producing two output files. Why is the split
> file content divided? Is this the case where the partitioner
> concept enters?
> 2 - Does it also mean that 2 map outputs are produced, where each map
> output is sent to each reducer?
> Thanks
> --

Steven M. Lewis PhD
Institute for Systems Biology
Seattle WA
