hadoop-common-user mailing list archives

From Ted Dunning <tdunn...@veoh.com>
Subject Re: single output file
Date Tue, 15 Jan 2008 21:57:33 GMT

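This is happening because you have many reducers running, only one of which
gets any data.  Every record has the same key, so they all go to the same
reducer, and the other reducers write empty output files.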

Since you have a combiner, this probably isn't a performance problem.  That
one reducer should only get about as many records as you have map tasks.  It
would be a problem if your reducer were getting lots of input records.
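
As a rough illustration of why the combiner keeps the reducer's input small, here is a minimal plain-Java sketch (no Hadoop dependency). The class and method names are hypothetical, and it assumes the idealized case where the combiner runs exactly once per map task; in a real job it may run more or fewer times.

```java
import java.util.ArrayList;
import java.util.List;

public class CombinerSketch {
    // Hypothetical stand-in for one map task plus its combiner: every value
    // shares the constant key "Total", so the combiner collapses the task's
    // output into a single partial sum.
    static long combine(long[] mapOutputValues) {
        long sum = 0;
        for (long v : mapOutputValues) {
            sum += v;
        }
        return sum;
    }

    // Records that reach the one busy reducer: one partial sum per map task.
    static List<Long> reducerInput(long[][] splits) {
        List<Long> partials = new ArrayList<>();
        for (long[] split : splits) {
            partials.add(combine(split));
        }
        return partials;
    }

    // The reducer applies the same summing logic to the partial sums.
    static long reduce(List<Long> partials) {
        long total = 0;
        for (long p : partials) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        // Three simulated input splits, i.e. three map tasks.
        long[][] splits = { {1, 2, 3}, {10, 20}, {100} };
        List<Long> partials = reducerInput(splits);
        System.out.println("records reaching the reducer: " + partials.size());
        System.out.println("total: " + reduce(partials));
    }
}
```

Six input values across three simulated map tasks reach the reducer as three records, which is why a single reducer is cheap for this job.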

You can avoid the empty files by setting the number of reducers to 1.
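
To make the effect of the reducer count concrete, here is a small plain-Java sketch (no Hadoop dependency) that mimics how the default hash partitioner assigns keys to reducers. The class and method names are hypothetical, and String.hashCode() stands in for Text.hashCode(), so the particular reducer chosen may differ from a real job. In the old org.apache.hadoop.mapred API the actual setting is JobConf.setNumReduceTasks(1); check the documentation for the Hadoop version you are running.

```java
public class PartitionSketch {
    // Same formula as Hadoop's default HashPartitioner. String.hashCode() is
    // an assumption standing in for Text.hashCode().
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Count the reducers that receive no records and would therefore
    // write an empty output file.
    static int emptyOutputs(String[] keys, int numReducers) {
        boolean[] used = new boolean[numReducers];
        for (String key : keys) {
            used[partitionFor(key, numReducers)] = true;
        }
        int empty = 0;
        for (boolean u : used) {
            if (!u) {
                empty++;
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        // Every map output record carries the same constant key.
        String[] keys = { "Total", "Total", "Total" };
        System.out.println("empty outputs with 8 reducers: " + emptyOutputs(keys, 8));
        System.out.println("empty outputs with 1 reducer:  " + emptyOutputs(keys, 1));
    }
}
```

With a single constant key, all but one of the reducers stay idle no matter how many are configured, and with one reducer there is nothing left to be empty.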

On 1/15/08 1:17 PM, "Vadim Zaliva" <krokodil@gmail.com> wrote:

> Hi!
> I have a novice question. I have data consisting of (Text, Long)
> tuples, and I need to calculate the sum of the values. The way I am
> achieving it now is by mapping every Text key to a constant value,
> Text("Total"), and using LongSumReducer as both Combiner and Reducer.
> It seems to be working, except that I get many 0-byte output files and
> just one non-empty file with the actual result. Is there a way to
> avoid creating these empty files? Thanks!
> Vadim
