hadoop-mapreduce-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Harsh J <ha...@cloudera.com>
Subject Re: Best practices for jobs with large Map output
Date Fri, 15 Apr 2011 16:34:07 GMT
Hello Shai,

On Fri, Apr 15, 2011 at 5:45 PM, Shai Erera <serera@gmail.com> wrote:
> The job is an indexing job. Each Mapper emits a small index and the Reducer
> merges all of those indexes together. The Mappers output the index as a
> Writable which serializes it. I guess I could write the Reducer's function
> as a separate class as you suggest, but then I'll need to write a custom
> OutputFormat that will put those indexes on HDFS or somewhere?

I was thinking of a simple Java program that works with HDFS there,
not a Map/Reduce one (although you can tweak Map-only jobs a bit to
run a single mapper alone, which can then go ahead and do the same).
Your Mapper can open one out-file, get a list of all previous job's
output files, and perform the merge reading them one by one. This
would bypass using a Reduce phase.

> That complicates matters for me -- currently, when this job is run as part
> of a sequence of jobs, I can guarantee that if the job succeeds, then the
> indexes are successfully merged, and if it fails, the job should be
> restarted. While that can be achieved with a separate FS-using program as
> you suggest, it complicates matters.

I agree that the suggestion could complicate your workflow a bit.
Although, it is doable by Map-only job as I mentioned right above
(which may make it a bit more acceptable?).

> Is my scenario that extreme? Would you say the common scenario for Hadoop
> are jobs that output tiny objects between Mappers and Reducers?
> Would this work much better if I work w/ several Reducers? I'm not sure it
> will because the problem lies, IMO, in Hadoop allocating large consecutive
> chunks of RAM in my case, instead of trying to either stream it or break it
> down to smaller chunks.

Large outputs are alright, but I wouldn't say they are alright for
simple merging since it would all go through the sort phase with about
twice the I/O ultimately. Using multiple reducers can help a bit if
you do not mind partitioned results at the end.

> Is there absolutely no way to bypass the shuffle + sort phases? I don't mind
> writing some classes if that's what it takes ...

Shuffle is an essential part of the Map to Reduce transition, it can't
be 'bypassed' since a Reducer has to fetch all map-outputs to begin
with. Sort/Group may be made dummy as you had done, but can't be
disabled altogether AFAIK. The latter has been bought up on the lists
before, if I remember right; but am not aware of an implementation
alongside that could do that (just begin reducing merely partitioned,
unsorted data).

Harsh J

View raw message