hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Job Controller for MapReduce task assignment
Date Mon, 10 Sep 2012 14:12:02 GMT
Hi John,

The key to triggering the reduce() function is the completion of the
map process. So for example, if you wanted to run your MR program for
10 maps alone (10 maps being the checkpoint), wrap the emitting
portion with a configurable value such that the map exits after 10

Something like:

if (!checkPointReached) { context.write(K, V); }
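
To make the check above a little more concrete, here is a minimal sketch in plain Java, with no Hadoop dependency so it runs on its own. It assumes the checkpoint means "the first N map() calls of a single map task"; that is only one reading of the idea. The names Emitter, CheckpointedMapper and CheckpointDemo are hypothetical, and Emitter is just a stand-in for the write() method of Hadoop's Mapper.Context. In a real Mapper the limit would more likely be read once in setup() from the job Configuration (for example via getInt() with a property name of your choosing), and the counter would be per map task, since each task gets its own Mapper instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Hadoop's Mapper.Context, reduced to write().
interface Emitter<K, V> {
    void write(K key, V value);
}

// Sketch of a mapper that stops emitting once a configurable checkpoint is hit.
class CheckpointedMapper {
    private final int checkpoint; // number of map() calls allowed to emit (assumed meaning)
    private int calls = 0;        // map() calls seen so far by this instance

    CheckpointedMapper(int checkpoint) {
        this.checkpoint = checkpoint;
    }

    void map(String key, String value, Emitter<String, String> context) {
        boolean checkPointReached = calls >= checkpoint;
        calls++;
        // Same guard as in the message: emit only before the checkpoint.
        if (!checkPointReached) {
            context.write(key, value);
        }
    }
}

class CheckpointDemo {
    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        CheckpointedMapper mapper = new CheckpointedMapper(10);
        // Feed 25 records; only those before the checkpoint are collected.
        for (int i = 0; i < 25; i++) {
            mapper.map("k" + i, "v" + i, (k, v) -> emitted.add(k + "=" + v));
        }
        System.out.println(emitted.size() + " records emitted");
    }
}
```

With a checkpoint of 10 and 25 input records, only the first 10 reach the output; the remaining map() calls still run but emit nothing, so the map phase completes normally and the reduce phase sees just the checkpointed data.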

Of course, if you are looking to change something on the framework
side itself, I'd recommend looking at alternative approaches such as
Apache S4, Nathan Marz's Storm, or if you insist on using MR for
this, try taking a look at the design of HOP
(http://code.google.com/p/hop/). Are those more suited for what you're
trying to do?

On Mon, Sep 10, 2012 at 10:50 AM, John Cuffney <cuffneyj@gmail.com> wrote:
> Hey,
> That's very helpful, thank you.  I guess to be more clear about what I'm
> doing, I want to have a simulation that runs through the mapping portion of
> the MR, stops, sets a checkpoint, then runs the reduce portion of the MR.
> So I guess the issue is finding a point in between the Map and Reduce phase.
> Thanks,
> John

Harsh J
