hadoop-mapreduce-user mailing list archives

From Aaron Kimball <akimbal...@gmail.com>
Subject Re: Need 0.20.2 new API documentation/examples, where are they?
Date Wed, 06 Apr 2011 20:27:13 GMT
Simplest answer:

Job A uses o.a.h.mapreduce.lib.output.SequenceFileOutputFormat.
It writes keys and values of classes KT and VT to that format, using
context.write().
Use o.a.h.mapreduce.lib.output.FileOutputFormat.setOutputPath(job, new
Path("job-a-out")); to configure the job to write to some location.

Then run job.waitForCompletion(true);
If the job succeeds (the above returns true), then run Job B:

Job jobB = new Job();
Job B uses o.a.h.mapreduce.lib.input.SequenceFileInputFormat, with
FileInputFormat.addInputPath(jobB, new Path("job-a-out")); // Job A's out
is Job B's in.

Job B's mapper will then receive (K, V) arguments with classes KT and VT
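
For reference, the steps above could be put together in a single driver
roughly as follows. This is only a sketch under some assumptions: the class
name ChainedJobs, the helper methods, and the paths "job-a-in" and
"job-b-out" are made up for illustration, and both jobs fall back on the
default identity Mapper and Reducer, so with the default TextInputFormat on
Job A, KT ends up as LongWritable and VT as Text. Real jobs would set their
own mapper and reducer classes and call setJarByClass(). The Job constructor
used here is the 0.20-era form; later Hadoop releases deprecate it in favor
of Job.getInstance().

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical driver: runs Job A, then Job B on Job A's output.
public class ChainedJobs {

  // Job A: reads text (default TextInputFormat), writes a SequenceFile.
  static Job configureJobA(Configuration conf, Path in, Path out)
      throws IOException {
    Job jobA = new Job(conf, "job-a");
    // Identity map/reduce by default, so output types match the
    // TextInputFormat record types: KT = LongWritable, VT = Text.
    jobA.setOutputKeyClass(LongWritable.class);
    jobA.setOutputValueClass(Text.class);
    jobA.setOutputFormatClass(SequenceFileOutputFormat.class);
    FileInputFormat.addInputPath(jobA, in);
    FileOutputFormat.setOutputPath(jobA, out);
    return jobA;
  }

  // Job B: reads Job A's SequenceFile, so its mapper sees (KT, VT) pairs.
  static Job configureJobB(Configuration conf, Path in, Path out)
      throws IOException {
    Job jobB = new Job(conf, "job-b");
    jobB.setInputFormatClass(SequenceFileInputFormat.class);
    jobB.setOutputKeyClass(LongWritable.class);
    jobB.setOutputValueClass(Text.class);
    FileInputFormat.addInputPath(jobB, in); // Job A's out is Job B's in.
    FileOutputFormat.setOutputPath(jobB, out);
    return jobB;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path intermediate = new Path("job-a-out");

    Job jobA = configureJobA(conf, new Path("job-a-in"), intermediate);
    // Only start Job B if Job A succeeded.
    if (!jobA.waitForCompletion(true)) {
      System.exit(1);
    }

    Job jobB = configureJobB(conf, intermediate, new Path("job-b-out"));
    System.exit(jobB.waitForCompletion(true) ? 0 : 1);
  }
}
```

Keeping the intermediate path in one variable makes it harder for the two
jobs to drift apart on where Job A's output lives.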

Hope this helps...
- Aaron

On Thu, Mar 31, 2011 at 12:11 AM, Amareshwari Sri Ramadasu <
amarsri@yahoo-inc.com> wrote:

> John,
> Examples and libraries have been rewritten to use the new API in branch
> 0.21. You can have a look at them.
> The new API in branch 0.20 is not stable yet, and the old API is
> undeprecated in branch 0.21, so you can still use the old API.
> Thanks
> Amareshwari
> On 3/30/11 11:38 PM, "John Therrell" <jtherrell@gmail.com> wrote:
> I'm looking to get acquainted with the new API in 0.20.2 but all the online
> documentation I've found uses the old API.
> I need to understand how to chain two mapreduce jobs together efficiently
> that must run sequentially. I'd like to use the SequenceFileOutputFormat -->
> SequenceFileInputFormat configuration between my two MapReduce jobs.
> I would be so grateful for any help or links to relevant
> documentation/examples.
> Thanks,
> John
