hadoop-mapreduce-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: Preferred way to submit a job?
Date Thu, 12 Aug 2010 02:27:21 GMT
On 08/11/2010 08:08 PM, Aaron Kimball wrote:
> On Wed, Aug 11, 2010 at 3:13 PM, David Rosenstrauch <darose@darose.net> wrote:
>> What's the preferred way to submit a job these days?
>> org.apache.hadoop.mapreduce.Job.submit() ?  Or
>> org.apache.hadoop.mapred.JobClient.runJob()?  Or does it even matter? (i.e.,
>> is there any difference between them?)
> If you're using the old API (e.g., you're filling out o.a.h.mapred.JobConf,
> and implementing o.a.h.mapred.Mapper) then you use JobClient.runJob(). If
> you're using the new API (o.a.h.mapreduce.Job, o.a.h.mapreduce.Mapper), then
> you use Job.waitForCompletion().
> You can't mix'n'match; your job has to be entirely "old style" or entirely
> "new style." Some programs use one, some use the other.

OK, so I'm not insane then.  :-)  That's how I thought it worked.

>> On a related note, if there's actually no difference between the 2 methods,
>> would anybody have any idea what could make the "mapred.job.tracker" setting
>> on a job Configuration get ignored?  (I currently have it set to
>> "hdfs://<hadoop_job_tracker_host_name>:9001".)
> There's a reason that's being ignored :) That is not a jobtracker address.
> Assuming you've configured your namenode and your jobtracker on the same
> machine, then your fs.default.name should be hdfs://hdfs.host.name:port, and
> mapred.job.tracker should just be jt.host.name:port
> The port numbers in these two cases will be different.
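
As a rough illustration of the format difference described above, the sketch below uses only the JDK (no Hadoop classes) to tell an hdfs:// URI apart from a bare host:port pair. The class name, the method names, and the example.com host names are made up for this example, and the checks are a simplification, not how Hadoop itself parses these settings.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of Hadoop: distinguishes the two address
// styles discussed in this thread.
class AddressFormatCheck {

    // True for values shaped like hdfs://host:port (the style used for
    // fs.default.name in this thread). Other filesystem schemes are
    // deliberately not handled in this sketch.
    public static boolean isHdfsUri(String value) {
        try {
            URI uri = new URI(value);
            return "hdfs".equals(uri.getScheme())
                    && uri.getHost() != null
                    && uri.getPort() != -1;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // True for values shaped like host:port with no scheme (the style used
    // for mapred.job.tracker in this thread).
    public static boolean isBareHostPort(String value) {
        if (value.contains("://")) {
            return false; // a scheme prefix means this is a URI, not host:port
        }
        int colon = value.lastIndexOf(':');
        if (colon <= 0 || colon == value.length() - 1) {
            return false; // missing host or missing port
        }
        try {
            int port = Integer.parseInt(value.substring(colon + 1));
            return port > 0 && port <= 65535;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder host names and ports, chosen only for illustration.
        String fsDefaultName = "hdfs://namenode.example.com:9000";
        String jobTracker = "jobtracker.example.com:9001";
        String mistaken = "hdfs://jobtracker.example.com:9001";

        System.out.println("fs.default.name is an hdfs URI: " + isHdfsUri(fsDefaultName));
        System.out.println("mapred.job.tracker is host:port: " + isBareHostPort(jobTracker));
        System.out.println("hdfs:// value is host:port: " + isBareHostPort(mistaken));
    }
}
```

A check along these lines, run before submitting a job, would flag an hdfs:// prefix on the jobtracker setting up front rather than leaving the setting to be silently ignored.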

Hmmmm ... OK.  Not sure I understand why the syntax is different for 
those 2 settings, but I'll give that a shot and see if it fixes the 

Thanks much for the help!

