hadoop-common-user mailing list archives

From Steve Hoffman <ste...@goofy.net>
Subject hadoop 0.20 migration for JobClient?
Date Fri, 20 Aug 2010 21:39:49 GMT
I'm migrating some code from the Hadoop 0.18 APIs to the 0.20 APIs.
Moving from implementing the Mapper/Reducer interfaces in the mapred
package to extending the Mapper/Reducer classes in the mapreduce
package is pretty straightforward.

It appears that Job replaces JobClient/JobConf/etc. and you simply
call submit() to do a submit and return (similar to what
JobClient.submitJob() did).
However, after submit() is called on Job, a call to getJobID() returns
null.  This seems very wrong...  How do I know what the JobID is?

Originally, I thought it was a problem running it in a unit test with
a local job tracker, but the problem happens on a networked job
tracker.
Using the network job tracker, I can verify that the job does get
submitted.  However, it isn't clear from the documentation how you can
get access to Job state (did it succeed, how far has the mapper run?)
if you don't know what the jobID is.

Assuming I do get a jobID back (so I know what to ask the job
tracker for), what do I use?  Job?  It doesn't seem like I can recreate
a Job from a JobID in the same way I could look up a RunningJob from
JobClient via getJob(jobID).
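
For illustration, the workflow being asked about (submit, keep the job
ID, later look the job up by that ID and poll its state) might be
sketched roughly as below. This is a plain-Java stand-in and not Hadoop
code: Tracker, JobHandle, SimulatedJob and pollUntilDone are
hypothetical names invented for the sketch, and the job is simulated in
memory. Only the shape of the calls (a getJob(id) lookup that yields a
handle with progress and completion queries) follows what the message
describes for the old JobClient/RunningJob pair.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins only: none of these types are Hadoop classes.
class JobLookupSketch {

    /** Minimal view of a submitted job's state. */
    interface JobHandle {
        float mapProgress();    // fraction of map work done, 0.0 to 1.0
        boolean isComplete();
        boolean isSuccessful();
    }

    /** In-memory job that finishes after a fixed number of steps. */
    static class SimulatedJob implements JobHandle {
        private final int totalSteps;
        private int done = 0;

        SimulatedJob(int totalSteps) { this.totalSteps = totalSteps; }

        void advance() { if (done < totalSteps) done++; }

        public float mapProgress() { return (float) done / totalSteps; }
        public boolean isComplete() { return done >= totalSteps; }
        public boolean isSuccessful() { return isComplete(); } // the simulation never fails
    }

    /** Stand-in tracker: hands out an ID on submit and resolves IDs later. */
    static class Tracker {
        private final Map<String, SimulatedJob> jobs = new LinkedHashMap<>();
        private int nextId = 1;

        String submit(int totalSteps) {
            String id = "job_" + nextId++;
            jobs.put(id, new SimulatedJob(totalSteps));
            return id; // the caller keeps this to ask about the job later
        }

        JobHandle getJob(String id) { return jobs.get(id); } // null if unknown

        void tick() { jobs.values().forEach(SimulatedJob::advance); }
    }

    /** Looks the job up by ID and polls it at most maxPolls times. */
    static boolean pollUntilDone(Tracker tracker, String jobId, int maxPolls) {
        JobHandle job = tracker.getJob(jobId);
        if (job == null) {
            throw new IllegalArgumentException("unknown job ID: " + jobId);
        }
        for (int i = 0; i < maxPolls && !job.isComplete(); i++) {
            System.out.printf("%s map %.0f%%%n", jobId, job.mapProgress() * 100);
            tracker.tick(); // simulation only; a real client would sleep between polls
        }
        return job.isComplete() && job.isSuccessful();
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        String id = tracker.submit(3);
        System.out.println("finished ok: " + pollUntilDone(tracker, id, 10));
    }
}
```

The point of the sketch is that everything after submission hangs off
the ID: without it there is nothing to hand to the lookup, which is why
a null from getJobID() blocks the rest of the workflow.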

I'm using the Cloudera CDH2 distro, if that helps with "what version are
you using?" questions...

Thanks in advance for advice/help!

Steve
