hadoop-mapreduce-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: JobClient using deprecated JobConf
Date Fri, 24 Sep 2010 15:29:43 GMT
On 09/24/2010 11:12 AM, Martin Becker wrote:
> Hi James,
> I am trying to avoid calling any command-line command. I want to submit
> a job from within a Java application, if possible without packing any
> jar file at all. But I guess that will be necessary to allow Hadoop to
> load the specific classes. The tutorial definitely does not contain any
> explicit Java code showing how to do this. Sorry for not stating my
> problem clearly:
> Right now I want to use Eclipse to submit my job using the "Run
> as..." dialog. Later I want to embed that part in a Java application
> submitting configured jobs to a remote Hadoop system/cluster.
> Regards,
> Martin

This is very do-able.  (I do this now.)

Here is a skeleton for how it can be done:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class JobSubmitter implements Tool {
	private Configuration appConf;

	public static void main(String[] args) throws Exception {
		ToolRunner.run(new Configuration(), new JobSubmitter(), args);
	}

	public JobSubmitter() {
		<your code here>
	}

	public Configuration getConf() {
		return appConf;
	}

	public void setConf(Configuration conf) {
		this.appConf = conf;
	}

	public int run(String[] args) throws Exception {
		Job job = new Job(appConf);
		Configuration jobConf = job.getConfiguration();
		jobConf.set(<your code here>);
		<your code here>
	}
}
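For what might go inside run(), here is a hedged sketch of a typical job setup. The mapper/reducer class names and the input/output paths are hypothetical stand-ins for your own classes and arguments, not anything from the skeleton above:

```java
// Needed imports (all from Hadoop):
//   import org.apache.hadoop.fs.Path;
//   import org.apache.hadoop.io.IntWritable;
//   import org.apache.hadoop.io.Text;
//   import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
//   import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public int run(String[] args) throws Exception {
	Job job = new Job(getConf(), "my-job");
	// Tells Hadoop which jar contains the job classes, by example class:
	job.setJarByClass(JobSubmitter.class);
	// WordCountMapper/WordCountReducer are hypothetical; substitute your own:
	job.setMapperClass(WordCountMapper.class);
	job.setReducerClass(WordCountReducer.class);
	job.setOutputKeyClass(Text.class);
	job.setOutputValueClass(IntWritable.class);
	// Input/output paths taken from the (already parsed) remaining args:
	FileInputFormat.addInputPath(job, new Path(args[0]));
	FileOutputFormat.setOutputPath(job, new Path(args[1]));
	// Block until the job finishes; 0 = success, 1 = failure:
	return job.waitForCompletion(true) ? 0 : 1;
}
```

The waitForCompletion(true) call streams job progress to the client; pass false if you only want the final status.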

re: "without packing any jar file at all":

If you use Tool/ToolRunner (as we are doing above), your Hadoop app 
automatically handles some key command line args.  One of them that you 
will use here is the -libjars argument.  If you pass -libjars with a 
comma-separated list of jars containing your code, ToolRunner will 
automatically put those jars in the Distributed Cache on each task node, 
where they get added to the classpath of every map/reduce task.
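Since the goal is to submit from within a Java application rather than from a shell, note that this works programmatically too: ToolRunner runs the argument array through its generic-options parsing before your run() method sees it, so you can build the array in code. The jar paths and HDFS paths below are hypothetical:

```java
// Hypothetical jar and HDFS paths, for illustration only.
String[] jobArgs = new String[] {
		"-libjars", "/path/to/job-classes.jar,/path/to/dependency.jar",
		"/user/martin/input", "/user/martin/output"
};
// ToolRunner consumes the generic options (-libjars, -D, -conf, ...) and
// hands only the remaining args ("/user/martin/input", "/user/martin/output")
// to JobSubmitter.run():
int exitCode = ToolRunner.run(new Configuration(), new JobSubmitter(), jobArgs);
```

This is the same mechanism the command line uses, so nothing extra is needed to make the embedded submission path work.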


