hadoop-general mailing list archives

From web service <wbs...@gmail.com>
Subject Re: running hadoop jobs from within a program
Date Fri, 12 Nov 2010 16:55:11 GMT
Thanks, but is submitting three different jobs, say using

JobClient.submitJob(jobconf1);
JobClient.submitJob(jobconf2);
JobClient.submitJob(jobconf3);

different from running the following?
tmp="$HADOOP_BIN jar $JAR_LOC $MAIN_CLASS /user/joe/input/input-1/ /user/vadmin/output/output-1/"
tmp="$HADOOP_BIN jar $JAR_LOC $MAIN_CLASS /user/joe/input/input-2/ /user/vadmin/output/output-2/"
tmp="$HADOOP_BIN jar $JAR_LOC $MAIN_CLASS /user/joe/input/input-3/ /user/vadmin/output/output-3/"

I guess every job can have its own JVM options, and I hope that every
submitted job runs in a separate JVM, no?
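To make the comparison concrete, here is a minimal sketch of the second approach driven from Java instead of bash. It only assumes that each call to the hadoop script starts its own client JVM, whereas three submitJob calls made from one program share that program's JVM. HadoopJobLauncher, buildCommand and runAndWait are hypothetical names, and "hadoop", "myjob.jar" and "MyMain" are placeholders for $HADOOP_BIN, $JAR_LOC and $MAIN_CLASS.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: runs `hadoop jar` once per input
// directory, so every submission gets its own client JVM.
class HadoopJobLauncher {

    // Builds the argument list for one `hadoop jar` invocation.
    static List<String> buildCommand(String hadoopBin, String jarLoc,
                                     String mainClass, int i) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopBin);
        cmd.add("jar");
        cmd.add(jarLoc);
        cmd.add(mainClass);
        cmd.add("/user/joe/input/input-" + i + "/");
        cmd.add("/user/vadmin/output/output-" + i + "/");
        return cmd;
    }

    // Starts the command as a child process and blocks until it exits.
    // Returns the exit code of the child process.
    static int runAndWait(List<String> cmd)
            throws IOException, InterruptedException {
        Process p = new ProcessBuilder(cmd).inheritIO().start();
        return p.waitFor();
    }

    public static void main(String[] args) {
        // Only prints the three command lines; call runAndWait to execute them.
        for (int i = 1; i <= 3; i++) {
            System.out.println(String.join(" ",
                    buildCommand("hadoop", "myjob.jar", "MyMain", i)));
        }
    }
}
```

Calling runAndWait on each list in turn would run the jobs one after another. Per-job JVM options would still have to be set in each job's configuration.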

On Fri, Nov 12, 2010 at 12:55 AM, daniel sikar <dsikar@gmail.com> wrote:

> I suggest you write a loop in your bash script, grepping for finished,
> then take it from there.
> Also, you can submit the same job as many times as you like.
>
> On 12 November 2010 02:17, web service <wbsrvc@gmail.com> wrote:
> > Hi,
> >  Currently I run my sample hadoop job from a bash script using the
> > following command ...
> >
> > [code]
> > tmp="$HADOOP_BIN jar $JAR_LOC $MAIN_CLASS /user/joe/input/input-$i/ /user/vadmin/output/output-$i/"
> > $tmp
> > [/code]
> >
> > However, I would want to write a timer that would do some cleanup after the
> > jobs are complete and restart the jobs after x hours. What I am looking for
> > is the ability to invoke a job from within a program rather than through the
> > jar command.
> >
> > -Mac
> >
>
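For the timer described in the original question, one possible shape is sketched below using only java.util.concurrent. It is an illustration under assumptions, not a tested Hadoop setup: JobCycleTimer, cycle and start are hypothetical names, and runJobs and cleanup stand for whatever code submits the jobs and tidies up afterwards.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: runs the jobs, cleans up, waits, then starts over.
class JobCycleTimer {

    // One cycle: run the jobs, then always run the cleanup step.
    // A failure in runJobs is logged so that later cycles still happen.
    static Runnable cycle(Runnable runJobs, Runnable cleanup) {
        return () -> {
            try {
                runJobs.run();
            } catch (RuntimeException e) {
                System.err.println("job run failed: " + e);
            } finally {
                cleanup.run();
            }
        };
    }

    // Repeats the cycle, waiting `delay` units after each cycle finishes.
    // The caller should shut the returned executor down when done.
    static ScheduledExecutorService start(Runnable runJobs, Runnable cleanup,
                                          long delay, TimeUnit unit) {
        ScheduledExecutorService timer =
                Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(cycle(runJobs, cleanup), 0, delay, unit);
        return timer;
    }

    public static void main(String[] args) {
        // Runs a single cycle directly so the demo terminates immediately.
        cycle(() -> System.out.println("running jobs"),
              () -> System.out.println("cleaning up")).run();
    }
}
```

With "x hours" as the delay, the call would look like JobCycleTimer.start(runJobs, cleanup, x, TimeUnit.HOURS). A fixed delay is used rather than a fixed rate so that a long job run never overlaps with the next restart.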
