hadoop-common-user mailing list archives

From Sanel Zukan <san...@gmail.com>
Subject Re: Re: how to query JobTracker
Date Thu, 17 Jun 2010 16:07:15 GMT
JobClient can connect directly to the job tracker address (see the
JobClient constructor that takes an InetSocketAddress parameter).
After that, getAllJobs() will return the known jobs, and you will be
able to find your job id there.
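A minimal sketch of that, assuming the 0.20-era mapred API; the job tracker host and port below are placeholders you would replace with your own:

```java
import java.net.InetSocketAddress;

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;

public class ListJobs {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf();
        // Placeholder address of the job tracker.
        InetSocketAddress jt = new InetSocketAddress("jobtracker.example.com", 8021);
        JobClient client = new JobClient(jt, conf);

        // getAllJobs() returns a JobStatus for every job the tracker knows about.
        for (JobStatus status : client.getAllJobs()) {
            boolean running = status.getRunState() == JobStatus.RUNNING;
            System.out.println(status.getJobID() + " running=" + running);
        }
    }
}
```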

I would go with a solution similar to the proposed one: write a lock
file containing the job id, and on the second job start, fetch the
currently running jobs, find my id, check whether it is still running,
and decide what to do next.

PS:
I'm not sure you will be able to construct a custom job id from the client side ;)
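The lock-file idea from the quoted message could be sketched like this, again assuming the 0.20 mapred API; the lock path is the one from the thread and would need adjusting to your namenode, and error handling is omitted:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;

public class SingleInstanceDriver {
    // Path taken from the thread; adjust the authority to your namenode.
    private static final Path LOCK = new Path("hdfs://myapp/myjob.lock");

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SingleInstanceDriver.class);
        FileSystem fs = LOCK.getFileSystem(conf);
        JobClient client = new JobClient(conf);

        // If a lock file exists, read the previous job id and kill that
        // job if it has not finished yet.
        if (fs.exists(LOCK)) {
            BufferedReader in = new BufferedReader(new InputStreamReader(fs.open(LOCK)));
            String previousId = in.readLine();
            in.close();
            RunningJob previous = client.getJob(JobID.forName(previousId));
            if (previous != null && !previous.isComplete()) {
                previous.killJob();
            }
        }

        // Submit the new job and record its id in the lock file.
        RunningJob job = client.submitJob(conf);
        FSDataOutputStream out = fs.create(LOCK, true);
        out.writeBytes(job.getID().toString() + "\n");
        out.close();

        // Wait for completion, then remove the lock.
        job.waitForCompletion();
        fs.delete(LOCK, true);
        System.exit(job.isSuccessful() ? 0 : 1);
    }
}
```

Note this is not race-free: if two cron triggers fire close together, both may pass the exists() check before either writes the lock.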


On Thu, Jun 17, 2010 at 5:12 PM, Some Body <somebody@squareplanet.de> wrote:
> Thanks Sanel,
>
> Assuming my driver class would always use a "custom" job ID like
>    "MyCustomJob"  instead of the default  "job_<YYYYMMDDHHMM>_<nnnn>",
>    e.g. job_201006171232_0004, how would I then query for the jobID?
>
> Seems like it might just be easier to have my driver class
> submit the job, write the jobid to a lock file (hdfs://myapp/myjob.lock),  and then
>  a. remove the lock file when the job finishes, or
>  b. if a new job is triggered before the first finishes, read the jobid from the
>     lock file, kill the previous job, and start the new one
>
> Alan
>
>
> ----- original message --------
>
> Subject: Re: how to query JobTracker
> Sent: Thu, 17 Jun 2010
> From: Sanel Zukan<sanelz@gmail.com>
>
>> AFAIK, there is no such method (to get a job name from the client side) :(
>> (at least I wasn't able to find it). The job name can be extracted
>> from a JobProfile given the id, but only the JobTracker can access it
>> (if you try to instantiate one, you will start your own job tracker).
>>
>> The only solution is to query things directly via the job id received
>> when the job was started.
>>
>> On Thu, Jun 17, 2010 at 2:53 PM, Some Body <somebody@squareplanet.de>
>> wrote:
>> > Hi All,
>> >
>> > What are the steps to query the cluster for running jobs with a particular
>> > JobName?
>> > My driver class always submits my job with a preset name.
>> >    Job job = new Job(config, "My Job Name");
>> >    ......
>> >    return job.waitForCompletion(true) ? 0 : 1;
>> >
>> > I want to set up a cron to trigger the job submission and I want to ensure
>> > only 1 instance of my job is running.
>> > Surely I could do this via a shell wrapper, but I'd rather implement it in
>> > my driver class.
>> > i.e. getAllJobs from the JobTracker, check for "My Job Name", and kill the
>> > old job before submitting a new job.
>> >
>> > I'm using (cloudera's) hadoop 0.20.2+228
>> >
>> > Thanks,
>> > Alan
>> >
>>
>
> --- original message end ----
>
>
