airavata-dev mailing list archives

From Amila Jayasekara <thejaka.am...@gmail.com>
Subject Re: Monitor jobs by JobName if JobId is null
Date Mon, 27 Apr 2015 02:05:45 GMT
On Sun, Apr 26, 2015 at 10:00 PM, Lahiru Ginnaliya Gamathige <
glahiru@gmail.com> wrote:

> I have seen this quite often, but it could be due to a network issue, an
> issue with JCraft, or the way we use JCraft, rather than the job manager
> really not returning the jobId.
>

I too suspect it has something to do with JCraft or the way we handle
JCraft (especially if you see it "quite" often).


>
> We have no issues with this approach, and I think it's a good idea to start
> monitoring using the jobName (without waiting for the jobId) in case we get
> some failure immediately after the job is submitted to the resource. We
> have a handle for monitoring the job status.
>

Well, I don't quite understand how you monitor a submitted job without a
handle to it (the job id).

Thanks
-Amila
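
One possible answer to the question above, as a rough sketch only: the
scheduler's job listing can be searched for the known job name to recover an
id. The class and method names below are hypothetical and not taken from
Airavata. The sketch assumes the listing has already been fetched as lines of
"<jobId> <jobName>", for example from Slurm with
`squeue --noheader --name=<jobName> --format="%i %j"`; other resource managers
would need their own listing command and parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of Airavata.
final class JobNameLookup {
    private JobNameLookup() {}

    /**
     * Scans listing lines of the assumed form "<jobId> <jobName>" and returns
     * the job id whose name matches exactly. Returns empty when there is no
     * match or when the name is ambiguous (more than one match).
     */
    static Optional<String> findJobId(List<String> listingLines, String jobName) {
        List<String> matches = new ArrayList<>();
        for (String line : listingLines) {
            String[] cols = line.trim().split("\\s+");
            // Skip blank or malformed lines that lack both columns.
            if (cols.length >= 2 && cols[1].equals(jobName)) {
                matches.add(cols[0]);
            }
        }
        return matches.size() == 1 ? Optional.of(matches.get(0)) : Optional.empty();
    }
}
```

Returning empty on an ambiguous name is a deliberate choice in this sketch:
it only works as a handle if the job name is unique on the resource.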


>
> On Sun, Apr 26, 2015 at 12:35 PM, Amila Jayasekara <
> thejaka.amila@gmail.com> wrote:
>
>> It's quite unusual that a resource manager doesn't return the job id. Can
>> you give the details of a specific occurrence?
>> If qsub or sbatch doesn't return a job id, how can a person identify which
>> job he/she submitted? (Even without Airavata)
>>
>> Thanks
>> -Thejaka
>>
>> On Fri, Apr 24, 2015 at 1:55 PM, Lahiru Ginnaliya Gamathige <
>> glahiru@gmail.com> wrote:
>>
>>> Hi Sudhakar,
>>>
>>> Of course, if it's an error we identify that from standard error and make
>>> the job fail, but we have seen cases where qsub and sbatch don't return a
>>> jobId even though the job is actually submitted, and we can identify these
>>> by parsing the standard error. If standard error is clean and we got
>>> nothing in jobId, we proceed and do the monitoring (pull-based
>>> monitoring), so that's why we are using jobName instead of jobId.
>>>
>>> Shameera, JobName is an Airavata-generated random value, so it will be
>>> unique like jobId. The only good thing with jobName is that we already
>>> know it, so we can avoid relying on the output of qsub or any other
>>> command to do the monitoring.
>>>
>>> Lahiru
>>>
>>> On Fri, Apr 24, 2015 at 1:47 PM, Pamidighantam, Sudhakar V <
>>> spamidig@illinois.edu> wrote:
>>>
>>>> If the jobid is missing due to an error, the job name will not work.
>>>>
>>>> Sudhakar.
>>>>
>>>> On Apr 24, 2015, at 12:32 PM, Shameera Rathnayaka <
>>>> shameerainfo@gmail.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> There is a chance that we will not get a jobId after we submit a job to a
>>>> compute resource. In that case we can use JobName to monitor the job using
>>>> the email-based monitor. I am going to make that fix on master: if there is
>>>> no jobId, then we use JobName as the job monitor key.
>>>>
>>>> Thanks,
>>>> Shameera.
>>>>
>>>> --
>>>> Best Regards,
>>>> Shameera Rathnayaka.
>>>>
>>>> email: shameera AT apache.org , shameerainfo AT gmail.com
>>>> Blog : http://shameerarathnayaka.blogspot.com/
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Research Assistant
>>> Science Gateways Group
>>> Indiana University
>>>
>>
>>
>
>
> --
> Research Assistant
> Science Gateways Group
> Indiana University
>
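
The fallback proposed at the start of this thread (use JobName as the job
monitor key when there is no jobId) could look roughly like the Java sketch
below. The class and method names are hypothetical and are not taken from the
Airavata code base. The sketch also treats a blank jobId the same as a null
one, which is an assumption.

```java
// Hypothetical helper, not part of Airavata.
final class MonitorKeys {
    private MonitorKeys() {}

    /**
     * Returns the jobId when the scheduler reported one; otherwise falls back
     * to the jobName, which is known before submission.
     */
    static String monitorKey(String jobId, String jobName) {
        if (jobId != null && !jobId.trim().isEmpty()) {
            return jobId.trim();
        }
        if (jobName == null || jobName.trim().isEmpty()) {
            // Neither identifier is usable, so there is nothing to monitor by.
            throw new IllegalArgumentException("Neither jobId nor jobName is available");
        }
        return jobName.trim();
    }
}
```

For example, `MonitorKeys.monitorKey(null, "A123")` yields `"A123"`, while a
call with a non-blank jobId returns that jobId unchanged (apart from
trimming surrounding whitespace).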
