airavata-dev mailing list archives

From Eroma Abeysinghe <eroma.abeysin...@gmail.com>
Subject Re: Implementing the cancel/terminate
Date Mon, 21 Apr 2014 18:51:55 GMT
Hi,

If status updates propagate bottom-up, we will not be able to cancel unless
there is a job available for that experiment, right?
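
To make the concern concrete, here is a minimal sketch of a bottom-up rule, assuming the experiment only counts as canceled once its jobs report it. The names `JobState`, `BottomUpStatus` and `experimentCanceled` are hypothetical and not Airavata's actual API; the rule itself is an assumption about how the status update mechanism would aggregate job statuses.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical job states, for illustration only.
enum JobState { ACTIVE, CANCELED }

final class BottomUpStatus {
    // Bottom-up rule (assumed): the experiment is canceled only when at least
    // one job exists and every job has reported CANCELED.
    static boolean experimentCanceled(List<JobState> jobs) {
        return !jobs.isEmpty() && jobs.stream().allMatch(j -> j == JobState.CANCELED);
    }

    public static void main(String[] args) {
        // No job yet: nothing can trigger the experiment-level update.
        System.out.println(experimentCanceled(Collections.<JobState>emptyList()));
        // All jobs canceled: the update can propagate upward.
        System.out.println(experimentCanceled(Arrays.asList(JobState.CANCELED, JobState.CANCELED)));
    }
}
```

With an empty job list the check never becomes true, which is the case raised above: an experiment without a job has no lower layer to start the update from.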
Also, a few questions:
1. What do we really mean by canceling? Is it just a status update?
OR
2. Are we going to stop all file transfers and delete any data files/file
paths existing in Airavata for that experiment/tasks/job?
3. Are we considering both single submissions and workflows, or just
single submissions?
If we are going to consider canceling workflows, then we need to extend
cancellation to the multiple tasks and jobs an experiment would have.
4. We also need to define which experiments we can cancel. IMO we don't
need to bother with COMPLETED, CANCELED, UNKNOWN, or FAILED experiments, or
with similar statuses in tasks and jobs.
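
As a rough sketch of point 4, the check could be a simple set lookup. The four excluded statuses come from the list above; the enum itself, the extra states (CREATED, LAUNCHED, EXECUTING) and the `CancelPolicy` class are assumptions for illustration and do not reflect Airavata's real model.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical status enum; only the four excluded names come from the message.
enum ExperimentStatus { CREATED, LAUNCHED, EXECUTING, COMPLETED, CANCELED, FAILED, UNKNOWN }

final class CancelPolicy {
    // Statuses for which a cancel request would simply be ignored.
    private static final Set<ExperimentStatus> NOT_CANCELABLE = EnumSet.of(
            ExperimentStatus.COMPLETED,
            ExperimentStatus.CANCELED,
            ExperimentStatus.FAILED,
            ExperimentStatus.UNKNOWN);

    // True when a cancel request is worth acting on for this status.
    static boolean isCancelable(ExperimentStatus status) {
        return !NOT_CANCELABLE.contains(status);
    }

    public static void main(String[] args) {
        System.out.println(isCancelable(ExperimentStatus.EXECUTING)); // prints true
        System.out.println(isCancelable(ExperimentStatus.FAILED));    // prints false
    }
}
```

The same kind of lookup could be repeated for task and job statuses if they follow similar names.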


IMHO I also don't think the job monitor should do any cancellations.

Thank You,
Best Regards,
Eroma





On Mon, Apr 21, 2014 at 2:37 PM, Saminda Wijeratne <samindaw@gmail.com> wrote:

> May I finish setting up the framework for catching cancel requests? I'll
> finish implementing the cancel once we decide upon who will do what when
> cancelling a job.
>
> I just remembered the canceled notification would be handled by the status
> update mechanism we introduced last week. But this mechanism works only
> bottom up, i.e. Job status updates will trigger Task status updates and
> that will trigger Experiment status updates. Does it make sense to have the
> "canceling" status progress the same way, instead of top-down (which I
> suggested in the first mail)?
>
>
>
> On Mon, Apr 21, 2014 at 11:11 AM, Lahiru Gunathilake <glahiru@gmail.com> wrote:
>
>> Hi Raman,
>>
>>
>> On Mon, Apr 21, 2014 at 10:35 AM, Raminder Singh <
>> raminderjsingh@gmail.com> wrote:
>>
>>> Thanks for investigating the problem and working through a solution. This
>>> is really required for production gateways like Ultrascan.
>>>
>>> In the current architecture, where job submission (provider) and
>>> monitoring are separate, a job cancel request need not go to the GFac
>>> provider. The provider submits the jobs and hands over the job ID to the
>>> orchestrator. The orchestrator works with job monitoring to maintain the
>>> job state. So the cancel needs to be handled by the Orchestrator and
>>> Monitoring. That will change the course of action for the API to cancel a job.
>>>
>> I don't think so. The Orchestrator can invoke GFac Provider-level job
>> cancellation, and it should simply be reflected in the monitor when it tries
>> to get the status of that job (once the monitor learns of it, it should stop
>> monitoring that job). Without modifying the monitor, everything should
>> work. There is no need to touch the monitor.
>>
>> I think job cancellation should be a function of the GFac Provider, and
>> it should be similar to job submission, where you can do pre-processing and
>> post-processing for the job cancellation operation.
>>
>>> One important requirement to take care of is the cleanup task after the job
>>> is canceled, such as updating the job status table and updating the status.
>>>
>>> Thanks
>>> Raminder
>>>
>>>
>>> On Apr 21, 2014, at 9:06 AM, Saminda Wijeratne <samindaw@gmail.com>
>>> wrote:
>>>
>>>  Hi All,
>>>
>>> After looking at the current design and doing some trial and error I
>>> thought of implementing the cancellation as follows.
>>>
>>>
>>>    - Cancellation of an experiment requested by a gateway requires the
>>>    cancellation request to go through several layers (Orchestrator > GFac >
>>>    GFac Provider).
>>>    - Each layer is responsible for handling cancellation relevant for
>>>    that layer (Orchestration cancels experiment, GFac cancels Task, GFac
>>>    Provider cancels Job)
>>>    - What I thought is, each layer will listen to cancellation request
>>>    made to the layer above and perform its cancellation actions accordingly.
>>>    (GFac will see the experiment is having the status "canceling" for an
>>>    experiment id and it will perform cancellation of the tasks relevant for
>>>    that experiment)
>>>       - Effectively the Orchestrator will be
>>>          - updating the status of the experiment in registry with the
>>>          status "canceling"
>>>          - publishing a message which will be caught by the GFac instance
>>>          which handles its Tasks.
>>>       - GFac will perform the same and the correct GFac Provider
>>>       instance will catch the message and perform the actual job cancellation.
>>>    - Once the job cancellation is done, the statuses at each layer will
>>>    be updated (to "canceled") in a similar fashion.
>>>    - We allow the API call of cancellation to be asynchronous
>>>    - I'm hoping to use the MonitorPublisher implemented by Lahiru to
>>>    publish the messages.
>>>
>>> wdyt?
>>>
>>>
>>> Thanks,
>>>
>>> Saminda
>>>
>>>
>>>
>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>


-- 
Thank You,
Best Regards,
Eroma
