airavata-dev mailing list archives

From Saminda Wijeratne <samin...@gmail.com>
Subject Re: Implementing the cancel/terminate
Date Mon, 21 Apr 2014 19:23:12 GMT
On Mon, Apr 21, 2014 at 11:51 AM, Eroma Abeysinghe <
eroma.abeysinghe@gmail.com> wrote:

> Hi,
>
> If we go bottom up, we will not be able to cancel unless there is a job
> available for that experiment, right?
>
Hmm... that's true. When a Job is not available, the Task would be the
bottom layer, doing the preprocessing for the job submission or the
postprocessing after job completion.

> Also, a few questions:
> 1. What do we really mean by canceling? is it just a status update?
>
Effectively a status update. But it will signal to interested parties what
is happening, e.g. the gateway will know that the experiment is in the
process of being canceled, and the GFacProvider will know it needs to perform
the actual cancellation if that has not already been done.
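
A minimal sketch of how such a status update could act as a signal, assuming
a plain in-process publisher. StatusChange, StatusPublisher and the listener
bodies are hypothetical names for illustration only; this is not the
MonitorPublisher API or any existing Airavata class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class CancelStatusSignal {
    // Hypothetical event: an experiment's status changed to a new value.
    public static final class StatusChange {
        public final String experimentId;
        public final String newStatus;

        public StatusChange(String experimentId, String newStatus) {
            this.experimentId = experimentId;
            this.newStatus = newStatus;
        }
    }

    // Minimal in-process publisher; a stand-in, not the MonitorPublisher API.
    public static final class StatusPublisher {
        private final List<Consumer<StatusChange>> listeners = new ArrayList<>();

        public void subscribe(Consumer<StatusChange> listener) {
            listeners.add(listener);
        }

        public void publish(StatusChange change) {
            for (Consumer<StatusChange> listener : listeners) {
                listener.accept(change);
            }
        }
    }

    // Wires two listeners, publishes one "canceling" update, returns what each did.
    public static List<String> demo() {
        List<String> log = new ArrayList<>();
        StatusPublisher publisher = new StatusPublisher();
        // Gateway side: only observes the status.
        publisher.subscribe(c ->
                log.add("gateway sees " + c.experimentId + " is " + c.newStatus));
        // Provider side: reacts to "canceling" by doing the real cancellation.
        publisher.subscribe(c -> {
            if ("canceling".equals(c.newStatus)) {
                log.add("provider cancels job for " + c.experimentId);
            }
        });
        publisher.publish(new StatusChange("exp-1", "canceling"));
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The same update is delivered to both listeners: the gateway only observes
it, while the provider treats it as the trigger to act.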

> OR
> 2. Are we going to stop all file transfers, delete any data file/file path
> existing in Airavata for that experiment/tasks/job?
>
I think that's what Raminder meant by the cleanup operations.

> 3. And also are we considering both single submission and workflows or
> is it just single submissions?
>
Personally, right now I'm focusing only on single submissions, at least until
I get my bearings on how this new design will play out with what we want.
Workflows will come next.

> If we are going to consider canceling workflows, then we need to extend
> canceling to the multiple tasks and jobs an experiment would have.
>
Yep

> 4. Also we need to define what experiments we can cancel - IMO we don't
> need to bother with COMPLETED, CANCELED, UNKNOWN, FAILED experiments and
> similar statuses in tasks and jobs
>
Yes, I'll add that validation.
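
A minimal sketch of such a validation, assuming a hypothetical State enum
(the enum and its extra values are illustrative, not Airavata's actual model
classes). Only the four statuses named above are rejected; whether an
experiment that is already "canceling" should also be rejected is left open
here.

```java
import java.util.EnumSet;
import java.util.Set;

public class CancelValidation {
    // Hypothetical state enum; not Airavata's actual model class.
    public enum State { CREATED, EXECUTING, CANCELING, CANCELED, COMPLETED, FAILED, UNKNOWN }

    // States for which a cancel request is not worth processing,
    // taken from the list in the quoted question.
    public static final Set<State> NOT_CANCELLABLE =
            EnumSet.of(State.COMPLETED, State.CANCELED, State.FAILED, State.UNKNOWN);

    // True when a cancel request should be accepted for this state.
    public static boolean isCancellable(State state) {
        return !NOT_CANCELLABLE.contains(state);
    }

    public static void main(String[] args) {
        for (State state : State.values()) {
            System.out.println(state + " cancellable: " + isCancellable(state));
        }
    }
}
```

The same check could be applied at the task and job level with their own
status sets.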


>
> IMHO I also don't think the Job monitor should do any cancellations.
>
> Thank You,
> Best Regards,
> Eroma
>
>
>
>
>
> On Mon, Apr 21, 2014 at 2:37 PM, Saminda Wijeratne <samindaw@gmail.com> wrote:
>
>> May I finish setting up the framework for catching cancel requests? I'll
>> finish implementing the cancel once we decide upon who will do what when
>> cancelling a job.
>>
>> I just remembered the canceled notification would be handled by the
>> status update mechanism we introduced last week. But this mechanism works
>> only bottom up, i.e. Job status updates will trigger Task status updates,
>> which in turn will trigger Experiment status updates. Does it make sense to
>> have the "canceling" status progress likewise, instead of top down (which I
>> suggested in the first mail)?
>>
>>
>>
>> On Mon, Apr 21, 2014 at 11:11 AM, Lahiru Gunathilake <glahiru@gmail.com> wrote:
>>
>>> Hi Raman,
>>>
>>>
>>> On Mon, Apr 21, 2014 at 10:35 AM, Raminder Singh <
>>> raminderjsingh@gmail.com> wrote:
>>>
>>>> Thanks for investigating the problem and working through a
>>>> solution. This is really required for production gateways like
>>>> Ultrascan.
>>>>
>>>> In the current architecture, where job submission (provider) and
>>>> monitoring are separate, a job cancel request need not go to the GFAC
>>>> provider. The provider submits the jobs and hands over the job id to the
>>>> orchestrator. The orchestrator works with the job monitoring to maintain
>>>> the job state. So the cancel needs to be handled by the Orchestrator and
>>>> Monitoring. That will change the course of action for the API to cancel a job.
>>>>
>>> I don't think so. The Orchestrator can invoke GFac Provider level job
>>> cancellation, and that should simply be reflected in the monitor when it
>>> tries to get the status of that job (once the monitor learns of it, it
>>> should stop monitoring that job). Everything should work without modifying
>>> the monitor. There is no need to touch the monitor.
>>>
>>> I think Job cancellation should be a functionality of GFAC Provider and
>>> it should be similar to job submission where you can do pre processing and
>>> post processing after job cancellation operation.
>>>
>>>> One important requirement to take care of is the cleanup after the job is
>>>> canceled, such as updating the job status table and updating the status.
>>>>
>>>> Thanks
>>>> Raminder
>>>>
>>>>
>>>> On Apr 21, 2014, at 9:06 AM, Saminda Wijeratne <samindaw@gmail.com>
>>>> wrote:
>>>>
>>>>  Hi All,
>>>>
>>>> After looking at the current design and doing some trial and error I
>>>> thought of implementing the cancellation as follows.
>>>>
>>>>
>>>>    - Cancellation of an experiment requested by a gateway requires
>>>>    cancellation request to go through several layers. (Orchestrator >
>>>>    GFac > GFac Provider)
>>>>    - Each layer is responsible for handling cancellation relevant for
>>>>    that layer (Orchestration cancels experiment, GFac cancels Task, GFac
>>>>    Provider cancels Job)
>>>>    - What I thought is, each layer will listen to cancellation request
>>>>    made to the layer above and perform its cancellation actions accordingly.
>>>>    (GFac will see the experiment is having the status "canceling" for an
>>>>    experiment id and it will perform cancellation of the tasks relevant for
>>>>    that experiment)
>>>>       - Effectively the Orchestrator will be
>>>>          - updating the status of the experiment in registry with the
>>>>          status "canceling"
>>>>          - publishing a message which will be caught by the GFac instance
>>>>          which handles its Tasks.
>>>>       - GFac will perform the same and the correct GFac Provider
>>>>       instance will catch the message and perform the actual job cancellation.
>>>>    - Once the job cancellation is done the statuses at each layer will
>>>>    be updated (to "canceled") in a similar fashion.
>>>>    - We allow the API call of cancellation to be asynchronous
>>>>    - I'm hoping to use the MonitorPublisher implemented by Lahiru to
>>>>    publish the messages.
>>>>
>>>> wdyt?
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Saminda
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> System Analyst Programmer
>>> PTI Lab
>>> Indiana University
>>>
>>
>>
>
>
> --
> Thank You,
> Best Regards,
>  Eroma
>
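
The layered flow proposed in the first mail of this thread can be sketched
roughly as below: "canceling" travels top down (Experiment > Task > Job), and
once the job is actually canceled, "canceled" travels bottom up through the
status update mechanism. This is only an illustration under those
assumptions. The class, the layer names as strings and the "executing" start
status are hypothetical, and the direction of the "canceling" update was
still under discussion in the thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LayeredCancelSketch {
    // Ordered top to bottom: Experiment (Orchestrator), Task (GFac), Job (GFac Provider).
    public static final String[] LAYERS = {"experiment", "task", "job"};

    private final Map<String, String> status = new LinkedHashMap<>();
    // Records every status change in the order it happened.
    public final List<String> trace = new ArrayList<>();

    public LayeredCancelSketch() {
        for (String layer : LAYERS) {
            status.put(layer, "executing"); // hypothetical starting status
        }
    }

    private void set(String layer, String newStatus) {
        status.put(layer, newStatus);
        trace.add(layer + "=" + newStatus);
    }

    public void cancel() {
        // Top down: each layer marks itself "canceling" and passes the request on.
        for (int i = 0; i < LAYERS.length; i++) {
            set(LAYERS[i], "canceling");
        }
        // The actual job cancellation on the compute resource would happen here.
        // Bottom up: "canceled" propagates from Job to Task to Experiment.
        for (int i = LAYERS.length - 1; i >= 0; i--) {
            set(LAYERS[i], "canceled");
        }
    }

    public String statusOf(String layer) {
        return status.get(layer);
    }

    public static void main(String[] args) {
        LayeredCancelSketch sketch = new LayeredCancelSketch();
        sketch.cancel();
        System.out.println(String.join(", ", sketch.trace));
    }
}
```

In a real deployment each loop step would be a separate asynchronous message
between components rather than a method call in one process.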
