airavata-dev mailing list archives

From Amila Jayasekara <thejaka.am...@gmail.com>
Subject Re: Job cancellation in GFac
Date Tue, 16 Jul 2013 17:33:01 GMT
On Tue, Jul 16, 2013 at 1:11 PM, Lahiru Gunathilake <glahiru@gmail.com> wrote:

> Hi Amila,
>
> I think at this level we can live without interpreter-level job
> canceling, because if we cancel a job in some other thread, the interpreter
> can pick it up and mark that node as cancelled, and with the current
> interpreter logic the workflow fails after the first job failure. So logically,
> before we think of interpreter-level job canceling we need to do more work
> in our interpreter logic to make use of that feature.
>

I don't quite understand what you are saying above.
In my opinion, GFac job cancellation is needed for 3 reasons.

1. If a workflow is cancelled
2. If the workflow interpreter decides it should cancel execution of a node
based on a feedback loop (I know we don't have one right now)
3. If a job needs to be cancelled when using GFac for job submission only

1 and 2 should certainly originate from the interpreter. For consistency, I
think 3 should also come from the interpreter.
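
A minimal sketch of how the three cases above could all funnel through one interpreter-side entry point and end in a single GFac cancel call. Every name here (`JobCanceller`, `InterpreterCancelSketch`, the node-to-job map) is hypothetical and chosen for illustration; none of it is taken from the Airavata code base.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the GFac-facing cancel call (not the real Airavata API).
interface JobCanceller {
    void cancelJob(String jobId);
}

// Hypothetical interpreter-side entry point covering the three cases listed above.
class InterpreterCancelSketch {
    // Assumed bookkeeping: node id -> GFac job id for the nodes of one workflow.
    private final Map<String, String> jobIdByNode = new LinkedHashMap<>();
    private final JobCanceller gfac;

    InterpreterCancelSketch(JobCanceller gfac) {
        this.gfac = gfac;
    }

    void recordSubmission(String nodeId, String jobId) {
        jobIdByNode.put(nodeId, jobId);
    }

    // Case 1: the whole workflow is cancelled, so cancel every job still tracked.
    void cancelWorkflow() {
        List<String> jobIds = new ArrayList<>(jobIdByNode.values());
        jobIdByNode.clear();
        for (String jobId : jobIds) {
            gfac.cancelJob(jobId);
        }
    }

    // Case 2: the interpreter decides to cancel a single node.
    // Returns false when no job is tracked for that node.
    boolean cancelNode(String nodeId) {
        String jobId = jobIdByNode.remove(nodeId);
        if (jobId == null) {
            return false;
        }
        gfac.cancelJob(jobId);
        return true;
    }

    // Case 3: job-submission-only use; there is no node, so forward the job id.
    void cancelSubmittedJob(String jobId) {
        gfac.cancelJob(jobId);
    }
}
```

Keeping all three paths in one class means the GFac side only ever sees a job id, whichever case triggered the cancel.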


>
> For the job canceling logic, we can pick the provider using the same logic
> we use in normal GFac node execution and program against the provider
> interface, so that the right cancel method gets called.
>

Please look at the trunk code. This is already in place, though the
implementation is restricted to Gram at the moment.
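
A rough sketch of the dispatch idea discussed here: look up the provider the same way for cancel as for execution, and call cancel through the provider interface. The types below (`SketchProvider`, `GramLikeProvider`, `NoCancelProvider`, `ProviderDispatchSketch`) are hypothetical simplifications and do not mirror the actual GFacProvider signatures in trunk.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified provider contract; the real GFacProvider differs.
interface SketchProvider {
    // Returns a short status string so the effect is observable in this sketch.
    String cancelJob(String jobId);
}

// Stands in for a provider that supports cancellation (Gram, per the thread).
class GramLikeProvider implements SketchProvider {
    @Override
    public String cancelJob(String jobId) {
        return "gram-cancelled:" + jobId;
    }
}

// Stands in for a provider that has no cancel implementation yet.
class NoCancelProvider implements SketchProvider {
    @Override
    public String cancelJob(String jobId) {
        throw new UnsupportedOperationException("cancel not implemented for this provider");
    }
}

class ProviderDispatchSketch {
    // Assumed registry: provider type name -> provider instance.
    private final Map<String, SketchProvider> providersByType = new HashMap<>();

    void register(String type, SketchProvider provider) {
        providersByType.put(type, provider);
    }

    // Callers depend only on the interface, so the matching cancel method runs.
    String cancel(String providerType, String jobId) {
        SketchProvider provider = providersByType.get(providerType);
        if (provider == null) {
            throw new IllegalArgumentException("no provider registered for " + providerType);
        }
        return provider.cancelJob(jobId);
    }
}
```

A provider without cancel support fails loudly in this sketch rather than silently ignoring the request; whether that is the right behaviour for GFac is a design choice the thread leaves open.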


>
> Raman, what do you mean by the user setting the JobExecutionContext or
> security context? The user doesn't have to set anything; we create it and
> set it in the JobExecutionContext, the same way as we do in GFacAPI. The
> user just has to specify the nodeId and experimentId.
>
> Thanks
> Lahiru
>
>
> On Tue, Jul 16, 2013 at 10:23 PM, Amila Jayasekara <
> thejaka.amila@gmail.com> wrote:
>
>>
>>
>>
>> On Tue, Jul 16, 2013 at 12:30 PM, Saminda Wijeratne <samindaw@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Tue, Jul 16, 2013 at 12:18 PM, Raminder Singh <
>>> raminderjsingh@gmail.com> wrote:
>>>
>>>> Thanks, Amila, for providing the details. Job cancel will be a user action
>>>> called from the API or XBaya. I don't think it's necessarily always a
>>>> workflow interpreter operation. It will be useful if we provide an option
>>>> in the API to cancel jobs. I have a few other questions:
>>>>
>>>> 1. We don't need to pass experimentid, workflowid, nodeid all the way
>>>> to the GFac level. GFac only needs the job id to create a cancel request
>>>> for the job. In my view, the job id lookup should be done in the API, and
>>>> only the job id needs to be passed to this level.
>>>>
>>> +1
>>> At the workflow interpreter level it should be "cancel node execution",
>>> "cancel workflow execution", and "cancel experiment". The interpreter can
>>> translate the node id to the GFac job id and call cancel job in the GFac
>>> interface.
>>>
>>
>> OK. Let's have a single method that takes a job id to cancel jobs.
>>
>>
>>>
>>>> 2. I looked into the GramProvider code and did not like the dependency on
>>>> JobExecutionContext in these methods. I observed you are using it to get
>>>> the security context. Isn't it more lightweight for the client to just set
>>>> the security context?
>>>>
>>>
>> I prefer to keep JobExecutionContext, as it is the medium for communicating
>> with the GFac interface. Furthermore, if we need to pass any additional
>> parameters, we can use the job execution context.
>>
>> I assume Raman or Saminda will help implement job cancellation at the
>> interpreter level and also at the API level.
>>
>> Thanks
>> Amila
>>
>>
>>>
>>>> Please let me know if you have any questions.
>>>>
>>>> Thanks
>>>> Raminder
>>>>
>>>> On Jul 16, 2013, at 11:11 AM, Amila Jayasekara <thejaka.amila@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi All,
>>>> >
>>>> > I have added the following methods to the GFacProvider interface to do
>>>> > job cancellation. But we need to figure out where these methods should
>>>> > be called from. I feel these methods should get triggered from the
>>>> > Workflow Interpreter.
>>>> >
>>>> > I would like to use this mail thread to discuss how we can invoke the
>>>> > cancellation methods and how we can expose job cancellation at the API.
>>>> >
>>>> > Please give feedback.
>>>> >
>>>> > Thanks
>>>> > Amila
>>>> >
>>>> >
>>>> >     /**
>>>> >      * Cancels all jobs relevant to an experiment.
>>>> >      * @param experimentId The experiment id.
>>>> >      * @param jobExecutionContext The job execution context, contains runtime information.
>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>> >      */
>>>> >     void cancelJob(String experimentId, JobExecutionContext jobExecutionContext) throws GFacException;
>>>> >
>>>> >     /**
>>>> >      * Cancels all jobs relevant to a workflow in an experiment.
>>>> >      * @param experimentId The experiment id.
>>>> >      * @param workflowId The workflow id.
>>>> >      * @param jobExecutionContext The job execution context, contains runtime information.
>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>> >      */
>>>> >     void cancelJob(String experimentId, String workflowId,
>>>> >                    JobExecutionContext jobExecutionContext) throws GFacException;
>>>> >
>>>> >     /**
>>>> >      * Cancels the job for a given workflow id and node id in an experiment.
>>>> >      * @param experimentId The experiment id.
>>>> >      * @param workflowId The workflow id.
>>>> >      * @param nodeId The node id.
>>>> >      * @param jobExecutionContext The job execution context relevant to the cancel job operation.
>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>> >      */
>>>> >     void cancelJob(String experimentId, String workflowId, String nodeId,
>>>> >                    JobExecutionContext jobExecutionContext) throws GFacException;
>>>>
>>>>
>>>
>>
>
>
> --
> System Analyst Programmer
> PTI Lab
> Indiana University
>
