airavata-dev mailing list archives

From Lahiru Gunathilake <glah...@gmail.com>
Subject Re: Job cancellation in GFac
Date Tue, 16 Jul 2013 17:46:15 GMT
Hi Amila,

Please see my comments below.


On Tue, Jul 16, 2013 at 11:03 PM, Amila Jayasekara
<thejaka.amila@gmail.com> wrote:

>
>
>
> On Tue, Jul 16, 2013 at 1:11 PM, Lahiru Gunathilake <glahiru@gmail.com> wrote:
>
>> Hi Amila,
>>
>> I think at this level we can live without interpreter-level job
>> canceling, because if we cancel a job in some other thread, the interpreter can
>> pick it up and mark that node as cancelled, and with the current interpreter
>> logic, the workflow fails after the first job failure. So logically,
>> before we think of interpreter-level job canceling we need to do more work
>> in our interpreter logic to make use of that feature.
>>
>
> I don't quite understand what you are saying above.
> The GFac job cancellation is needed for 3 reasons in my opinion.
>
> 1. If a workflow is cancelled
>
We don't support this in the workflow interpreter.

> 2. If the WF interpreter decides it should cancel execution of a node based on
> a feedback loop (I know we don't have this right now)
> 3. If a job needs to be cancelled when using GFac for job submission only
>

This is exactly what I am saying. At this point we don't support 1 and 2, and we
need to implement them before we focus on interpreter-level canceling.

>
> 1 and 2 should originate from the interpreter for sure. For consistency I
> think 3 should also come from the interpreter.
>

I think we can implement simple canceling first and then fix the interpreter to
support features 1 and 2. Yes, we can keep this at the interpreter level, but it
could be another method in the interpreter just to cancel the given job,
without interrupting the workflow execution thread directly (it will get
interrupted once we cancel the job from another thread).
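
To make the idea concrete, here is a rough sketch of such a method. It is not
the trunk code: every name in it (CancelSketch, Provider, RecordingProvider,
SubmittedJob, Interpreter, registerJob, cancelNode) is made up for
illustration, and the provider here takes only a job id, with no
JobExecutionContext. It assumes the interpreter remembers which provider
submitted the job for each node, so the cancel call goes through the provider
interface and reaches the right implementation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class CancelSketch {

    /** Hypothetical stand-in for the provider interface; only cancellation is shown. */
    interface Provider {
        void cancelJob(String jobId);
    }

    /** Hypothetical provider that just records the job ids it was asked to cancel. */
    static class RecordingProvider implements Provider {
        final List<String> cancelled = new CopyOnWriteArrayList<>();

        @Override
        public void cancelJob(String jobId) {
            cancelled.add(jobId);
        }
    }

    /** Pairs a submitted job id with the provider that submitted it. */
    static final class SubmittedJob {
        final String jobId;
        final Provider provider;

        SubmittedJob(String jobId, Provider provider) {
            this.jobId = jobId;
            this.provider = provider;
        }
    }

    /** Hypothetical interpreter-side bookkeeping: node id -> submitted job. */
    static class Interpreter {
        // Concurrent map, so cancelNode can be called from a thread other than
        // the one that registered the job.
        private final Map<String, SubmittedJob> jobsByNode = new ConcurrentHashMap<>();

        void registerJob(String nodeId, String jobId, Provider provider) {
            jobsByNode.put(nodeId, new SubmittedJob(jobId, provider));
        }

        /**
         * Translates the node id to its job id and asks the owning provider to
         * cancel it. Returns false if no job is registered for the node.
         */
        boolean cancelNode(String nodeId) {
            SubmittedJob job = jobsByNode.remove(nodeId);
            if (job == null) {
                return false;
            }
            job.provider.cancelJob(job.jobId);
            return true;
        }
    }

    public static void main(String[] args) {
        RecordingProvider provider = new RecordingProvider();
        Interpreter interpreter = new Interpreter();
        interpreter.registerJob("node-1", "job-42", provider);

        System.out.println(interpreter.cancelNode("node-1")); // true
        System.out.println(provider.cancelled);               // [job-42]
        System.out.println(interpreter.cancelNode("node-1")); // false, already cancelled
    }
}
```

The sketch only covers the lookup and the call into the provider. How the
workflow execution thread notices the cancelled job is left to the existing
interpreter logic.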

Lahiru

>
>
>>
>> For the job canceling logic, we can pick the provider using the same logic we
>> use in normal GFAC node execution and program against the provider interface,
>> so that the right cancel method gets called.
>>
>
> Please look at the trunk code. This is already in place, though the
> implementation is restricted to Gram at the moment.
>
>
>>
>> Raman, what do you mean by the user setting the JobExecutionContext or security
>> context? The user doesn't have to set anything; we create it and set it in the
>> JobExecutionContext, the same way as we do in GFacAPI. The user just has to
>> specify the nodeId and experimentId.
>>
>> Thanks
>> Lahiru
>>
>>
>> On Tue, Jul 16, 2013 at 10:23 PM, Amila Jayasekara <
>> thejaka.amila@gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Tue, Jul 16, 2013 at 12:30 PM, Saminda Wijeratne <samindaw@gmail.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Tue, Jul 16, 2013 at 12:18 PM, Raminder Singh <
>>>> raminderjsingh@gmail.com> wrote:
>>>>
>>>>> Thanks Amila for providing the details. Job cancel will be a user action
>>>>> called from the API or Xbaya. I don't think it's necessarily always a workflow
>>>>> interpreter operation. It will be useful if we provide an option in the API
>>>>> to cancel jobs. I have a few other questions.
>>>>>
>>>>> 1. We don't need to pass experimentid, workflowid, nodeid all the way
>>>>> to the gfac level. GFAC only needs the jobid to create a cancel request for the job.
>>>>> In my view, the job id lookup needs to be done in the API, and only the job id
>>>>> needs to be passed to this level.
>>>>>
>>>> +1
>>>> At the workflow interpreter level it should be "cancel node execution",
>>>> "cancel workflow execution", "cancel experiment". The interpreter can
>>>> translate the node id to the gfac job id and call cancel job in the gfac
>>>> interface.
>>>>
>>>
>>> Ok. Let's have a single method with job id to cancel jobs.
>>>
>>>
>>>>
>>>>> 2. I looked into the GramProvider code and did not like the dependency on
>>>>> JobExecutionContext in these methods. I observed you are using it to get the
>>>>> security context. Isn't it more lightweight for the client to just set the security
>>>>> context?
>>>>>
>>>>
>>> I prefer to keep JobExecutionContext, as it is the medium for communicating
>>> with the GFac interface. Further, if we need to pass any additional parameters
>>> we can use the job execution context.
>>>
>>> I assume Raman or Saminda will help implement job cancellation at the
>>> interpreter level and also at the API level.
>>>
>>> Thanks
>>> Amila
>>>
>>>
>>>>
>>>>> Please let me know if you have any questions.
>>>>>
>>>>> Thanks
>>>>> Raminder
>>>>>
>>>>> On Jul 16, 2013, at 11:11 AM, Amila Jayasekara <
>>>>> thejaka.amila@gmail.com> wrote:
>>>>>
>>>>> > Hi All,
>>>>> >
>>>>> > I have added the following methods to the GFacProvider interface to do job
>>>>> > cancellation. But we need to figure out from where these methods should be
>>>>> > called. I feel these methods should get triggered from the Workflow
>>>>> > Interpreter.
>>>>> >
>>>>> > I would like to use this mail thread to discuss how we can invoke the
>>>>> > cancellation methods and how we can expose job cancellation in the API.
>>>>> >
>>>>> > Please give feedback.
>>>>> >
>>>>> > Thanks
>>>>> > Amila
>>>>> >
>>>>> >
>>>>> >     /**
>>>>> >      * Cancels all jobs relevant to an experiment.
>>>>> >      * @param experimentId The experiment id.
>>>>> >      * @param jobExecutionContext The job execution context, contains runtime information.
>>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>>> >      */
>>>>> >     void cancelJob(String experimentId, JobExecutionContext jobExecutionContext) throws GFacException;
>>>>> >
>>>>> >     /**
>>>>> >      * Cancels all jobs relevant to a workflow in an experiment.
>>>>> >      * @param experimentId The experiment id.
>>>>> >      * @param workflowId The workflow id.
>>>>> >      * @param jobExecutionContext The job execution context, contains runtime information.
>>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>>> >      */
>>>>> >     void cancelJob(String experimentId, String workflowId,
>>>>> >                    JobExecutionContext jobExecutionContext) throws GFacException;
>>>>> >
>>>>> >     /**
>>>>> >      * Cancels the job for a given workflow id and node id in an experiment.
>>>>> >      * @param experimentId The experiment id.
>>>>> >      * @param workflowId The workflow id.
>>>>> >      * @param nodeId The node id.
>>>>> >      * @param jobExecutionContext The job execution context relevant to the cancel job operation.
>>>>> >      * @throws GFacException If an error occurred while cancelling the job.
>>>>> >      */
>>>>> >     void cancelJob(String experimentId, String workflowId, String nodeId,
>>>>> >                    JobExecutionContext jobExecutionContext) throws GFacException;
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> System Analyst Programmer
>> PTI Lab
>> Indiana University
>>
>
>


-- 
System Analyst Programmer
PTI Lab
Indiana University
