airavata-dev mailing list archives

From Lahiru Gunathilake <>
Subject Re: Airavata Orchestrator component
Date Fri, 06 Dec 2013 21:58:11 GMT
Hi All,

I have added a Google doc [1] that anyone can comment on.


On Thu, Dec 5, 2013 at 2:34 PM, Lahiru Gunathilake <> wrote:

> Hi All,
> We are thinking of implementing an Airavata Orchestrator component to
> replace WorkflowInterpreter, so that gateway developers do not have to deal
> with workflows when they simply have a single independent job to run in
> their gateways. This component mainly focuses on how to invoke GFAC and
> accept requests from the client API.
> I have the following features in mind for this component.
> 1. It provides a web service or REST interface, so that we can implement a
> client that invokes it to submit jobs.
> 2. It accepts a job request and parses the input types; if the input types
> are correct, it creates an Airavata experiment ID.
> 3. The Orchestrator then stores the job information in the registry against
> the generated experiment ID (all the other components identify the job
> using this experiment ID).
> 4. After that, the Orchestrator pulls up all the descriptors related to this
> request, does some scheduling to decide where to run the job, and submits
> the job to a GFAC node (handling multiple GFAC nodes is going to be a
> future improvement in the Orchestrator).
> If we do pull-based job submission, it might be a good way to handle
> errors: if we store jobs in the registry and GFAC nodes pull jobs and
> execute them, the Orchestrator component really doesn't have to worry
> about error handling.
> This is because we can implement logic in GFAC so that, if a particular job
> has not updated its status for a given time, GFAC assumes that either the
> job has hung or the GFAC node handling it has failed, so a GFAC node pulls
> that job (we definitely need a locking mechanism here, so that two
> instances do not both execute the hung job) and starts executing it. (If
> GFAC is handling a long-running job, it still has to update the job status
> frequently, even with the same status, to show that the GFAC node is
> running.)
> 5. GFAC creates its execution chain and stores it back in the registry with
> the experiment ID, and GFAC updates its state using checkpointing.
> 6. If we are not doing pull-based submission, then during a GFAC failure the
> Orchestrator has to identify it and resubmit the active jobs from the
> failed GFAC node to other nodes. This might cause job duplication if the
> Orchestrator raises a false alarm about a GFAC failure (so we have to
> handle this carefully).
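
To make the pull-based idea concrete, here is a rough Python sketch of a registry from which GFAC nodes claim jobs, with a staleness timeout and a lock so that two nodes cannot take the same hung job. Everything in it is an assumption for illustration only: the `Registry` class, its method names, the in-memory dict, and the 30-second timeout are hypothetical and are not Airavata APIs. A real deployment would need a lock that works across processes and machines, not a `threading.Lock`.

```python
import threading

# Assumed value: seconds without a status update before a job counts as hung.
STALE_AFTER = 30.0


class Registry:
    """In-memory stand-in for the registry (hypothetical, not the Airavata API)."""

    def __init__(self):
        # Plays the role of the locking mechanism mentioned above.
        self._lock = threading.Lock()
        # experiment_id -> {"owner": node name or None, "heartbeat": float, "status": str}
        self._jobs = {}

    def store(self, experiment_id):
        """Record a new job that no GFAC node owns yet."""
        with self._lock:
            self._jobs[experiment_id] = {
                "owner": None, "heartbeat": 0.0, "status": "SUBMITTED"}

    def claim(self, node, now):
        """Atomically give one unowned or stale job to `node`, or return None."""
        with self._lock:
            for experiment_id, job in self._jobs.items():
                if job["status"] == "DONE":
                    continue
                stale = now - job["heartbeat"] > STALE_AFTER
                if job["owner"] is None or stale:
                    job.update(owner=node, heartbeat=now, status="RUNNING")
                    return experiment_id
        return None

    def heartbeat(self, node, experiment_id, now, status="RUNNING"):
        """Refresh the status timestamp; refused if another node took the job over."""
        with self._lock:
            job = self._jobs[experiment_id]
            if job["owner"] != node:
                return False
            job.update(heartbeat=now, status=status)
            return True
```

A GFAC node would call `claim` in a loop and call `heartbeat` periodically while a job runs, even when the status has not changed. A node whose `heartbeat` returns `False` has lost the job to another node and should stop working on it, which limits (but does not by itself eliminate) duplicate execution.
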
> We have a lot more to discuss about GFAC, but I am limiting our discussion
> to the Orchestrator component for now.
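
Staying with the Orchestrator only, steps 2 to 4 of the list could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the `Orchestrator` class, the `EXPECTED_INPUTS` table, the plain dict used as the registry, and the round-robin scheduling are all hypothetical placeholders, not the actual Airavata design or API. The web service or REST interface from step 1 and the descriptor lookup are left out.

```python
import uuid

# Hypothetical declared input types for a single application.
EXPECTED_INPUTS = {"input_file": str, "iterations": int}


class Orchestrator:
    """Minimal sketch of the submit path (hypothetical names throughout)."""

    def __init__(self, registry, gfac_nodes):
        self.registry = registry      # a dict standing in for the registry
        self.gfac_nodes = gfac_nodes  # list of GFAC node names
        self._next = 0

    def submit(self, inputs):
        # Step 2: check the input types before creating an experiment ID.
        for name, expected in EXPECTED_INPUTS.items():
            if not isinstance(inputs.get(name), expected):
                raise ValueError(
                    f"input {name!r} must be of type {expected.__name__}")
        experiment_id = str(uuid.uuid4())

        # Step 3: store the job information against the experiment ID.
        self.registry[experiment_id] = {
            "inputs": dict(inputs), "status": "SUBMITTED", "node": None}

        # Step 4: decide where to run the job and record the chosen node.
        self.registry[experiment_id]["node"] = self._schedule()
        return experiment_id

    def _schedule(self):
        # Placeholder scheduling: plain round robin over the known nodes.
        node = self.gfac_nodes[self._next % len(self.gfac_nodes)]
        self._next += 1
        return node
```

A client would call `Orchestrator({}, ["gfac-1"]).submit({"input_file": "a.dat", "iterations": 3})` and get back the experiment ID that every other component then uses to refer to the job.
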
> WDYT about this design?
> Lahiru
> --
> System Analyst Programmer
> PTI Lab
> Indiana University

System Analyst Programmer
Indiana University
