airavata-dev mailing list archives

From Supun Nakandala <>
Subject Re: Evaluating Helix as the task execution framework
Date Tue, 21 Nov 2017 04:47:21 GMT
Hi Dimuthu,

I think this is great progress.

I would like to know whether you have already considered failure
scenarios as well.

Specifically, I am curious about how you handle the following:
 1. A task fails. This can happen for several reasons: (a) invalid
configurations (e.g. SSH keys) or inputs, (b) downstream system issues such
as network and file system failures, (c) other causes (OOM, the OS killing
the process, etc.).
 2. What happens if a Helix participant fails?
 3. What happens if the scheduler or the event sink fails while
processing a request?
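For the first point, the behavior I would hope for is something like the following plain-Java sketch: transient failures (network, file system) are retried up to a cap, while fatal ones (bad SSH keys, invalid inputs) fail fast. The class and method names here are my own illustration of the semantics, not Helix's actual task API:

```java
import java.util.function.Supplier;

// Hypothetical sketch: distinguish transient failures (retry) from
// fatal ones (fail fast), with a per-task attempt cap. This models the
// semantics being asked about; it is not Helix's API.
class RetryPolicy {
    enum Outcome { SUCCEEDED, FAILED_TRANSIENT, FAILED_FATAL }

    private final int maxAttempts;

    RetryPolicy(int maxAttempts) { this.maxAttempts = maxAttempts; }

    /** Runs the task, retrying transient failures up to maxAttempts times. */
    Outcome execute(Supplier<Outcome> task) {
        Outcome last = Outcome.FAILED_TRANSIENT;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            last = task.get();
            if (last != Outcome.FAILED_TRANSIENT) {
                return last; // success, or a fatal error such as bad SSH keys
            }
            // transient error (network, file system): fall through and retry
        }
        return last; // attempts exhausted, report the transient failure
    }
}
```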

Also, it's not clear to me where the workflow execution logic will
run. I see that the feedback loop of events goes back to the API server
and not to the schedulers. Shouldn't the scheduler get this information
back (for example, if a task fails it will have to resubmit)?

Thank you.

On Mon, Nov 20, 2017 at 1:06 AM, Dimuthu Upeksha <
> wrote:

> Hi Team,
> Based on the feedback from the last meeting with Suresh and the
> team, I tried out Helix's task framework as the task execution
> framework for Airavata. After going through the documentation and Gaurav's
> existing work, I came up with a design [1] for deploying Helix's task
> framework as the task execution engine of Airavata in a containerized
> environment.
>
> In addition, I created a graphical workflow composer that can
> develop and deploy workflows into the engine. It was designed as
> explained in [2].
>
> You can have a look at the screencast demonstrating an end-to-end
> workflow execution in this video [3].
> *Summary*
> 1. Although Helix is not an ideal candidate for container orchestration,
> I found it well suited for workflow execution in Airavata, compared to
> building a task execution framework from scratch.
> 2. All the Helix agents (controllers and participants) can be wrapped as
> Docker containers and orchestrated on Kubernetes or DC/OS for high
> availability and scalability.
> 3. With Helix's APIs, we can easily create and stop workflows at any
> time, and the burden of managing those states is taken care of by Helix.
> This makes our lives easier and lets us focus on improving and stabilizing
> the framework around it.
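To illustrate point 3: Helix's TaskDriver is the entry point for controlling workflows (it exposes operations such as start, stop, resume, and delete). A full Helix setup won't fit in an email, so the following is a minimal stand-in state machine showing the lifecycle bookkeeping Helix takes off our hands; none of these class or method names are Helix's own:

```java
// Hypothetical sketch of workflow lifecycle states. In the real system
// Helix persists and enforces these transitions for us (via TaskDriver);
// this stand-in only illustrates what that bookkeeping involves.
class WorkflowLifecycle {
    enum State { NOT_STARTED, IN_PROGRESS, STOPPED, COMPLETED }

    private State state = State.NOT_STARTED;

    /** Starts a new workflow, or resumes one that was stopped. */
    void start() {
        if (state != State.NOT_STARTED && state != State.STOPPED) {
            throw new IllegalStateException("cannot start from " + state);
        }
        state = State.IN_PROGRESS;
    }

    /** Stops a running workflow; it can be resumed later. */
    void stop() {
        if (state != State.IN_PROGRESS) {
            throw new IllegalStateException("cannot stop from " + state);
        }
        state = State.STOPPED;
    }

    State state() { return state; }
}
```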
> *Note*
> Currently, the above demonstration runs inside the IDE, but technically it
> is possible to dockerize each component and deploy it into a Kubernetes
> cluster. I will do the final packaging once the team has evaluated the
> design. You can find the code up to this point here [4].
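For the packaging step, each Helix agent would presumably ship as an ordinary Kubernetes Deployment. A minimal sketch for a participant follows; the image name, labels, replica count, and ZooKeeper address are all placeholders, since the real manifest will depend on how the cluster is set up:

```yaml
# Hypothetical manifest: image name, labels, and env values are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helix-participant
spec:
  replicas: 2                  # scale participants horizontally
  selector:
    matchLabels:
      app: helix-participant
  template:
    metadata:
      labels:
        app: helix-participant
    spec:
      containers:
        - name: participant
          image: airavata/helix-participant:latest   # placeholder image
          env:
            - name: ZK_CONNECTION     # Helix coordinates via ZooKeeper
              value: zookeeper:2181   # placeholder address
```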
> Please share your views and suggestions.
> [1]
> kVf0GeYQ0NZpI_Q/edit?usp=sharing
> [2]
> ohIHmv08pMhCPM/edit?usp=sharing
> [3]
> [4]
> kubernetes
> Thanks
> Dimuthu
