airavata-dev mailing list archives

From "Pierce, Marlon"
Subject Re: [GSOC Proposal] Cloud based clusters for Apache Airavata
Date Tue, 22 Mar 2016 18:28:59 GMT
Hi Mangirish, please add your proposal to the GSOC 2016 site.

From: Mangirish Wagle
Date: Thursday, March 17, 2016 at 3:35 PM
Subject: [GSOC Proposal] Cloud based clusters for Apache Airavata

Hello Dev Team,

I had the opportunity to talk with Suresh and Shameera, and we discussed an open requirement
in Airavata that needs to be addressed. The requirement is to extend Apache Airavata so that
it can submit jobs to cloud-based clusters in addition to HPC/HTC clusters.

The idea is to dynamically provision a cloud cluster in an environment like Jetstream, based
on the configuration determined by Airavata. The cluster would be operated by distributed
systems management software such as Mesos. The initial high-level goals would be:

  1.  Airavata categorizes certain jobs to be run on cloud-based clusters and figures out the
required hardware config for the cluster.
  2.  The proposed service would provision the cluster with the required resources.
  3.  An Ansible script would configure a Mesos cluster with the resources provisioned.
  4.  Airavata submits the job to the Mesos cluster.
  5.  Mesos then figures out an efficient resource allocation within the cluster, runs
the job, and fetches the result.
  6.  The cluster is then deprovisioned automatically when not in use.

The project would mainly focus on points 2 and 6 above.
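
To make points 2 and 6 concrete, here is a minimal sketch of how on-demand provisioning and
idle-based deprovisioning could fit together. It is only an illustration under assumptions:
ClusterSpec, ClusterManager, the idle timeout, and the provision/deprovision callbacks are
hypothetical names and are not part of Airavata, Mesos, or any cloud API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ClusterSpec:
    """Hardware config chosen in step 1 (hypothetical fields)."""
    nodes: int
    cores_per_node: int
    memory_gb_per_node: int


class ClusterManager:
    """Provisions a cluster on demand and releases it after an idle period.

    The provision and deprovision callbacks are assumptions standing in for
    whatever cloud backend is eventually used.
    """

    def __init__(
        self,
        provision: Callable[[ClusterSpec], str],
        deprovision: Callable[[str], None],
        idle_timeout_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._provision = provision
        self._deprovision = deprovision
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        self._cluster_id: Optional[str] = None
        self._running_jobs = 0
        self._idle_since: Optional[float] = None

    def acquire(self, spec: ClusterSpec) -> str:
        """Step 2: provision if no cluster is up, then count one running job.

        For simplicity an existing cluster is reused even if spec differs.
        """
        if self._cluster_id is None:
            self._cluster_id = self._provision(spec)
        self._running_jobs += 1
        self._idle_since = None
        return self._cluster_id

    def release(self) -> None:
        """Mark one job as finished; start the idle timer when none remain."""
        if self._running_jobs == 0:
            raise RuntimeError("release() called with no running jobs")
        self._running_jobs -= 1
        if self._running_jobs == 0:
            self._idle_since = self._clock()

    def reap_if_idle(self) -> bool:
        """Step 6: deprovision once the cluster has been idle long enough.

        Returns True if the cluster was deprovisioned by this call.
        """
        if self._cluster_id is None or self._idle_since is None:
            return False
        if self._clock() - self._idle_since < self._idle_timeout_s:
            return False
        self._deprovision(self._cluster_id)
        self._cluster_id = None
        self._idle_since = None
        return True
```

In this sketch reap_if_idle() would be called periodically, for example from a scheduler
thread. Injecting the clock keeps the timeout logic testable without real waiting.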

To start with, I am trying to get a working prototype that sets up compute nodes in an
OpenStack environment using JClouds (targeted at Jetstream). I am also planning to explore
the option of using the OpenStack Heat engine to orchestrate the cluster. However, going
forward Airavata would support other clouds such as Amazon EC2 or the Comet cluster, so we
need a generic solution for achieving the goal.
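
One possible shape for such a generic solution is a small provider-neutral interface with one
backend per cloud. The sketch below is an assumption for discussion only: CloudProvisioner,
InMemoryProvisioner, and get_provisioner are hypothetical names, and the in-memory backend
merely stands in for real OpenStack or EC2 calls, which are not shown.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class CloudProvisioner(ABC):
    """Provider-neutral contract (hypothetical) that each cloud backend implements."""

    @abstractmethod
    def create_nodes(self, group: str, count: int) -> List[str]:
        """Start `count` nodes in `group` and return their identifiers."""

    @abstractmethod
    def destroy_nodes(self, group: str) -> int:
        """Tear down every node in `group` and return how many were removed."""


class InMemoryProvisioner(CloudProvisioner):
    """Stand-in backend for illustration; it only tracks node names in memory."""

    def __init__(self) -> None:
        self._groups: Dict[str, List[str]] = {}

    def create_nodes(self, group: str, count: int) -> List[str]:
        nodes = self._groups.setdefault(group, [])
        start = len(nodes)
        created = [f"{group}-{i}" for i in range(start, start + count)]
        nodes.extend(created)
        return created

    def destroy_nodes(self, group: str) -> int:
        return len(self._groups.pop(group, []))


# Registry of backends keyed by provider name; real backends would be added here.
_BACKENDS: Dict[str, Type[CloudProvisioner]] = {"in-memory": InMemoryProvisioner}


def get_provisioner(name: str) -> CloudProvisioner:
    """Look up a backend by name so callers never depend on a specific cloud."""
    try:
        return _BACKENDS[name]()
    except KeyError:
        raise ValueError(f"no provisioner registered for {name!r}") from None
```

With this layout the calling code would depend only on CloudProvisioner, so adding another
cloud would mean registering one more backend rather than changing the callers.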

Another approach, which might be more efficient in terms of performance and time, is to use
container-based clouds built on Docker and Kubernetes, which would have substantially less
bootstrap time than cloud VMs. This would be a future prospect, as we may not have all the
clusters supporting

This has been considered as a potential GSOC project, and I will be working on drafting a
proposal on this idea.

Any inputs, comments, or suggestions would be very helpful.

Best Regards,
Mangirish Wagle