airavata-dev mailing list archives

From "Pierce, Marlon" <>
Subject Re: [GSoC Proposal] - Integrating Job and Cloud Health Information of Apache Aurora with Apache Airavata
Date Mon, 21 Mar 2016 20:52:21 GMT
Hi Gourav,

Please go ahead and submit a proposal draft through the GSoC 2016 web site. I personally recommend
using the Google Docs option over posting drafts to the Airavata wiki, since I can make
comments inline.



From: Gourav Rattihalli <<>>
Reply-To: "<>" <<>>
Date: Monday, March 21, 2016 at 10:22 AM
To: "<>" <<>>
Subject: [GSoC Proposal] - Integrating Job and Cloud Health Information of Apache Aurora with
Apache Airavata

Hi Dev Team,

Please review the following GSoC proposal that I plan to submit:

Title: Integrating Job and Cloud Health Information of Apache Aurora with Apache Airavata

This project will incorporate Apache Aurora to enable Airavata to launch jobs on large cloud
environments, and collect the related information on the health of each job and the cloud
resources. The project will also analyze the current micro-services architecture of Airavata
and develop code for an updated architecture for modules such as Logging. As a result, another
outcome of this project would be development of a module that will collect all the logging
information from the various execution points in an Airavata job's lifecycle and provide search
and mining capability.


Apache Aurora is a service scheduler that runs on top of Apache Mesos. This combination enables
the use of long-running services that take advantage of Apache Mesos's scalability, fault tolerance,
and resource isolation. Apache Mesos is a cluster manager that provides information about
the state of the cluster. Aurora uses that knowledge to make scheduling decisions. For example,
when a machine fails, Aurora automatically reschedules the services that were running on it
onto a healthy machine to keep them running. Aurora tracks each job as being in one of the
following states: pending, assigned, starting, running, or finished.
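
As a rough illustration, the five states listed above could be modelled on the Airavata side as an ordered enumeration. This is only a sketch: the names JobState, is_terminal, and parse_state are hypothetical, only the states named in this proposal are included, and Aurora's own scheduler may report states beyond these.

```python
from enum import IntEnum


class JobState(IntEnum):
    """Job states named in the proposal, in lifecycle order (hypothetical type)."""
    PENDING = 1
    ASSIGNED = 2
    STARTING = 3
    RUNNING = 4
    FINISHED = 5


def is_terminal(state: JobState) -> bool:
    """Return True once a job has finished and no longer needs to be polled."""
    return state is JobState.FINISHED


def parse_state(text: str) -> JobState:
    """Map a case-insensitive state name to a JobState; raises KeyError if unknown."""
    return JobState[text.strip().upper()]
```

Ordering the values lets a monitor check progress with a simple comparison, for example JobState.PENDING < JobState.RUNNING.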

Apache Aurora requires a configuration file (with the ".aurora" extension) to launch jobs. The
following is an example of an Aurora configuration file:

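import os

# One process: the command line that the task runs.
hello_world_process = Process(name = 'hello_world', cmdline = 'echo hello world')

# A task bundles processes with the resources they need.
hello_world_task = Task(
  resources = Resources(cpu = 0.1, ram = 16 * MB, disk = 16 * MB),
  processes = [hello_world_process])

# The job key is cluster/role/environment/name, so the environment and name
# are set explicitly to match the launch command.
hello_world_job = Job(
  cluster = 'cluster1',
  role = os.getenv('USER'),
  environment = 'test',
  name = 'hello_world',
  task = hello_world_task)

jobs = [hello_world_job]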

To launch the job with the above configuration we use:

aurora job create cluster1/$USER/test/hello_world hello_world.aurora

This project will develop modules in Airavata to automatically generate the Aurora configuration
file needed to launch a job on an Aurora-managed cluster in a cloud environment. The Aurora user
interface, as shown in the web portal displayed above, provides detailed information on the
job status, job name, start and finish times, location of the logs, and resource usage. This
project will add a module to Apache Aurora to pull this detailed information using
the Aurora HTTP API.
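
A minimal sketch of what the configuration generation might look like, assuming the job can be described by a name, a command line, and the components of the job key. The function names render_aurora_config and job_key and their parameters are hypothetical; the emitted text simply mirrors the hello_world example, with the environment and name written out explicitly so that they match the key used at launch.

```python
def job_key(cluster: str, role: str, environment: str, name: str) -> str:
    """Build the cluster/role/environment/name key used by 'aurora job create'."""
    return f"{cluster}/{role}/{environment}/{name}"


def render_aurora_config(name: str, cmdline: str, cluster: str, role: str,
                         environment: str, cpu: float = 0.1,
                         ram_mb: int = 16, disk_mb: int = 16) -> str:
    """Render the text of a .aurora file for a single-process job (hypothetical helper)."""
    # The job name is reused as a Python variable prefix in the config,
    # so it must be a valid identifier.
    if not name.isidentifier():
        raise ValueError(f"job name must be a valid identifier: {name!r}")
    lines = [
        f"{name}_process = Process(name = {name!r}, cmdline = {cmdline!r})",
        "",
        f"{name}_task = Task(",
        f"  resources = Resources(cpu = {cpu}, ram = {ram_mb} * MB, disk = {disk_mb} * MB),",
        f"  processes = [{name}_process])",
        "",
        f"{name}_job = Job(",
        f"  cluster = {cluster!r},",
        f"  role = {role!r},",
        f"  environment = {environment!r},",
        f"  name = {name!r},",
        f"  task = {name}_task)",
        "",
        f"jobs = [{name}_job]",
        "",
    ]
    return "\n".join(lines)
```

For example, render_aurora_config("hello_world", "echo hello world", "cluster1", "alice", "test") produces a file whose job would be launched with the key returned by job_key("cluster1", "alice", "test", "hello_world"). Using repr() for string values keeps quoting correct if a command line itself contains quotes.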


  *   This project will investigate how Apache Aurora collects information about the cluster
environment for display on the Aurora web interface. We will study the Aurora HTTP API, retrieve
all the information related to the target infrastructure and job health, and make it available
to the Airavata job submission module.

  *   We will process the information retrieved from Aurora and convert it into
a format that Airavata can use for further action.

  *   We will use appropriate design patterns to integrate Aurora into the Airavata framework
as one of the options for Big Data and Cloud resource frameworks.

  *   We will make the resource information from Aurora available for display on the Airavata
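
The conversion step in the list above could be sketched as follows. This assumes, purely for illustration, that the retrieved information arrives as a JSON object; the JobHealth type and the key names (jobName, status, startTime, finishTime, logLocation) are hypothetical and are not taken from the Aurora HTTP API. The fields correspond to the job details mentioned earlier: status, name, start and finish times, and log location.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobHealth:
    """Job details in a form Airavata could consume (hypothetical type)."""
    job_name: str
    status: str
    start_time: Optional[str]
    finish_time: Optional[str]
    log_location: Optional[str]


def to_job_health(payload: str) -> JobHealth:
    """Convert a JSON document (hypothetical key names) into a JobHealth record."""
    raw = json.loads(payload)
    return JobHealth(
        job_name=raw["jobName"],            # required
        status=raw["status"].lower(),       # normalised to lower case
        start_time=raw.get("startTime"),    # optional fields default to None
        finish_time=raw.get("finishTime"),
        log_location=raw.get("logLocation"),
    )
```

Keeping the record immutable and free of Aurora-specific types would let the Airavata job submission module depend only on this small structure.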

Any comments and suggestions would be very helpful.

-Gourav Rattihalli