airavata-dev mailing list archives

From "Pierce, Marlon" <marpi...@iu.edu>
Subject Re: Planning for In-Situ Application and Resource Monitoring [GSoC Project]
Date Wed, 04 May 2016 22:47:06 GMT
+1 for publishing to RabbitMQ. Don’t worry about XBaya as it is obsolete; upgrading it is
another GFAC project. I suggest you focus on the SimStream to RabbitMQ parts first. The API
changes will need additional discussion, probably over a hangout.

Marlon


From: Jeffery Kinnison <Jeffery.D.Kinnison.1@nd.edu>
Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Date: Wednesday, May 4, 2016 at 6:41 PM
To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Subject: Re: Planning for In-Situ Application and Resource Monitoring [GSoC Project]

The more I look into it, the more I like using RabbitMQ within the SimStream program to communicate
with the Airavata server. Here are the practical steps I have so far for implementing
the project:

SimStream:

  *   Refactor to use RabbitMQ queues instead of Tornado Web Server.
  *   Define a config file to send with each job that contains information about how to contact
the Airavata server, which scripts to run to collect the simulation and resource data, and any
arguments to pass to the scripts. This will decouple data collection logic from SimStream and
hopefully eliminate the need for long-running data collection scripts (i.e., one data point is
collected per run of the collection script).
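
To illustrate the config-driven collection described in the SimStream bullets above, here is a minimal Python sketch. The config keys (`broker_host`, `queue`, `collectors`, `stream`, `script`, `args`) and the function names are assumptions made up for this example; they are not part of SimStream or Airavata. Each collector script runs once, and its standard output is treated as a single data point.

```python
import json
import subprocess
import sys

# Hypothetical config layout; every key name below is an assumption.
EXAMPLE_CONFIG = {
    "broker_host": "localhost",      # how to reach the messaging endpoint
    "queue": "job-1234-data",        # queue the data points would go to
    "collectors": [
        {
            "stream": "python_major_version",
            # A tiny stand-in for a real collection script.
            "script": sys.executable,
            "args": ["-c", "import sys; print(sys.version_info[0])"],
        }
    ],
}


def load_config(text):
    """Parse a JSON config and check that the assumed keys are present."""
    config = json.loads(text)
    for key in ("queue", "collectors"):
        if key not in config:
            raise ValueError("missing config key: " + key)
    return config


def collect_once(collector):
    """Run one collection script to completion and return one data point."""
    result = subprocess.run(
        [collector["script"], *collector["args"]],
        capture_output=True,
        text=True,
        check=True,
    )
    return {"stream": collector["stream"], "value": result.stdout.strip()}


def collect_all(config):
    """Collect one data point from every configured script."""
    return [collect_once(c) for c in config["collectors"]]
```

Because each script exits after producing one value, the scheduling of collection stays in the caller rather than in the scripts themselves.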

Within the Airavata API Server:

  *   Extend the existing org.apache.airavata.model.job.JobModel to include information about
contacting SimStream (queue name, valid data stream names).
  *   Add classes RabbitMQJobDataPublisher and RabbitMQJobDataConsumer (analogous to org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchPublisher
and org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchConsumer)
  *   Extend the API Server to listen for requests for job data (requires identification of
which job, which data from the job, in addition to verification that the requester should
be allowed to perform this operation)
  *   Extend the API Server to send requested job data back to the gateway and user that issued
the request.
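
The checks listed in the API Server bullets above (which job, which data, and whether the requester is allowed) can be sketched in a language-neutral way. The Python below is only an illustration of the control flow, not the Airavata API, which is Java and Thrift based; `JobDataRequest`, `handle_job_data_request`, and the dictionary layout of `jobs` are all hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDataRequest:
    job_id: str     # which job
    stream: str     # which data from the job
    requester: str  # who is asking


def handle_job_data_request(request, jobs, fetch):
    """Validate a request, then fetch the data through the supplied callable.

    `jobs` maps job ids to dicts holding the assumed keys "queue",
    "streams", and "allowed_users"; `fetch(queue, stream)` stands in for
    whatever would actually read data from the job's queue.
    """
    job = jobs.get(request.job_id)
    if job is None:
        return {"status": "error", "reason": "unknown job"}
    if request.requester not in job["allowed_users"]:
        return {"status": "error", "reason": "not authorized"}
    if request.stream not in job["streams"]:
        return {"status": "error", "reason": "unknown stream"}
    return {"status": "ok", "data": fetch(job["queue"], request.stream)}
```

Authorization is checked before the stream name so that an unauthorized requester learns nothing about which streams a job exposes.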

Within Airavata's XBaya:

  *   Create default services for the data collection and event monitoring/handling aspects
of the project that can be added into the workflow composer.
  *   Create a service that accepts custom data collection scripts to send along with the
job.

Within the PGA:

  *   Add blades and controllers that allow users to view requested data from a job.
  *   Extend the experiment-related app functionality to allow users to retrieve data from
a running job through the gateway.

I saw that there are some existing but empty or commented-out classes (org.apache.airavata.model.util.ComputeResourceUtil,
org.apache.airavata.monitoring.Main) that suggest there was work toward functionality similar
to what I am suggesting. Searching JIRA for these didn't turn up any information, and I'm curious
about why these plans were abandoned, or whether they were even related to my project.

I'd appreciate any comments on the above!

Best,

Jeff K.

On Wed, Apr 27, 2016 at 4:43 PM, Pierce, Marlon <marpierc@iu.edu> wrote:
RabbitMQ has first class support for Python, so that should not be a problem.  Suresh already
included the link.  Suresh covered most of the bases already, so I’ll just reiterate that
Airavata’s use of AMQP/RabbitMQ and Thrift should make it programming language independent.
You can see how well this holds up in reality.

Marlon

From: Jeffery Kinnison <Jeffery.D.Kinnison.1@nd.edu>
Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Date: Wednesday, April 27, 2016 at 4:35 PM
To: "dev@airavata.apache.org" <dev@airavata.apache.org>
Subject: Re: Planning for In-Situ Application and Resource Monitoring [GSoC Project]

Thanks Suresh,

I was hoping that I could stick with Python for the meat of the project, not just because
it's the language I'm most comfortable with, but also thanks to the fact that it's fairly
ubiquitous on HPC systems.

I'll take a look at either interfacing the POC with RabbitMQ or converting it entirely to
their Python bindings. If anyone has any alternative suggestions, they would be much appreciated.

Jeff K.

On Wed, Apr 27, 2016 at 4:20 PM, Suresh Marru <smarru@apache.org> wrote:
Hi Jeff,

On Apr 27, 2016, at 4:08 PM, Jeffery Kinnison <Jeffery.D.Kinnison.1@nd.edu> wrote:

Hi Dev Team,

I'd like to develop a plan for implementing my GSoC project in conjunction with getting my development
environment up and running. This is my first substantial experience with Open Source software
development on this scale, so thank you in advance for bearing with me.

You did great during the proposal (hence you have a project), so just continue the same. At worst
you will hear a lot of RTFM, which is a common encounter in open source. I will let you google
for it.

The full project proposal can be found at https://cwiki.apache.org/confluence/display/AIRAVATA/GSoC+Proposal+-+In+Situ+Simulation+Analysis+Using+Airavata

The idea is to allow Airavata users to look behind the curtain at jobs they are running and
enable automatic response to conditions encountered as jobs run, both at the system and application
level. This will likely require a lightweight server to run alongside each job, which will
communicate with the Airavata server.

I have a prototype for the lightweight server (https://github.com/jeffkinnison/simstream)
written in Python; however, I know that Apache software is typically Java-based. The question
here is whether the prototype can be rolled into Airavata, or whether I need to begin
looking into Java-based solutions.

No, you do not need to port your simstream to Java. In fact, since your application daemon
will need to run on HPC compute nodes, Java will not be a good fit there. I think you should
stick with Python. For communication with Airavata, one suggestion is to send an AMQP
message that Airavata listens for. You can follow this tutorial as a start: https://www.rabbitmq.com/tutorials/tutorial-one-python.html
Others may have different suggestions.
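
As a rough sketch of the AMQP suggestion above, the following Python follows the pattern of the linked RabbitMQ tutorial using the pika client. The queue name, the JSON message layout, and the helper names `open_channel` and `publish_data_point` are assumptions chosen for illustration; nothing here reflects what Airavata actually listens for.

```python
import json


def open_channel(host="localhost"):
    """Open a blocking connection and channel to a RabbitMQ broker.

    pika is a third-party package (pip install pika); it is imported here
    so the rest of the sketch can be read and exercised without a broker.
    """
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    return connection, connection.channel()


def publish_data_point(channel, queue, stream, value):
    """Publish one data point as a JSON message to the named queue."""
    body = json.dumps({"stream": stream, "value": value})
    # Declaring the queue first ensures it exists before publishing.
    channel.queue_declare(queue=queue)
    # The default exchange ("") routes by queue name via routing_key.
    channel.basic_publish(exchange="", routing_key=queue, body=body)
    return body
```

With a broker running locally, usage would look like `connection, channel = open_channel()`, then `publish_data_point(channel, "simstream.demo", "cpu_load", 0.42)`, then `connection.close()`.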

The other initial question I have is one of how the Airavata server submits jobs. From what
I can tell, Airavata sends batch scripts to connected computing resources, and my thinking
right now about how to deploy the lightweight server is to add its startup logic to the submit
scripts. Is this the correct thinking?

Yes, that's exactly right. As you might see from other discussions, cloud-based submissions
might not have a batch script, but it's fair to assume your server will be launched one way
or another.


Again, thank you for answering these questions, and I'm looking forward to working with everyone
this summer.

Keep them coming.

Suresh


Best,
Jeff K.


