airavata-dev mailing list archives

From Jeffery Kinnison <Jeffery.D.Kinniso...@nd.edu>
Subject Re: Planning for In-Situ Application and Resource Monitoring [GSoC Project]
Date Wed, 04 May 2016 22:41:00 GMT
The more I look into it, the more I like using RabbitMQ within the
SimStream program to communicate with the Airavata server. Here is what I
have so far for practical steps toward implementing the project:

SimStream:

   - Refactor to use RabbitMQ queues instead of Tornado Web Server.
   - Define a config file to send with each job that contains information
   about how to contact the Airavata server, which scripts to run to collect
   the simulation and resource data, and any arguments to pass to the
   scripts. This will decouple data collection logic from SimStream and
   hopefully eliminate the need for long-running data collection scripts
   (i.e., one data point is collected per run of the collection script).
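
To make the config idea concrete, here is a minimal Python sketch, not the
actual SimStream code. The config keys, the queue name "job-1234-data", and
the RecordingChannel stand-in are all assumptions made up for illustration.
Each collector is a short command that runs once, yields one data point, and
exits. With pika, the channel would instead come from something like
pika.BlockingConnection(...).channel(), whose basic_publish takes the same
exchange, routing_key, and body arguments.

```python
import json
import subprocess
import sys

# Hypothetical per-job config; every key name and value here is an assumption.
EXAMPLE_CONFIG = {
    "airavata": {"host": "localhost", "port": 5672, "queue": "job-1234-data"},
    "collectors": [
        {
            "stream": "demo_value",
            # A short-lived collection script: prints one value and exits.
            "command": [sys.executable, "-c", "print(42)"],
        },
    ],
}


def collect_once(collector):
    """Run one collection script to completion and return a single data point."""
    result = subprocess.run(
        collector["command"], capture_output=True, text=True, check=True
    )
    return {"stream": collector["stream"], "value": result.stdout.strip()}


class RecordingChannel:
    """Stand-in for an AMQP channel so the sketch runs without a broker."""

    def __init__(self):
        self.published = []

    def basic_publish(self, exchange, routing_key, body):
        self.published.append((exchange, routing_key, body))


def publish_data_points(channel, config):
    """Run each configured collector once and publish its result as JSON."""
    queue = config["airavata"]["queue"]
    for collector in config["collectors"]:
        point = collect_once(collector)
        channel.basic_publish(
            exchange="", routing_key=queue, body=json.dumps(point)
        )


if __name__ == "__main__":
    channel = RecordingChannel()
    publish_data_points(channel, EXAMPLE_CONFIG)
    print(channel.published)
```

Because the collection logic lives entirely in the config, SimStream itself
would only need to know how to run a command and publish the result.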


Within the Airavata API Server:

   - Extend the existing org.apache.airavata.model.job.JobModel to include
   information about contacting SimStream (queue name, valid data stream
   names).
   - Add classes RabbitMQJobDataPublisher and RabbitMQJobDataConsumer
   (analogous to
   org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchPublisher
   and
   org.apache.airavata.messaging.core.impl.RabbitMQProcessLaunchConsumer).
   - Extend the API Server to listen for requests for job data (requires
   identifying the job and the data requested from it, and verifying that
   the requester is allowed to perform this operation).
   - Extend the API Server to send requested job data back to the gateway
   and user that issued the request.
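
The checks a job data request would need can be sketched roughly as
follows. This is illustrative Python only, not Airavata's API: the
JobDataRequest fields, the in-memory JOBS registry, and the owner-only
authorization rule are all assumptions. The real server would use its own
models and security layer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobDataRequest:
    """Hypothetical request shape: which job, which data, and who is asking."""
    job_id: str
    stream: str
    requester: str


# Hypothetical registry standing in for whatever the server knows about jobs.
JOBS = {
    "job-1234": {"owner": "alice", "streams": {"cpu_load", "residual"}},
}


def validate_request(request, jobs=JOBS):
    """Return (ok, reason) after checking job, requester, and stream in turn."""
    job = jobs.get(request.job_id)
    if job is None:
        return False, "unknown job"
    # Check authorization before the stream name so that unauthorized
    # requesters learn nothing about which streams a job exposes.
    if request.requester != job["owner"]:
        return False, "requester not authorized"
    if request.stream not in job["streams"]:
        return False, "unknown data stream"
    return True, "ok"


if __name__ == "__main__":
    print(validate_request(JobDataRequest("job-1234", "cpu_load", "alice")))
    print(validate_request(JobDataRequest("job-1234", "cpu_load", "bob")))
```

Only a request that passes all three checks would be forwarded to the job's
queue.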

Within Airavata's XBaya:

   - Create default services for the data collection and event
   monitoring/handling aspects of the project that can be added into the
   workflow composer.
   - Create a service that accepts custom data collection scripts to send
   along with the job.

Within the PGA:

   - Add blades and controllers that allow users to view requested data
   from a job.
   - Extend the experiment-related app functionality to allow users to
   retrieve data from a running job through the gateway.

I saw that there are some existing but empty or commented-out classes (
org.apache.airavata.model.util.ComputeResourceUtil,
org.apache.airavata.monitoring.Main) that suggest there was earlier work
toward functionality similar to what I am suggesting. Searching JIRA for
these didn't turn up any information, and I'm curious why those plans were
abandoned, or whether they were even related to my project.

I'd appreciate any comments on the above!

Best,

Jeff K.

On Wed, Apr 27, 2016 at 4:43 PM, Pierce, Marlon <marpierc@iu.edu> wrote:

> RabbitMQ has first class support for Python, so that should not be a
> problem.  Suresh already included the link.  Suresh covered most of the
> bases already, so I’ll just reiterate that Airavata’s use of AMQP/RabbitMQ
> and Thrift should make it programming language independent. You can see how
> well this holds up in reality.
>
> Marlon
>
> From: Jeffery Kinnison <Jeffery.D.Kinnison.1@nd.edu>
> Reply-To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> Date: Wednesday, April 27, 2016 at 4:35 PM
> To: "dev@airavata.apache.org" <dev@airavata.apache.org>
> Subject: Re: Planning for In-Situ Application and Resource Monitoring
> [GSoC Project]
>
> Thanks Suresh,
>
> I was hoping that I could stick with Python for the meat of the project,
> not just because it's the language I'm most comfortable with, but also
> thanks to the fact that it's fairly ubiquitous on HPC systems.
>
> I'll take a look at either interfacing the POC with RabbitMQ or converting
> it entirely to their Python bindings. If anyone has any alternative
> suggestions, they would be much appreciated.
>
> Jeff K.
>
> On Wed, Apr 27, 2016 at 4:20 PM, Suresh Marru <smarru@apache.org> wrote:
>
>> Hi Jeff,
>>
>> On Apr 27, 2016, at 4:08 PM, Jeffery Kinnison <
>> Jeffery.D.Kinnison.1@nd.edu> wrote:
>>
>> Hi Dev Team,
>>
>> I'd like to develop a plan for implementing my GSoC project in
>> conjunction with getting my development environment up and running. This
>> is my first substantial experience with Open Source software development
>> on this scale, so thank you in advance for bearing with me.
>>
>>
>> You did great during the proposal (hence you have a project), just
>> continue the same. At worst you will hear a lot of RTFM, which is a common
>> encounter in open source. I will let you google for it.
>>
>> The full project proposal can be found at
>> https://cwiki.apache.org/confluence/display/AIRAVATA/GSoC+Proposal+-+In+Situ+Simulation+Analysis+Using+Airavata
>>
>> The idea is to allow Airavata users to look behind the curtain at jobs
>> they are running and enable automatic response to conditions encountered as
>> jobs run, both at the system and application level. This will likely
>> require a lightweight server to run alongside each job, which will
>> communicate with the Airavata server.
>>
>> I have a prototype for the lightweight server (
>> https://github.com/jeffkinnison/simstream) written in Python; however, I
>> know that Apache software is typically Java-based. The question here is
>> whether the prototype can be rolled into Airavata, or whether I need to
>> begin looking into Java-based solutions.
>>
>>
>> No, in fact you do not need to port your simstream to Java. Since your
>> application daemon will need to run on HPC compute nodes, Java will not be
>> a good fit there. I think you should stick to Python. For the communication
>> with Airavata, one suggestion is to send an AMQP message which Airavata
>> listens to. You can follow this tutorial as a start -
>> https://www.rabbitmq.com/tutorials/tutorial-one-python.html. Others may
>> have different suggestions.
>>
>> The other initial question I have is one of how the Airavata server
>> submits jobs. From what I can tell, Airavata sends batch scripts to
>> connected computing resources, and my thinking right now about how to
>> deploy the lightweight server is to add its startup logic to the submit
>> scripts. Is this the correct thinking?
>>
>>
>> Yes, that's exactly right. As you might see from other discussions, the
>> cloud-based submissions might not have a batch script, but it's fair to
>> assume your server will be launched one way or another.
>>
>>
>> Again, thank you for answering these questions, and I'm looking forward
>> to working with everyone this summer.
>>
>>
>> Keep them coming.
>>
>> Suresh
>>
>>
>> Best,
>> Jeff K.
>>
>>
>>
>
