airavata-dev mailing list archives

From DImuthu Upeksha <dimuthu.upeks...@gmail.com>
Subject Linked Container Services for Apache Airavata Components - Phase 1 - Requirement identification
Date Thu, 05 Oct 2017 06:40:10 GMT
Hi All,

Over the last few days, I have been going through the requirements and design
of the current Airavata setup, and I identified the following areas as the key
focus areas for the technology evaluation phase.

Microservices deployment platform (container management system)

Possible candidates: Google Kubernetes, Apache Mesos, Apache Helix
As most of the operational units of Airavata are expected to move to a
microservices-based deployment pattern, having a unified deployment
platform to manage those microservices will make DevOps operations
easier and faster. On the other hand, although writing and maintaining a
single microservice is fairly straightforward, running, monitoring, and
maintaining the lifecycles of multiple microservices manually in a
production environment is a tiresome and complex operation. Using such a
deployment platform, we can automate many of the pain points mentioned
above.

Scalability

We need a solution that can easily scale depending on the load on
different parts of the system. For example, the workers in the
post-processing pipeline should be able to scale up and down depending on
the events coming into the message queue.
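
To make the scaling requirement concrete, here is a minimal sketch of the decision a platform would have to make for the post-processing workers. The function name, the per-worker capacity, and the worker limits are all assumptions for illustration; they are not taken from Airavata or from any of the candidate platforms.

```python
import math


def desired_workers(queue_depth, msgs_per_worker=50, min_workers=1, max_workers=20):
    """Return how many workers to run for the given queue backlog.

    All parameters are hypothetical tuning knobs, not Airavata settings.
    """
    if queue_depth < 0:
        raise ValueError("queue_depth must be non-negative")
    # One worker per `msgs_per_worker` pending messages, rounded up.
    needed = math.ceil(queue_depth / msgs_per_worker)
    # Clamp the result so we never drop below the floor or exceed the cap.
    return max(min_workers, min(max_workers, needed))
```

A container management system would evaluate a rule like this periodically and add or remove worker containers to match the result.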

Availability

We need to support deploying the solution in multiple geographically
distant data centers. When evaluating container management systems, we
should consider this a primary requirement. However, one thing I am not
sure about is the availability mode that Airavata normally expects. Is it
active-active or active-passive?

Service discovery

Once we move to a microservice-based deployment pattern, there could be
scenarios where we need service discovery. For example, if we scale up the
API Server to handle an increased load, we might have to put a load
balancer between the client and the API Server instances. In that case,
service discovery is essential to provide the load balancer with the
healthy API Server endpoints that are currently running in the system.

Cluster coordination

Although microservices are supposed to be stateless in most cases, we
might have scenarios where particular microservices need some state. For
example, if we implement a microservice that performs the Orchestrator's
role, there could be issues if we keep multiple instances of it in several
data centers to increase availability. According to my understanding,
there should be only one Orchestrator running at a time, as it is the one
that makes the decisions in the job execution process. So, if we keep
multiple instances of it running in the system, there should be some sort
of leader election among the Orchestrator quorum.
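
One common way to get a single active Orchestrator is a lease that the leader must keep renewing. The sketch below shows only that idea, in memory and in a single process. The class, the lease length, and the instance names are assumptions; in practice the lease would live in a coordination service such as ZooKeeper so that all instances see the same state.

```python
class LeaseLock:
    """Single-holder lease: whoever holds it is the leader until it expires."""

    def __init__(self, lease_seconds=15.0):
        self.lease_seconds = lease_seconds
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, candidate, now):
        """Return True if `candidate` is the leader after this call."""
        lease_free = self.holder is None or now >= self.expires_at
        if lease_free or self.holder == candidate:
            # Take over a free or expired lease, or renew our own.
            self.holder = candidate
            self.expires_at = now + self.lease_seconds
            return True
        return False
```

Each Orchestrator instance would call try_acquire on a timer and only make job execution decisions while the call returns True. If the leader stops renewing, a standby instance takes over once the lease expires.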

Common messaging medium between microservices

This might be out of scope, but I thought of sharing it with the team to
give a general idea. The idea was raised in the HipChat discussion with
Marlon and Gaourav. Using a common messaging medium might enable
microservices to communicate in a decoupled manner, which would increase
the scalability of the system. For example, there is a reference
architecture that we can utilize with a Kafka-based messaging medium [1],
[2]. However, I noticed in one paper that Kafka was previously rejected
because writing clients was onerous. Please share your views on this, as
I'm not familiar with the existing fan-out model based on AMQP and its
pain points.
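
To show what "decoupled" means here, the sketch below is an in-memory stand-in for a topic-based messaging medium. It is not Kafka or AMQP client code, and the class and topic names are hypothetical. The point is only that the publisher knows the topic name and nothing about who consumes the message.

```python
from collections import defaultdict


class InMemoryBroker:
    """Toy topic broker: publishers and subscribers only share topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of handlers

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        """Deliver `message` to every subscriber; return how many received it."""
        handlers = self._subscribers[topic]
        for handler in handlers:
            handler(message)
        return len(handlers)
```

New consumers can be added by subscribing to an existing topic without touching the publishing service, which is the property that helps with scaling.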

Those are the main areas that I have identified while going through the
current Airavata implementation and the requirements stated in some of the
research papers. Please let me know whether my understanding of the above
items is correct. Suggestions are always welcome :)

[1] https://medium.com/@ulymarins/an-introduction-to-apache-kafka-and-microservices-communication-bf0a0966d63
[2] https://www.slideshare.net/ConfluentInc/microservices-in-the-apache-kafka-ecosystem

References

Marru, S., Gunathilake, L., Herath, C., Tangchaisin, P., Pierce, M.,
Mattmann, C., Singh, R., Gunarathne, T., Chinthaka, E., Gardler, R. and
Slominski, A., 2011, November. Apache Airavata: a framework for distributed
applications and computational workflows. In Proceedings of the 2011 ACM
workshop on Gateway computing environments (pp. 21-28). ACM.

Nakandala, S., Pamidighantam, S., Yodage, S., Doshi, N., Abeysinghe, E.,
Kankanamalage, C.P., Marru, S. and Pierce, M., 2016, July. Anatomy of the
SEAGrid Science Gateway. In Proceedings of the XSEDE16 Conference on
Diversity, Big Data, and Science at Scale (p. 40). ACM.

Pierce, Marlon E., Suresh Marru, Lahiru Gunathilake, Don Kushan Wijeratne,
Raminder Singh, Chathuri Wimalasena, Shameera Ratnayaka, and Sudhakar
Pamidighantam. "Apache Airavata: design and directions of a science gateway
framework." Concurrency and Computation: Practice and Experience 27, no. 16
(2015): 4282-4291.

Pierce, Marlon, Suresh Marru, Borries Demeler, Raminderjeet Singh, and Gary
Gorbet. "The Apache Airavata application programming interface: overview
and evaluation with the UltraScan science gateway." In Proceedings of the
9th Gateway Computing Environments Workshop, pp. 25-29. IEEE Press, 2014.

Marru, Suresh, Marlon Pierce, Sudhakar Pamidighantam, and Chathuri
Wimalasena. "Apache Airavata as a laboratory: architecture and case study
for component-based gateway middleware." In Proceedings of the 1st
Workshop on The Science of Cyberinfrastructure: Research, Experience,
Applications and Models, pp. 19-26. ACM, 2015.

Thanks
Dimuthu
