airavata-dev mailing list archives

From Supun Nakandala <supun.nakand...@gmail.com>
Subject Re: Linked Container Services for Apache Airavata Components - Phase 1 - Requirement identification
Date Mon, 09 Oct 2017 04:13:23 GMT
+1 for the idea.

On Sun, Oct 8, 2017 at 2:52 AM, DImuthu Upeksha <dimuthu.upeksha2@gmail.com>
wrote:

> Hi Supun,
>
> I also believe that letting the orchestrator determine which worker runs a
> particular job is complex to implement and will make the orchestrator code
> hard to maintain in the long run. I partially agree with embedding a worker
> inside the firewall-protected resource, but I think we can improve on it to
> keep the workers homogeneous and stateless. Have a look at the following
> figure.
>
>
> In the above design we keep all the workers outside and keep a daemon
> inside the protected resource that communicates securely with the workers.
> The question is then how to keep the workers homogeneous, since this only
> adds another layer to the solution stated above. The trick is to decouple
> the communication between worker and resource: communication with any
> resource goes through a well-defined API. In Java terms:
>
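> import java.io.InputStream;
> import java.io.OutputStream;
>
> public interface CommunicationInterface {
>       // Runs a command on the resource over SSH.
>       String sshToResource(String resourceIp, String command);
>       // Sends the contents of 'in' to 'target' on the resource.
>       void transferDataTo(String resourceIp, String target, InputStream in);
>       // Copies 'target' from the resource into 'out'.
>       void transferDataFrom(String resourceIp, String target, OutputStream out);
> }
>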
> The implementation of this API may differ from resource to resource. We keep
> a separate Catalog that serves the libraries containing the implementation
> specific to each resource. For example, if Worker 1 needs to talk to
> Resource 1, which sits behind a firewall with the Airavata communication
> agent placed inside, it will query the Catalog for Resource 1 and fetch the
> library that implements CommunicationInterface to talk securely with the
> Airavata agent. If it wants to talk to Resource 2, another library with the
> default implementation will be fetched from the Catalog. Once those SDKs
> are fetched, they are loaded into the JVM at runtime using a class loader,
> and communication proceeds through them.
>
> We can improve this by caching libraries inside the workers and reusing them
> as much as possible, to limit the number of Catalog queries from workers.
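
The lookup-and-cache flow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Catalog`, `CommunicationLoader`, `DefaultCommunication` and the method names on them are hypothetical, the Catalog is reduced to returning an implementation class name, and the class loader is simply passed in (in a real deployment it would presumably be a `URLClassLoader` over the library fetched from the Catalog).

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The API from the message above, repeated so the sketch compiles on its own.
interface CommunicationInterface {
    String sshToResource(String resourceIp, String command);
    void transferDataTo(String resourceIp, String target, InputStream in);
    void transferDataFrom(String resourceIp, String target, OutputStream out);
}

// Hypothetical Catalog: maps a resource id to the implementation class name.
interface Catalog {
    String implementationClassFor(String resourceId);
}

// Hypothetical default implementation; a real one would open SSH/SCP sessions.
class DefaultCommunication implements CommunicationInterface {
    public String sshToResource(String resourceIp, String command) {
        return "would run '" + command + "' on " + resourceIp;
    }
    public void transferDataTo(String resourceIp, String target, InputStream in) { }
    public void transferDataFrom(String resourceIp, String target, OutputStream out) { }
}

// Worker-side helper: at most one Catalog query and class load per resource.
class CommunicationLoader {
    private final Catalog catalog;
    private final ClassLoader classLoader;
    private final Map<String, CommunicationInterface> cache = new ConcurrentHashMap<>();

    CommunicationLoader(Catalog catalog, ClassLoader classLoader) {
        this.catalog = catalog;
        this.classLoader = classLoader;
    }

    // Returns the cached implementation, loading it on first use.
    CommunicationInterface forResource(String resourceId) {
        return cache.computeIfAbsent(resourceId, this::load);
    }

    private CommunicationInterface load(String resourceId) {
        String className = catalog.implementationClassFor(resourceId);
        try {
            // Resolve the class by name at runtime, as a JDBC driver would be.
            return Class.forName(className, true, classLoader)
                    .asSubclass(CommunicationInterface.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load library for " + resourceId, e);
        }
    }
}
```

With this shape, a worker asks `forResource(...)` for whichever resource a job targets and never needs to know whether the implementation talks to an agent or directly to the resource.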
>
> The advantage is that we can add resources with different security levels
> without changing the Worker implementation. All we have to do is come up
> with an agent and a library to talk to that agent, then add them to the
> Catalog; the rest will be taken care of by the framework. This model is
> analogous to the SQL drivers that we use in Java to connect to databases.
>
> Please note that I came up with this design based on my limited knowledge
> of Airavata Workers and Resources. There will be a lot of corner cases
> that I have not identified. Your views and ideas are highly appreciated.
>
> Thanks
> Dimuthu
>
> On Sun, Oct 8, 2017 at 10:51 AM, Supun Nakandala <
> supun.nakandala@gmail.com> wrote:
>
>> Hi Dimuthu,
>>
>> Thank you for the very good summary. I think you have covered almost
>> everything.
>>
>> I would also like to mention one more forward-looking requirement that I
>> think will be important in this discussion.
>>
>> In my opinion, going forward Airavata will be required to work with
>> firewall-protected resources. In such cases, workers residing outside will
>> not be able to communicate with the protected resources. What we initially
>> thought of was deploying a special type of worker that is placed inside the
>> firewall-protected network and coordinates with the Airavata orchestrator
>> to execute actions. One such tool, used by ServiceNow in enterprise
>> settings, is the MID Server (
>> http://wiki.servicenow.com/index.php?title=MID_Server#gsc.tab=0). The
>> downside of this approach is that it breaks our assumption that all workers
>> are homogeneous, and therefore requires the orchestrator to be worker-aware.
>> Perhaps, instead of workers picking work, we can design it such that the
>> orchestrator assigns work to the corresponding worker. But this adds a lot
>> of complexity on the orchestrator's side.
>>
>>
>>
>> On Oct 5, 2017 10:47 AM, "DImuthu Upeksha" <dimuthu.upeksha2@gmail.com>
>> wrote:
>>
>>> Hi Gourav,
>>>
>>> Thanks a lot for the detailed description of DC/OS and how it can be
>>> utilized in Airavata. It seems like an interesting project and I'll add
>>> it to the list of technologies to be evaluated.
>>>
>>> When selecting a technology, in addition to the features it provides, we
>>> might have to consider some non-functional aspects: community
>>> participation (committers, commits and forks), the number of customers
>>> running it in production, the maturity of the project, and the complexity
>>> it adds to the overall system. So I'll first go through the resources
>>> (documentation and source) to grasp the concepts of DC/OS, and hopefully
>>> I can then work with you to dig deeper into it.
>>>
>>> Thanks
>>> Dimuthu
>>>
>>> On Thu, Oct 5, 2017 at 8:50 PM, Shenoy, Gourav Ganesh <
>>> goshenoy@indiana.edu> wrote:
>>>
>>>> Sorry, missed the attachment in my previous email.
>>>>
>>>>
>>>>
>>>> PS: DC/OS is just a recommendation for performing containerized
>>>> deployment and application management for Airavata. I would be happy to
>>>> consider alternative frameworks such as Kubernetes.
>>>>
>>>>
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Gourav Shenoy
>>>>
>>>>
>>>>
>>>> *From: *"Shenoy, Gourav Ganesh" <goshenoy@indiana.edu>
>>>> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Date: *Thursday, October 5, 2017 at 11:16 AM
>>>>
>>>> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Subject: *Re: Linked Container Services for Apache Airavata
>>>> Components - Phase 1 - Requirement identification
>>>>
>>>>
>>>>
>>>> Hi Dimuthu,
>>>>
>>>>
>>>>
>>>> Very good summary! I am not sure if you have come across it, but DC/OS
>>>> (DataCenter Operating System) is a container orchestration platform based
>>>> on Apache Mesos. The beauty of DC/OS is that it is easy and simple to
>>>> develop for and deploy on, yet extremely powerful on most parameters:
>>>> multi-datacenter, multi-cloud, scalability, high availability, fault
>>>> tolerance and load balancing. More importantly, the community support is
>>>> fantastic.
>>>>
>>>>
>>>>
>>>> DC/OS has an exhaustive service catalog; it's more like a PaaS for
>>>> containers (though not restricted to containers). You can run services
>>>> like Spark, Kafka, RabbitMQ, etc. out of the box with a single-click
>>>> install. And Apache Mesos as the underlying resource manager makes it
>>>> seamless to deploy applications across different datacenters. There is a
>>>> concept of SERVICE vs JOB: a service is considered long-running and DC/OS
>>>> will keep it running (if a service fails, it spins up a new one), whereas
>>>> jobs are one-time executors. This comes in handy when using DC/OS as a
>>>> target runtime for Airavata.
>>>>
>>>>
>>>>
>>>> We used DC/OS for our class project to run the distributed task
>>>> execution prototype we built (which uses RabbitMQ messaging). Here's a
>>>> link to my blog post explaining the process:
>>>> https://gouravshenoy.github.io/apache-airavata/spring17/2017/04/20/final-report.html
>>>> I have also attached a PDF paper we wrote as part of the class
>>>> explaining the task execution process and *one solution* using RabbitMQ
>>>> messaging.
>>>>
>>>>
>>>>
>>>> I had also started work on containerizing Airavata and on a unified
>>>> build + deployment mechanism with CI/CD on DC/OS. Unfortunately, I
>>>> couldn't complete it due to time constraints, but I would be more than
>>>> happy to work with you on this. Let me know and we can coordinate.
>>>>
>>>>
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Gourav Shenoy
>>>>
>>>>
>>>>
>>>> *From: *DImuthu Upeksha <dimuthu.upeksha2@gmail.com>
>>>> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Date: *Thursday, October 5, 2017 at 9:52 AM
>>>> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Subject: *Re: Linked Container Services for Apache Airavata
>>>> Components - Phase 1 - Requirement identification
>>>>
>>>>
>>>>
>>>> Hi Marlon,
>>>>
>>>>
>>>>
>>>> Thanks for the input. I understand your point about the availability
>>>> mode and will keep it in mind while designing the PoC. CI/CD is one I
>>>> missed; thanks for pointing it out.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>> On Thu, Oct 5, 2017 at 7:04 PM, Pierce, Marlon <marpierc@iu.edu> wrote:
>>>>
>>>> Thanks, Dimuthu, this is a good summary. Others may comment about
>>>> Kafka, stateful versus stateless parts of Airavata, etc.  You may also find
>>>> some of this discussion in the mailing list archives.
>>>>
>>>>
>>>>
>>>> Active-active vs. active-passive is a good question, and we have
>>>> typically thought of this in terms of individual Airavata components rather
>>>> than the whole system.  Some components can be active-active (like a
>>>> stateless application manager), while others (like the orchestrator example
>>>> you give below) are stateful and may be better as active-passive.
>>>>
>>>>
>>>>
>>>> There is also the issue of system updates and continuous deployments,
>>>> which could be added to your list.
>>>>
>>>>
>>>>
>>>> Marlon
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *"dimuthu.upeksha2@gmail.com" <dimuthu.upeksha2@gmail.com>
>>>> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Date: *Thursday, October 5, 2017 at 2:40 AM
>>>> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
>>>> *Subject: *Linked Container Services for Apache Airavata Components -
>>>> Phase 1 - Requirement identification
>>>>
>>>>
>>>>
>>>> Hi All,
>>>>
>>>>
>>>>
>>>> Over the last few days, I have been going through the requirements and
>>>> design of the current Airavata setup, and I identified the following
>>>> areas as the key focus areas for the technology evaluation phase:
>>>>
>>>>
>>>>
>>>> Microservices deployment platform (container management system)
>>>>
>>>>
>>>>
>>>> Possible candidates: Google Kubernetes, Apache Mesos, Apache Helix
>>>>
>>>> As most of the operational units of Airavata are supposed to move to a
>>>> microservices-based deployment pattern, having a unified deployment
>>>> platform to manage those microservices will make DevOps operations
>>>> easier and faster. On the other hand, although writing and maintaining a
>>>> single microservice is fairly straightforward, running multiple
>>>> microservices and monitoring and maintaining their lifecycles manually
>>>> in a production environment is a tiresome and complex operation. Using
>>>> such a deployment platform, we can easily automate many of the pain
>>>> points mentioned above.
>>>>
>>>>
>>>>
>>>> Scalability
>>>>
>>>>
>>>>
>>>> We need a solution that can scale easily depending on the load on
>>>> individual parts of the system. For example, the workers in the
>>>> post-processing pipeline should be able to scale up and down depending
>>>> on the events coming into the message queue.
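
One simple way to express the scaling rule above is to derive the desired number of workers from the current queue depth. The sketch below is hypothetical: the class name `WorkerAutoscaler`, the "messages per worker" target and the min/max bounds are assumptions for illustration, and the actual scaling action would be delegated to whichever container platform is chosen.

```java
// Hypothetical sizing rule: one worker per 'messagesPerWorker' queued events,
// clamped to a configured range.
final class WorkerAutoscaler {
    private final int minWorkers;
    private final int maxWorkers;
    private final int messagesPerWorker;

    WorkerAutoscaler(int minWorkers, int maxWorkers, int messagesPerWorker) {
        if (minWorkers < 0 || maxWorkers < minWorkers || messagesPerWorker <= 0) {
            throw new IllegalArgumentException("invalid autoscaler bounds");
        }
        this.minWorkers = minWorkers;
        this.maxWorkers = maxWorkers;
        this.messagesPerWorker = messagesPerWorker;
    }

    // Number of workers that should be running for the given queue depth.
    int desiredWorkers(long queueDepth) {
        // Integer ceiling of queueDepth / messagesPerWorker.
        long needed = (Math.max(0L, queueDepth) + messagesPerWorker - 1) / messagesPerWorker;
        return (int) Math.max(minWorkers, Math.min(maxWorkers, needed));
    }
}
```

A control loop would poll the queue depth periodically and ask the platform to converge on `desiredWorkers(depth)`.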
>>>>
>>>>
>>>>
>>>> Availability
>>>>
>>>>
>>>>
>>>> We need to support deploying the solution in multiple geographically
>>>> distant data centers. When evaluating container management systems, we
>>>> should consider this a primary requirement. However, one thing I am not
>>>> sure about is the availability mode that Airavata normally expects. Is
>>>> it active-active or active-passive?
>>>>
>>>>
>>>>
>>>> Service discovery
>>>>
>>>>
>>>>
>>>> Once we move to a microservice-based deployment pattern, there could
>>>> be scenarios where we need service discovery. For example, if we are
>>>> going to scale up the API Server to handle an increased load, we might
>>>> have to put a load balancer between the clients and the API Server
>>>> instances. In that case, service discovery is essential to provide the
>>>> load balancer with the healthy API Server endpoints that are currently
>>>> running in the system.
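
To make the load-balancer example concrete, here is a minimal in-memory sketch of a heartbeat-based registry. It is an illustration only: `ServiceRegistry`, its methods and the TTL rule are assumptions, and in practice this role would more likely be filled by the discovery mechanism of the chosen container platform.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical registry: an endpoint is healthy while its heartbeat is fresh.
final class ServiceRegistry {
    private final Duration ttl;
    private final Map<String, Instant> lastHeartbeat = new ConcurrentHashMap<>();

    ServiceRegistry(Duration ttl) {
        this.ttl = ttl;
    }

    // Called periodically by each API Server instance to report it is alive.
    void heartbeat(String endpoint, Instant now) {
        lastHeartbeat.put(endpoint, now);
    }

    // Endpoints whose last heartbeat is no older than the TTL, sorted by name.
    List<String> healthyEndpoints(Instant now) {
        Instant cutoff = now.minus(ttl);
        return lastHeartbeat.entrySet().stream()
                .filter(entry -> !entry.getValue().isBefore(cutoff))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }
}
```

The load balancer would refresh its backend list from `healthyEndpoints(...)`, so instances that stop sending heartbeats drop out without manual intervention.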
>>>>
>>>>
>>>>
>>>> Cluster coordination
>>>>
>>>>
>>>>
>>>> Although microservices are supposed to be stateless in most cases, we
>>>> might have scenarios where particular microservices have to hold some
>>>> state. For example, if we are going to implement a microservice that
>>>> performs the Orchestrator's role, there could be issues if we keep
>>>> multiple instances of it in several data centers to increase
>>>> availability. According to my understanding, there should be only one
>>>> Orchestrator running at a time, as it is the one that makes the
>>>> decisions in the job execution process. So, if we are going to keep
>>>> multiple instances of it running in the system, there should be some
>>>> sort of leader election among the Orchestrator quorum.
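
The "only one active Orchestrator" rule can be illustrated with a lease: an instance leads only while it holds an unexpired lease, and must renew it to stay leader. The sketch below is a single-JVM stand-in with hypothetical names (`LeaderLease`, `tryAcquire`); across data centers the lease would have to live in a coordination service rather than in an `AtomicReference`.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical lease-based election: at most one owner per lease period.
final class LeaderLease {
    private static final class Lease {
        final String owner;
        final long expiresAtMillis;

        Lease(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final long leaseMillis;
    private final AtomicReference<Lease> current = new AtomicReference<>();

    LeaderLease(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    // Acquires or renews the lease; returns true if 'candidate' is now leader.
    boolean tryAcquire(String candidate, long nowMillis) {
        Lease seen = current.get();
        boolean free = seen == null || seen.expiresAtMillis <= nowMillis;
        boolean mine = seen != null && seen.owner.equals(candidate);
        if (!free && !mine) {
            return false; // someone else holds a valid lease
        }
        // Compare-and-set so that two racing candidates cannot both win.
        return current.compareAndSet(seen, new Lease(candidate, nowMillis + leaseMillis));
    }
}
```

A standby Orchestrator would call `tryAcquire` on a timer and start making scheduling decisions only after it returns true.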
>>>>
>>>>
>>>>
>>>> Common messaging medium between microservices
>>>>
>>>>
>>>>
>>>> This might be out of scope, but I thought of sharing it with the team
>>>> to get a general idea. The idea was raised in the HipChat discussion with
>>>> Marlon and Gourav. Using a common messaging medium might enable
>>>> microservices to communicate in a decoupled manner, which will increase
>>>> the scalability of the system. For example, there is a reference
>>>> architecture that we can utilize with a Kafka-based messaging medium [1],
>>>> [2]. However, I noticed in one paper that Kafka was previously rejected
>>>> because writing clients was onerous. Please share your views on this, as
>>>> I'm not familiar with the existing fan-out model based on AMQP and its
>>>> pain points.
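
As a rough picture of what "decoupled" means here, the following in-memory sketch mimics a fan-out channel: the publisher knows only the channel, and every subscriber receives each message. `FanoutChannel` and its methods are hypothetical names; this is a stand-in for a broker, not an AMQP or Kafka client, and it omits persistence, ordering guarantees and failure handling.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-memory fan-out channel standing in for a message broker.
final class FanoutChannel<T> {
    private final List<Consumer<T>> subscribers = new CopyOnWriteArrayList<>();

    // Registers a consumer that will receive every message published later.
    void subscribe(Consumer<T> subscriber) {
        subscribers.add(subscriber);
    }

    // Delivers the message to every subscriber; returns how many received it.
    int publish(T message) {
        int delivered = 0;
        for (Consumer<T> subscriber : subscribers) {
            subscriber.accept(message);
            delivered++;
        }
        return delivered;
    }
}
```

New consumers can be added with `subscribe` without touching the publishing microservice, which is the property the paragraph above is after.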
>>>>
>>>>
>>>>
>>>> Those are the main areas that I have identified while going through
>>>> the current Airavata implementation and the requirements stated in some
>>>> of the research papers. Please let me know whether my understanding of
>>>> the above items is correct; suggestions are always welcome :)
>>>>
>>>>
>>>>
>>>> [1] https://medium.com/@ulymarins/an-introduction-to-apache-kafka-and-microservices-communication-bf0a0966d63
>>>>
>>>> [2] https://www.slideshare.net/ConfluentInc/microservices-in-the-apache-kafka-ecosystem
>>>>
>>>>
>>>>
>>>> References
>>>>
>>>>
>>>>
>>>> Marru, S., Gunathilake, L., Herath, C., Tangchaisin, P., Pierce, M.,
>>>> Mattmann, C., Singh, R., Gunarathne, T., Chinthaka, E., Gardler, R. and
>>>> Slominski, A., 2011, November. Apache airavata: a framework for distributed
>>>> applications and computational workflows. In Proceedings of the 2011 ACM
>>>> workshop on Gateway computing environments (pp. 21-28). ACM.
>>>>
>>>>
>>>>
>>>> Nakandala, S., Pamidighantam, S., Yodage, S., Doshi, N., Abeysinghe,
>>>> E., Kankanamalage, C.P., Marru, S. and Pierce, M., 2016, July. Anatomy of
>>>> the SEAGrid Science Gateway. In Proceedings of the XSEDE16 Conference on
>>>> Diversity, Big Data, and Science at Scale (p. 40). ACM.
>>>>
>>>>
>>>>
>>>> Pierce, Marlon E., Suresh Marru, Lahiru Gunathilake, Don Kushan
>>>> Wijeratne, Raminder Singh, Chathuri Wimalasena, Shameera Ratnayaka, and
>>>> Sudhakar Pamidighantam. "Apache Airavata: design and directions of a
>>>> science gateway framework." Concurrency and Computation: Practice and
>>>> Experience 27, no. 16 (2015): 4282-4291.
>>>>
>>>>
>>>>
>>>> Pierce, Marlon, Suresh Marru, Borries Demeler, Raminderjeet Singh, and
>>>> Gary Gorbet. "The apache airavata application programming interface:
>>>> overview and evaluation with the UltraScan science gateway." In Proceedings
>>>> of the 9th Gateway Computing Environments Workshop, pp. 25-29. IEEE Press,
>>>> 2014.
>>>>
>>>>
>>>>
>>>> Marru, Suresh, Marlon Pierce, Sudhakar Pamidighantam, and Chathuri
>>>> Wimalasena. "Apache Airavata as a laboratory: architecture and case study
>>>> for component-based gateway middleware." In Proceedings of the 1st
>>>> Workshop on The Science of Cyberinfrastructure: Research, Experience,
>>>> Applications and Models, pp. 19-26. ACM, 2015.
>>>>
>>>>
>>>>
>>>> Thanks
>>>>
>>>> Dimuthu
>>>>
>>>>
>>>>
>>>
>>>
>
