airavata-dev mailing list archives

From Supun Nakandala <supun.nakand...@gmail.com>
Subject Re: [#Spring17-Airavata-Courses] : Distributed Workload Management for Airavata
Date Mon, 06 Feb 2017 17:15:05 GMT
Hi Gourav,

I don't think we need a separate microservice for each task. I
favor a single microservice that can execute all tasks (in other words,
a generic task execution microservice). Of course, we can run many
instances of it when we want to scale. WDYT?
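
To make the idea concrete, here is a minimal Python sketch of a generic task executor. The task names (env_setup, data_staging), the message shape and the registry are hypothetical and only for illustration; they are not existing Airavata code.

```python
from typing import Callable, Dict

# Hypothetical registry: task type -> handler function.
HANDLERS: Dict[str, Callable[[dict], str]] = {}


def handler(task_type: str):
    """Register a function as the handler for one task type."""
    def register(fn: Callable[[dict], str]) -> Callable[[dict], str]:
        HANDLERS[task_type] = fn
        return fn
    return register


@handler("env_setup")
def env_setup(payload: dict) -> str:
    return f"environment ready for {payload['job_id']}"


@handler("data_staging")
def data_staging(payload: dict) -> str:
    return f"staged inputs for {payload['job_id']}"


def execute(task: dict) -> str:
    """Single entry point: any instance of the service can run any task type."""
    fn = HANDLERS.get(task["type"])
    if fn is None:
        raise ValueError(f"unknown task type: {task['type']}")
    return fn(task["payload"])
```

Scaling then means starting more identical instances that all call execute(), rather than deploying one service per task type.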

On Sun, Feb 5, 2017 at 3:07 PM, Shenoy, Gourav Ganesh <goshenoy@indiana.edu>
wrote:

> Hi dev,
>
>
>
> We were brainstorming some potential designs that might help us with this
> problem. One possible option would be to have a “workflow micro-service”
> which would basically be the mediator/orchestrator for deciding which
> micro-service should be executed next – based on the type of the job. The
> motive is to make micro-services independent of the workflow; i.e. a
> micro-service implementation should not be aware of which micro-service
> will be executed next and we should have a central control of deciding this
> pattern.
>
> Eg: For job type X, the pattern could be A -> B -> C -> D. Whereas for job
> type Y, the pattern could be A -> C -> D; and so on.
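
The central routing decision described above can be sketched in a few lines of Python. This is only an illustration: the table reuses the job types X and Y and the service names A to D from the example, and next_service is a hypothetical helper, not part of any proposed API.

```python
from typing import Dict, List, Optional

# Job type -> ordered list of micro-services, as in the example above.
PATTERNS: Dict[str, List[str]] = {
    "X": ["A", "B", "C", "D"],
    "Y": ["A", "C", "D"],
}


def next_service(job_type: str, current: Optional[str]) -> Optional[str]:
    """Return the service to run after `current`, or None when the job is done.

    Pass current=None to get the first service for a job type.
    """
    steps = PATTERNS[job_type]
    if current is None:
        return steps[0]
    idx = steps.index(current)
    return steps[idx + 1] if idx + 1 < len(steps) else None
```

With this shape, only the workflow micro-service knows the patterns; A, B, C and D never reference each other.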
>
>
>
> An initial design with this idea looks as follows:
>
>
>
>
>
> We would have a common messaging framework (implementation has not been
> decided yet). The database associated with the workflow micro-service could
> be a graph database (maybe?) – again the implementation/technology has not
> been decided yet.
>
>
>
> This is just a proposed design, and I would love to hear your thoughts on
> this and any suggestions/comments if any. If there is anything that we are
> missing or should consider, please do let us know.
>
>
>
> Thanks and Regards,
>
> Gourav Shenoy
>
>
>
> *From: *"Christie, Marcus Aaron" <machrist@iu.edu>
> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Date: *Friday, February 3, 2017 at 9:21 AM
>
> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Subject: *Re: [#Spring17-Airavata-Courses] : Distributed Workload
> Management for Airavata
>
>
>
> Vidya,
>
>
>
> I’m not sure how relevant it is, but it occurs to me that a microservice
> that executes jobs on a cloud requires very little in terms of resources to
> submit and monitor that job on the cloud. It doesn’t really matter if the
> job is a “big” or a “small” job.  So I’m not sure what heuristic makes
> sense regarding distributing work to these job execution microservices.
> Maybe a simple round robin approach would be sufficient.
>
>
>
> I think a job scheduling algorithm does make sense, however, for a higher
> level component, some sort of metascheduler that understands what resources
> are available on the cloud resources on which the jobs will be running.
> The metascheduler could create work for the job execution microservices to
> run on particular cloud resources in a way that optimizes for some metric
> (e.g., throughput).
>
>
>
> Thanks,
>
>
>
> Marcus
>
>
>
> On Feb 3, 2017, at 3:19 AM, Vidya Sagar Kalvakunta <vkalvaku@umail.iu.edu>
> wrote:
>
>
>
> Ajinkya,
>
>
>
> My scenario is for workload distribution among multiple instances of the
> same microservice.
>
>
>
> If a message broker needs to distribute the available jobs among multiple
> workers, the common approach would be to use round robin or a similar
> algorithm. This approach works best when all the workers are similar and
> the jobs are equal.
>
>
>
> So I think that a genetic or heuristic job scheduling algorithm, which is
> also aware of each of the worker's current state (CPU, RAM, No of Jobs
> processing) can more efficiently distribute the jobs. The workers can
> periodically ping the message broker with their current state info.
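
A rough Python sketch of state-aware distribution, assuming workers report CPU, RAM and job count as described above. The scoring weights are an arbitrary assumption for illustration, and this is a simple least-loaded heuristic, not a genetic algorithm.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class WorkerState:
    cpu: float   # CPU utilisation, 0.0 to 1.0
    ram: float   # RAM utilisation, 0.0 to 1.0
    jobs: int    # number of jobs currently processing


class Scheduler:
    def __init__(self) -> None:
        self.workers: Dict[str, WorkerState] = {}

    def report(self, worker_id: str, state: WorkerState) -> None:
        """Called whenever a worker pings with its current state."""
        self.workers[worker_id] = state

    @staticmethod
    def load(state: WorkerState) -> float:
        # Assumed scoring: equal weight on CPU and RAM, plus 0.1 per running job.
        return 0.5 * state.cpu + 0.5 * state.ram + 0.1 * state.jobs

    def pick(self) -> str:
        """Return the id of the worker with the lowest load score."""
        if not self.workers:
            raise RuntimeError("no workers have reported yet")
        return min(self.workers, key=lambda w: self.load(self.workers[w]))
```

Round robin would ignore the reported state entirely; here a busy worker is skipped until its next ping shows it has capacity again.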
>
>
>
> The other advantage of using a customized algorithm is that it can
> be tweaked to use embedded routing, priority or other information in the
> job metadata to resolve all of the concerns raised by Amruta, viz. message
> grouping, ordering, repeated messages, etc.
>
>
>
> We can even enforce data privacy, e.g. if the workers are spread across
> multiple compute clusters, say AWS and IU Big Red, and we want to restrict
> certain sensitive jobs to run only on Big Red.
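
The cluster restriction could be expressed as a filter over job metadata. In this Python sketch the worker registry, the cluster labels and the allowed_clusters field are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Set

# Hypothetical worker registry: worker id -> cluster it runs on.
WORKER_CLUSTER: Dict[str, str] = {"w1": "aws", "w2": "bigred", "w3": "aws"}


def eligible_workers(job: dict) -> List[str]:
    """Return workers allowed to run this job, based on its metadata.

    A job without an "allowed_clusters" entry may run anywhere.
    """
    allowed: Optional[Set[str]] = job.get("allowed_clusters")
    return sorted(
        worker
        for worker, cluster in WORKER_CLUSTER.items()
        if allowed is None or cluster in allowed
    )
```

The scheduler would apply this filter first and then choose among the remaining workers by whatever load heuristic it uses.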
>
>
>
> Some distributed job scheduling algorithms for cloud computing:
>
>    - http://www.ijimai.org/journal/sites/default/files/files/2013/03/ijimai20132_18_pdf_62825.pdf
>    - https://arxiv.org/pdf/1404.5528.pdf
>
>
>
>
>
> Regards
>
> Vidya Sagar
>
>
>
> On Fri, Feb 3, 2017 at 1:38 AM, Kamat, Amruta Ravalnath <
> arkamat@indiana.edu> wrote:
>
> Hello all,
>
>
>
> Adding more information to the message based approach. Messaging is a key
> strategy employed in many distributed environments. Message queuing is
> ideally suited to performing asynchronous operations. A sender can post a
> message to a queue, but it does not have to wait while the message is
> retrieved and processed. A sender and receiver do not even have to be
> running concurrently.
>
>
>
> With message queuing there can be 2 possible scenarios:
>
>    1. Sending and receiving messages using a *single message queue*.
>    2. *Sharing a message queue* between many senders and receivers.
>
> When a message is retrieved, it is removed from the queue. A message
> queue may also support message peeking. This mechanism can be useful if
> several receivers are retrieving messages from the same queue, but each
> receiver only wishes to handle specific messages. The receiver can examine
> the message it has peeked, and decide whether to retrieve the message
> (which removes it from the queue) or leave it on the queue for another
> receiver to handle.
>
>
>
> A few basic message queuing patterns are:
>
>    1. *One-way messaging*: The sender simply posts a message to the queue
>    in the expectation that a receiver will retrieve it and process it at some
>    point.
>    2. *Request/response messaging*: In this pattern a sender posts a
>    message to a queue and expects a response from the receiver. The sender can
>    resend if the message is not delivered. This pattern typically requires
>    some form of correlation to enable the sender to determine which response
>    message corresponds to which request sent to the receiver.
>    3. *Broadcast messaging*: In this pattern a sender posts a message to
>    a queue, and multiple receivers can read a copy of the message. This
>    pattern depends on the message queue being able to disseminate the same
>    message to multiple receivers. There is a queue to which the senders can
>    post messages that include metadata in the form of attributes. Each
>    receiver can create a subscription to the queue, specifying a filter that
>    examines the values of message attributes. Any messages posted to the
>    queue with attribute values that match the filter are automatically
>    forwarded to that subscription.
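
As an illustration of the correlation needed for request/response messaging, here is a small Python sketch using in-memory queues in place of a real broker. The queue names and the message fields (correlation_id, body) are assumptions, and the receiver simply upper-cases the request body as stand-in work.

```python
import queue
import uuid

# In-memory stand-ins for a request queue and a response queue.
requests = queue.Queue()
responses = queue.Queue()


def send_request(body: str) -> str:
    """Post a request and return the correlation id the sender should remember."""
    correlation_id = str(uuid.uuid4())
    requests.put({"correlation_id": correlation_id, "body": body})
    return correlation_id


def serve_one() -> None:
    """Receiver side: process one request and echo its correlation id back."""
    msg = requests.get()
    responses.put({"correlation_id": msg["correlation_id"],
                   "body": msg["body"].upper()})


def collect_responses() -> dict:
    """Sender side: drain the response queue into a map keyed by correlation id."""
    matched = {}
    while not responses.empty():
        msg = responses.get()
        matched[msg["correlation_id"]] = msg["body"]
    return matched
```

Because each response carries the id of its request, the sender can match them even when responses arrive out of order.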
>
> A solution based on asynchronous messaging might need to address a number
> of concerns:
>
>
>
>    - *Message ordering and grouping:* Process messages either in the
>    order they are posted or in a specific order based on priority. Also, there
>    may be occasions when it is difficult to eliminate dependencies, and it may
>    be necessary to group messages together so that they are all handled by the
>    same receiver.
>    - *Idempotency:* Ideally the message processing logic in a receiver should
>    be idempotent so that, if the work performed is repeated, this repetition
>    does not change the state of the system.
>    - *Repeated messages:* Some message queuing systems implement duplicate
>    message detection and removal based on message IDs.
>    - *Poison messages:* A poison message is a message that cannot be handled,
>    often because it is malformed or contains unexpected information.
>    - *Message expiration:* A message might have a limited lifetime, and if it
>    is not processed within this period it might no longer be relevant and
>    should be discarded.
>    - *Message scheduling:* A message might be temporarily embargoed and should
>    not be processed until a specific date and time. The message should not be
>    available to a receiver until this time.
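
Several of these concerns can be handled on the receiver side. The Python sketch below covers repeated, poison and expired messages under an assumed JSON message format (id, key, amount, expires_at); the field names and the Receiver class are hypothetical.

```python
import json
from typing import Dict, List, Set


class Receiver:
    def __init__(self) -> None:
        self.seen: Set[str] = set()         # ids of messages already processed
        self.totals: Dict[str, int] = {}    # the state that messages modify
        self.dead_letters: List[str] = []   # poison messages set aside

    def handle(self, raw: str, now: float) -> str:
        """Process one raw message and report what happened to it."""
        try:
            msg = json.loads(raw)
            msg_id, key = msg["id"], msg["key"]
            amount = int(msg["amount"])
            expires_at = float(msg["expires_at"])
        except (ValueError, KeyError, TypeError):
            # Malformed or unexpected content: park it instead of retrying forever.
            self.dead_letters.append(raw)
            return "poison"
        if now > expires_at:
            return "expired"
        if msg_id in self.seen:
            # Redelivery of a processed message must not change state again.
            return "duplicate"
        self.seen.add(msg_id)
        self.totals[key] = self.totals.get(key, 0) + amount
        return "processed"
```

Tracking message ids makes a non-idempotent update (adding to a total) safe under redelivery; a real system would need to persist the seen set.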
>
>
> Thanks
>
> Amruta Kamat
>
>
> ------------------------------
>
> *From:* Shenoy, Gourav Ganesh <goshenoy@indiana.edu>
> *Sent:* Thursday, February 2, 2017 7:57 PM
> *To:* dev@airavata.apache.org
>
>
> *Subject:* Re: [#Spring17-Airavata-Courses] : Distributed Workload
> Management for Airavata
>
>
>
> Hello all,
>
>
>
> Amila, Sagar, thank you for the response and raising those concerns; and
> apologies because my email framed the topic of workload management in
> terms of how micro-services communicate. As Ajinkya rightly mentioned,
> there exists some sort of correlation between micro-services communication
> and its impact on how that micro-service performs the work under those
> circumstances. The goal is to make sure we have maximum independence
> between micro-services, and investigate the workflow pattern in which these
> micro-services will operate such that we can find the right balance between
> availability & consistency. Again, from our preliminary analysis we can
> assert that these solutions may not be generic and the specific use-case
> will have a big decisive role.
>
>
>
> For starters, we are focusing on the following example – and I think this
> will clarify what exactly we are trying to investigate.
>
>
>
> *Our test example *
>
> Say we have the following 4 micro-services, which each perform a specific
> task as mentioned in the box.
>
>
>
> <image001.png>
>
>
>
>
>
> *A stateful pattern to distribute work*
>
> <image002.png>
>
>
>
> Here each communication between micro-services could be via RPC or
> Messaging (e.g. RabbitMQ). An obvious disadvantage is that if any micro-service
> is down, then the system availability is at stake. In this test example, we
> can see that Microservice-A coordinates the work and maintains the state
> information.
>
>
>
> *A stateless pattern to distribute work*
>
>
>
> <image003.png>
>
>
>
> Another purely asynchronous approach would be to associate message-queues
> with each micro-service, where each micro-service performs its task,
> submits a request (message on bus) to the next micro-service, and continues
> to process more requests. This ensures more availability, and perhaps we
> might need to handle corner cases for failures such as message broker down,
> or message loss, etc.
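
A minimal Python sketch of this queue-per-service pattern, using the standard library queue module in place of a real message broker. The Stage class and the two example stages are hypothetical; each stage only knows its own inbox and outbox, not which service comes next.

```python
import queue
from typing import Callable, Optional


class Stage:
    """One micro-service: reads from its own queue, writes to the next queue."""

    def __init__(self, work: Callable[[list], list],
                 inbox: queue.Queue, outbox: Optional[queue.Queue] = None) -> None:
        self.work = work
        self.inbox = inbox
        self.outbox = outbox

    def step(self) -> bool:
        """Process one message if available; return False when the inbox is empty."""
        try:
            msg = self.inbox.get_nowait()
        except queue.Empty:
            return False
        result = self.work(msg)
        if self.outbox is not None:
            self.outbox.put(result)
        return True


# Wiring for two stages; each message is a list recording the stages it visited.
q_a, q_b, q_done = queue.Queue(), queue.Queue(), queue.Queue()
stage_a = Stage(lambda m: m + ["A"], q_a, q_b)
stage_b = Stage(lambda m: m + ["B"], q_b, q_done)
```

No stage holds state between messages, so a crashed stage can be restarted or replicated and simply resumes reading its queue. Broker downtime and message loss are not handled in this sketch.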
>
>
>
> As mentioned, these are just a few proposals that we are planning to
> investigate via a prototype project. We will inject corner cases/failures and try
> to find ways to handle them. I would love to hear more
> thoughts/questions/suggestions.
>
>
>
> Thanks and Regards,
>
> Gourav Shenoy
>
>
>
> *From: *Ajinkya Dhamnaskar <adhamnas@umail.iu.edu>
> *Reply-To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Date: *Thursday, February 2, 2017 at 2:22 AM
> *To: *"dev@airavata.apache.org" <dev@airavata.apache.org>
> *Subject: *Re: [#Spring17-Airavata-Courses] : Distributed Workload
> Management for Airavata
>
>
>
> Hello all,
>
>
>
> Just a heads up. Here the name Distributed workload management does not
> necessarily mean having different instances of a microservice and then
> distributing work among these instances.
>
>
>
> The problem is how to make each microservice work
> independently on top of a concrete distributed communication infrastructure. So,
> think of it as a workflow where each microservice does its part of the work and
> communicates its output (how is yet to be decided). The next
> microservice in the pipeline picks up that output and takes it further
> towards the final outcome. The crux here is that none of the
> microservices needs to worry about the other microservices in the pipeline.
>
>
>
> Vidya Sagar,
>
> I completely second your opinion of having stateless microservices; in
> fact, that is the key. With stateless microservices it is difficult to
> guarantee consistency in a system, but it solves the availability problem to
> some extent. I would be interested to understand what you mean by "an
> intelligent job scheduling algorithm, which receives real-time updates from
> the microservices with their current state information".
>
>
>
> On Wed, Feb 1, 2017 at 11:48 PM, Vidya Sagar Kalvakunta <
> vkalvaku@umail.iu.edu> wrote:
>
>
>
> On Wed, Feb 1, 2017 at 2:37 PM, Amila Jayasekara <thejaka.amila@gmail.com>
> wrote:
>
> Hi Gourav,
>
>
>
> Sorry, I did not understand your question. Specifically I am having
> trouble relating "work load management" to options you suggest (RPC,
> message based etc.).
>
> So what exactly do you mean by "workload management"?
>
> What is work in this context?
>
>
>
> Also, I did not understand what you meant by "the most efficient way".
> Efficient in terms of what? Are you looking at speed?
>
>
>
> As per your suggestions, it seems you are trying to find a way to
> communicate between micro services. RPC might be troublesome if you need to
> communicate with processes separated by a firewall.
>
>
>
> Thanks
>
> -Thejaka
>
>
>
>
>
> On Wed, Feb 1, 2017 at 12:52 PM, Shenoy, Gourav Ganesh <
> goshenoy@indiana.edu> wrote:
>
> Hello dev, arch,
>
>
>
> As part of this Spring’17 Advanced Science Gateway Architecture course, we
> are working on trying to debate and find possible solutions to the issue of
> managing distributed workloads in Apache Airavata. This leads to the
> discussion of finding the most efficient way that different Airavata
> micro-services should communicate and distribute work, in such a way that:
>
> 1.       We maintain the ability to scale these micro-services whenever
> needed (autoscale perhaps?).
>
> 2.       Achieve fault tolerance.
>
> 3.       We can deploy these micro-services independently, or better in a
> containerized manner – keeping in mind the ability to use devops for
> deployment.
>
>
>
> As of now the options we are exploring are:
>
> 1.       RPC based communication
>
> 2.       Message based – either master-worker, or work-queue, etc.
>
> 3.       A combination of both these approaches
>
>
>
> I am more inclined towards exploring the message based approach, but again
> there arises the possibility of handling limitations/corner cases of
> message broker such as downtimes (maybe more). In my opinion, having
> asynchronous communication will help us achieve most of the above-mentioned
> points. Another debatable issue is making the micro-services implementation
> stateless, such that we do not have to pass the state information between
> micro-services.
>
>
>
> I would love to hear any thoughts/suggestions/comments on this topic and
> open up a discussion via this mail thread. If there is anything that I have
> missed which is relevant to this issue, please let me know.
>
>
>
> Thanks and Regards,
>
> Gourav Shenoy
>
>
>
>
>
> Hi Gourav,
>
>
>
> Correct me if I'm wrong, but I think this is a case of the job shop
> scheduling problem, as we may have 'n' jobs of varying processing times
> and memory requirements, and we have 'm' microservices with possibly
> different computing and memory capacities, and we are trying to minimize
> the makespan <https://en.wikipedia.org/wiki/Makespan>.
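
To show what minimizing the makespan means, here is a Python sketch of the greedy longest-processing-time-first (LPT) heuristic. It assumes identical workers and known job times, which is a simplification of the scenario above (it ignores differing capacities and memory requirements), and it is a heuristic rather than an exact solver.

```python
import heapq
from typing import List


def lpt_makespan(job_times: List[float], machines: int) -> float:
    """Greedy LPT: give each job, longest first, to the least-loaded machine.

    Returns the resulting makespan (the finish time of the busiest machine).
    """
    if machines < 1:
        raise ValueError("need at least one machine")
    loads = [0.0] * machines   # min-heap of current machine loads
    heapq.heapify(loads)
    for t in sorted(job_times, reverse=True):
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + t)
    return max(loads)
```

For job times [3, 3, 2, 2, 2] on two machines the heuristic yields a makespan of 7, while the optimum is 6 (3+3 on one machine, 2+2+2 on the other), so it is fast but not always optimal.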
>
>
>
> For this use-case, I'm in favor of a highly available and consistent message
> broker with an intelligent job scheduling algorithm, which receives
> real-time updates from the microservices with their current state
> information.
>
>
>
> As for the state vs stateless implementation, I think that question
> depends on the functionality of a particular microservice. In a broad
> sense, the stateless implementation should be preferred as it will scale
> better horizontally.
>
>
>
>
>
> Regards,
>
> Vidya Sagar
>
>
>
>
> --
>
> Vidya Sagar Kalvakunta | Graduate MS CS Student | IU School of Informatics
> and Computing | Indiana University Bloomington | (812) 691-5002 |
> vkalvaku@iu.edu
>
>
>
>
>
> --
>
> Thanks and regards,
>
>
>
> Ajinkya Dhamnaskar
>
> Student ID : 0003469679
>
> Masters (CS)
>
> +1 (812) 369-5416
>
>
>
>
>
>
>
>



-- 
Thank you
Supun Nakandala
Dept. Computer Science and Engineering
University of Moratuwa
