From: Supun Kamburugamuve
Date: Fri, 7 Oct 2016 21:46:06 -0400
Subject: Re: Work Stealing is not a good solution for Airavata.
To: dev
Reply-To: dev@airavata.apache.org

Hi Shameera,

The plan we discussed offline is a good one and it should solve most of the issues discussed here. If you can describe that plan to the others as well, we may be able to come to a good conclusion.

Thanks,
Supun..

On Fri, Oct 7, 2016 at 5:07 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
It seems I might have missed some important parts of the Airavata architecture. I think it is time for me to step back and re-evaluate why I see these big holes in our design. Thanks, Amila, for following this thread so far.

Thanks,
Shameera.

On Fri, Oct 7, 2016 at 12:50 PM Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
On Fri, Oct 7, 2016 at 12:18 PM Amila Jayasekara <thejaka.amila@gmail.com> wrote:
You misunderstood.

Let me rephrase what I mentioned earlier. (You said) "No, it is still valid, thread goes to thread pool doesn't say *worker* is complete that request, *it* is waiting until actual hpc job runs on target computer resources."
Try to read the above statement and explain to me what you refer to as "worker" and "it" (in bold)?

Worker means whoever consumes from the queue (to be specific, GFac). "It" means the worker.

Here is how I understood the above statement. You submit a job in a thread and then you put that thread in a "wait" state until the job finishes. Is that correct?

No, that would be bad; we are not putting the thread into a wait state. We release it until the submitted job finishes. The next time GFac starts to process the same request, it can be the same thread or a different thread from the thread pool.
Also, others, let me know whether I read the statement incorrectly.

-AJ

On Fri, Oct 7, 2016 at 11:46 AM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
Glad you got that. Also, do you see when the message comes in (very first right arrow) and when GFac sends the ack for that message (very last left arrow)? That is the lifetime of one worker queue message.

On Fri, Oct 7, 2016 at 9:57 AM Amila Jayasekara <thejaka.amila@gmail.com> wrote:
As per this diagram, it seems the thread that submits the job is not the same as the thread that handles output.
At least that is what I understand.

-AJ

On Thu, Oct 6, 2016 at 4:26 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
Previous attachment doesn't work.

On Thu, Oct 6, 2016 at 4:24 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
[image: Work Queue Message Life time.png]
Hi Amila,

Please find the work queue message execution sequence diagram below. I hope this will help in understanding how it works in Airavata.



On Thu, Oct 6, 2016 at 4:05 PM Suresh Marru <smarru@apache.org> wrote:
Just a quick top post. This is an informative discussion, please continue :)

I agree that Airavata does not do Work Stealing but implements "Work Queues". Conceptually they are similar to the OS kernel-level work queues, but in a distributed context - https://www.kernel.org/doc/Documentation/workqueue.txt

Suresh

On Oct 6, 2016, at 3:52 PM, Amila Jayasekara <thejaka.amila@gmail.com> wrote:

On Thu, Oct 6, 2016 at 3:17 PM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:


On Thu, Oct 6, 2016 at 2:50 PM Amila Jayasekara <thejaka.amila@gmail.com> wrote:
On Thu, Oct 6, 2016 at 11:07 AM, Shameera Rathnayaka <shameerainfo@gmail.com> wrote:
Hi Amila,

-- Please explain how you used "work stealing" in a distributed system. That would be interesting.

Airavata depends on work stealing + AMQP for the following:
Fault Tolerance - This is one of the major distributed-systems problems, and it is critical in Airavata. Whatever the reason, experiment request processing shouldn't be affected by an internal node failure. Even with node failures, Airavata should be capable of continuing experiment request processing, or holding it until at least one node appears and then continuing. The way this is handled in Airavata is that a worker acks a message only after it has completely processed it. If the node goes down without sending acks for the messages it was processing, then RabbitMQ puts all of these un-acked messages back on the queue, where they are available to be consumed again.
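
The ack-after-processing behaviour described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyBroker`, its methods, and the worker and experiment names are hypothetical stand-ins, not Airavata or RabbitMQ APIs. It shows a message being returned to the queue when the worker holding it dies before acking.

```python
from collections import deque


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker with manual acks."""

    def __init__(self):
        self.ready = deque()   # messages waiting for a consumer
        self.unacked = {}      # message -> consumer currently holding it

    def publish(self, message):
        self.ready.append(message)

    def deliver(self, consumer):
        """Hand the next ready message to a consumer and track it as un-acked."""
        message = self.ready.popleft()
        self.unacked[message] = consumer
        return message

    def ack(self, message):
        """The consumer finished the message, so the broker can forget it."""
        del self.unacked[message]

    def consumer_died(self, consumer):
        """Put every message the dead consumer still held back on the queue."""
        for message, owner in list(self.unacked.items()):
            if owner == consumer:
                del self.unacked[message]
                self.ready.append(message)


broker = ToyBroker()
broker.publish("experiment-1")
broker.publish("experiment-2")

first = broker.deliver("worker-A")
broker.ack(first)                    # worker-A fully processed experiment-1
second = broker.deliver("worker-A")  # worker-A crashes before acking this one
broker.consumer_died("worker-A")

redelivered = broker.deliver("worker-B")
print(redelivered)
```

Only the un-acked message is redelivered; the acked one is gone for good, which is why the ack is sent last.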

Resource Utilization - Another important goal of a distributed system is to effectively use the available resources in the system, namely the memory and processors of its components. In Airavata this decides the throughput and response time of experiments. Currently, at a given time workers only get messages up to a preconfigured limit (the limit is the prefetch count). But most of these jobs are async jobs. That means that after a worker gets a fixed number of jobs, it won't get any other jobs even if the worker is capable of handling more jobs, which is a waste of worker resources.
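
The prefetch limitation described above can be seen in a toy model. This is a sketch under assumptions: `PREFETCH_COUNT`, `push_messages`, and the job names are hypothetical, and the limit of 2 is chosen only for illustration. It shows that once the worker holds its quota of un-acked messages, nothing more is delivered until an ack arrives, even though the worker itself is idle while the jobs run remotely.

```python
from collections import deque

PREFETCH_COUNT = 2  # hypothetical limit on un-acked messages per worker


def push_messages(ready, in_flight, prefetch_count):
    """Deliver ready messages until the worker holds prefetch_count un-acked ones."""
    while ready and len(in_flight) < prefetch_count:
        in_flight.append(ready.popleft())


ready = deque(f"job-{i}" for i in range(5))
in_flight = []  # delivered to the worker, not yet acked

push_messages(ready, in_flight, PREFETCH_COUNT)
after_first_push = (list(in_flight), len(ready))

# Both jobs are now running remotely; the worker's threads are idle,
# yet nothing more is delivered because no ack has been sent.
push_messages(ready, in_flight, PREFETCH_COUNT)
while_waiting = (list(in_flight), len(ready))

# job-0 finishes and is acked, which frees exactly one slot.
in_flight.remove("job-0")
push_messages(ready, in_flight, PREFETCH_COUNT)
after_ack = (list(in_flight), len(ready))

print(after_first_push, while_waiting, after_ack)
```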


You still did not answer my question. I want to know how you used "work stealing" in your implementation. In other words, how does distributed work stealing work in your implementation? The details you gave above are unrelated and do not answer my question.

I think I have explained how we use work stealing (work queues). If you are looking for a solution more analogous to parallel-computing work stealing, then that is hard to explain.

No, you have not. :-)
Work stealing != work queues. In a distributed setting I would imagine the following kind of work stealing implementation: every worker (orchestrator) maintains a request queue locally and serves requests coming to its local queue. Whenever one worker runs out of requests to serve, it queries the other distributed workers' local queues to see whether there are requests that it can serve. If there are, it can steal requests from the other workers' local queues and process them. However, this model of computation is inefficient in a distributed environment. I guess that is the reason we don't find many distributed work stealing implementations.
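
The work stealing model imagined above can be sketched in a single process. The `Worker` class, its `next_request` method, and the request names are hypothetical, and the peer lookup here is a plain in-memory loop; in a real distributed deployment each peer query would be a network round trip, which is the cost the paragraph above points at.

```python
from collections import deque


class Worker:
    """Hypothetical worker that owns a local request queue."""

    def __init__(self, name, requests=()):
        self.name = name
        self.local_queue = deque(requests)

    def next_request(self, peers):
        """Serve from the local queue, or steal from a peer when it is empty."""
        if self.local_queue:
            return self.local_queue.popleft(), self.name
        for peer in peers:
            if peer is not self and peer.local_queue:
                # Steal from the opposite end to the one the owner serves from.
                return peer.local_queue.pop(), peer.name
        return None, None


busy = Worker("worker-A", ["req-1", "req-2", "req-3"])
idle = Worker("worker-B")
workers = [busy, idle]

stolen, source = idle.next_request(workers)
own, own_source = busy.next_request(workers)
print(stolen, source, own, own_source)
```

By contrast, in a shared work queue no worker ever inspects another worker's queue; every worker simply consumes from the one queue.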

Anyhow, let's stop the discussion about work stealing now. :-)

-- I don't see AMQP in the architecture diagram you attached above, and I don't understand why Airavata has to depend on it. One way to figure this out is to think about the architecture without AMQP, figure out what should actually happen, and then look for a way to do that using AMQP.

The worker queue is an AMQP queue.

Does the worker queue need to be an AMQP queue? Sorry, I don't know much about AMQP, but it sounds like the limitations you are explaining are because of AMQP.

It does not, but it is good to use a well-defined protocol instead of a custom one. Almost all messaging systems have implemented the AMQP protocol.

Can we figure out whether others have also encountered the same or a similar problem, and how they tackled it with AMQP? The design we have is pretty straightforward, and I believe there are systems analogous to our design that use AMQP.
-- Does this mean that you have a waiting thread or process within Airavata after submitting the job (for each piece of work)?
No, once the job is submitted to the remote resource, the thread goes back to the thread pool.

Then your previous explanation (i.e., "The time needs for a worker to finish the work is depend on the application run time (applications runs on HPC machine). Theoretically, this can be from few sec to days or even more.") is invalidated. Correct?

No, it is still valid. A thread going back to the thread pool doesn't mean the worker has completed that request; the worker is waiting until the actual HPC job runs on the target compute resources. After the HPC job completes, output data staging happens. After the output is staged to storage, the worker acks the work queue message.
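
The distinction between a thread finishing and a request finishing can be sketched with Python's standard `concurrent.futures` thread pool. The function names, the `events` log, and the request id are hypothetical and are not GFac code; the completion trigger is assumed to arrive from outside, such as a job-completion email. The sketch shows the submission and the output staging running as two separate short tasks, with no thread blocked in between and the ack sent only at the very end.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)
events = []
lock = threading.Lock()


def record(step, request_id):
    """Append a step to the shared log in a thread-safe way."""
    with lock:
        events.append((step, request_id))


def submit_job(request_id):
    """Submit to the remote resource, then return so the thread is free again."""
    record("submitted", request_id)


def stage_output_and_ack(request_id):
    """Runs later, on whichever pool thread is free, once the job is done."""
    record("output staged", request_id)
    record("acked", request_id)


# Phase 1: submission; the pool thread is released as soon as this returns.
pool.submit(submit_job, "exp-1").result()

# ... the HPC job runs for seconds or days; no thread is blocked meanwhile ...

# Phase 2: a completion notification triggers the rest of the work for the
# same queue message, possibly on a different thread.
pool.submit(stage_output_and_ack, "exp-1").result()
pool.shutdown()

print(events)
```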

This is confusing to me.
Does this mean that once you return a thread to the thread pool, it is not reusable for another request? Also, how do you wait on a thread after returning it to the thread pool?
Also, why do you have to wait for the HPC job to complete? I was under the impression that the communication is asynchronous, i.e., after the job completes you get an email confirmation and then you start output data staging in a separate thread.

We should probably meet and verbally discuss this.

-AJ

Thanks,
Shameera.

Thanks,
Shameera.

It will take more time for me to digest the following right now. I will try to give more feedback when I properly understand it.

Thanks
-Amila

--
Shameera Rathnayaka



--
Supun Kamburugamuve
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun@apache.org; Mobile: +1 812 219 2563

