From: Imesh Gunaratne
Date: Wed, 11 Mar 2015 00:12:07 +0530
Subject: Re: [Discuss] Stratos restart issues
To: dev
Reply-To: dev@stratos.apache.org
Mailing-List: contact dev-help@stratos.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=001a11470c8a36a4aa0510f38306

--001a11470c8a36a4aa0510f38306
Content-Type: text/plain; charset=UTF-8

IMO we could introduce a singleton Map to hold the state of each component.
When a component becomes active, it updates this map. When one component
needs to talk to another, it checks that component's state first.

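A minimal sketch of such a state map, assuming all components run in a single JVM. The names ComponentStateRegistry, Component, State, markActive and isActive are hypothetical and do not exist in Stratos; a distributed deployment would need a shared store instead of an in-memory map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-JVM registry of component states; not a Stratos API. */
final class ComponentStateRegistry {

    /** Component names are taken from the thread; the enum itself is illustrative. */
    enum Component { SM, CC, AS, CEP, REST_API }

    enum State { INACTIVE, ACTIVE }

    private static final ComponentStateRegistry INSTANCE = new ComponentStateRegistry();

    // ConcurrentHashMap so components activating on different threads can update safely.
    private final Map<Component, State> states = new ConcurrentHashMap<>();

    private ComponentStateRegistry() {
        // Every component starts out inactive until it reports otherwise.
        for (Component c : Component.values()) {
            states.put(c, State.INACTIVE);
        }
    }

    static ComponentStateRegistry getInstance() {
        return INSTANCE;
    }

    /** Called by a component once it considers itself ready to serve requests. */
    void markActive(Component c) {
        states.put(c, State.ACTIVE);
    }

    /** Called by a component before it talks to another component. */
    boolean isActive(Component c) {
        return states.get(c) == State.ACTIVE;
    }

    public static void main(String[] args) {
        ComponentStateRegistry registry = ComponentStateRegistry.getInstance();
        // CC reports itself active once its service is reachable.
        registry.markActive(Component.CC);
        // AS checks the state before making its first call to CC.
        System.out.println("CC active: " + registry.isActive(Component.CC));
        System.out.println("CEP active: " + registry.isActive(Component.CEP));
    }
}
```

The check is only as reliable as the moment at which a component calls markActive, so a component should mark itself active only after its service endpoint actually responds.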
The components would include: SM, CC, AS, CEP and REST API. In a distributed
setup this Map would be distributed. To guard against the transport not being
activated yet, each component could keep trying to open a socket to its
service port until it becomes available, and then update the above Map. WDYT?

Thanks

On Tue, Mar 10, 2015 at 11:43 PM, Rajkumar Rajaratnam wrote:

> On Tue, Mar 10, 2015 at 11:39 PM, Reka Thirunavukkarasu wrote:
>
>> Hi Raj,
>>
>> I think the problem here is that the transport might not be ready even
>> though the CC is activated. Until the transport is ready, AS should not
>> invoke CC. If we call CC via an OSGi service, then this problem would not
>> occur. Anyway, that is not possible in the distributed setup. I observed
>> earlier that the ntask component usually schedules the task after the
>> transport is ready. In that case, CC can only send the CompleteTopology
>> after the ntask component schedules the task. Then AS will receive the
>> Topology and start the monitors. In that case, I hope that there won't be
>> any issues. However, I am not sure whether ntask has a dependency on the
>> transport.
>>
>> So, are we receiving the CompleteTopology very early, before the transport
>> is started?
>
> Not too early. If I put a 2-3 sec sleep before making the 1st call to CC,
> then everything works fine, because the service is ready by that time.
>
>> Thanks,
>> Reka
>>
>> On Tue, Mar 10, 2015 at 1:26 AM, Rajkumar Rajaratnam wrote:
>>
>>> Hi Reka,
>>>
>>> I checked the flow and it is exactly as you mentioned: AS makes its
>>> 1st call to CC after the CC component is activated.
>>> Still, AS is failing to communicate with CC. This is the error message.
>>> It means that the target is not available, but the target URL is
>>> correct. So the CC service is not ready yet. I will look into it.
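Since a short sleep before the first call to CC is reported to make the failure disappear, a bounded retry is one way to express the same wait without a fixed delay. This is only a sketch under that assumption: StartupRetry and callWithRetry are hypothetical names, and the simulated call in main stands in for the real CC service call.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper; not part of Stratos or Axis2. */
final class StartupRetry {

    /**
     * Invokes the call until it succeeds or maxAttempts is reached,
     * sleeping delayMillis between failed attempts.
     * Rethrows the last failure if every attempt fails.
     */
    static <T> T callWithRetry(Callable<T> call, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                // Do not sleep after the final attempt; fail straight away.
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated service that is "not ready" for the first two calls.
        String policy = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("service not ready");
            }
            return "deployment-policy-1";
        }, 5, 10L);
        System.out.println(policy + " after " + calls[0] + " attempts");
    }
}
```

A retry like this treats the symptom at the caller; it does not tell AS why CC was unavailable, so the attempt limit should stay small enough that a genuinely missing service is still reported.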
>>>
>>> [2015-03-10 13:34:19,967] INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - WSO2 Carbon started in 33 sec
>>> [2015-03-10 13:34:20,075] INFO {org.apache.axis2.transport.http.HTTPSender} - Unable to sendViaPost to url[https://localhost:9443/services/CloudControllerService/]
>>> org.apache.axis2.AxisFault: Transport error: 404 Error: Not Found
>>>     at org.apache.axis2.transport.http.HTTPSender.handleResponse(HTTPSender.java:311)
>>>     at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.java:194)
>>>     at org.apache.axis2.transport.http.HTTPSender.send(HTTPSender.java:75)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.writeMessageWithCommons(CommonsHTTPTransportSender.java:451)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.invoke(CommonsHTTPTransportSender.java:278)
>>>     at org.apache.axis2.engine.AxisEngine.send(AxisEngine.java:442)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.send(OutInAxisOperation.java:430)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.executeImpl(OutInAxisOperation.java:225)
>>>     at org.apache.axis2.client.OperationClient.execute(OperationClient.java:149)
>>>     at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.getDeploymentPolicy(CloudControllerServiceStub.java:9172)
>>>     at org.apache.stratos.common.client.CloudControllerServiceClient.getDeploymentPolicy(CloudControllerServiceClient.java:227)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getClusterMonitor(MonitorFactory.java:258)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getMonitor(MonitorFactory.java:83)
>>>     at org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor$MonitorAdder.run(ParentComponentMonitor.java:789)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> [2015-03-10 13:34:20,076] ERROR {org.apache.stratos.autoscaler.monitor.MonitorFactory} - Error while getting deployment policy from cloud controller [deployment-policy-id] deployment-policy-1
>>> org.apache.axis2.AxisFault: Transport error: 404 Error: Not Found
>>>     at org.apache.axis2.transport.http.HTTPSender.handleResponse(HTTPSender.java:311)
>>>     at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.java:194)
>>>     at org.apache.axis2.transport.http.HTTPSender.send(HTTPSender.java:75)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.writeMessageWithCommons(CommonsHTTPTransportSender.java:451)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.invoke(CommonsHTTPTransportSender.java:278)
>>>     at org.apache.axis2.engine.AxisEngine.send(AxisEngine.java:442)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.send(OutInAxisOperation.java:430)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.executeImpl(OutInAxisOperation.java:225)
>>>     at org.apache.axis2.client.OperationClient.execute(OperationClient.java:149)
>>>     at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.getDeploymentPolicy(CloudControllerServiceStub.java:9172)
>>>     at org.apache.stratos.common.client.CloudControllerServiceClient.getDeploymentPolicy(CloudControllerServiceClient.java:227)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getClusterMonitor(MonitorFactory.java:258)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getMonitor(MonitorFactory.java:83)
>>>     at org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor$MonitorAdder.run(ParentComponentMonitor.java:789)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> Thanks.
>>>
>>> On Tue, Mar 10, 2015 at 12:42 PM, Rajkumar Rajaratnam <rajkumarr@wso2.com> wrote:
>>>
>>>> On Tue, Mar 10, 2015 at 12:17 PM, Reka Thirunavukkarasu wrote:
>>>>
>>>>> According to my understanding, the autoscaler used to depend only on
>>>>> the Topology to trigger anything (such as starting monitors for
>>>>> already deployed applications) on restart. In that case, even if the
>>>>> autoscaler component starts first, it has to wait until the
>>>>> CompleteTopologyEvent is received. Once the autoscaler receives the
>>>>> CompleteTopologyEvent, we can assume that the CC is ready to process
>>>>> requests.
>>>>>
>>>>> It seems that the flow has now changed, and the autoscaler no longer
>>>>> depends only on the CompleteTopology to trigger the startup of
>>>>> monitors on restart. Can't we make the autoscaler depend only on the
>>>>> CompleteTopology rather than directly on CC?
>>>>
>>>> Thanks Reka. That will solve the issue. I will have a look at the flow
>>>> and fix it.
>>>>
>>>> Thanks.
>>>>
>>>>> Thanks,
>>>>> Reka
>>>>>
>>>>> On Mon, Mar 9, 2015 at 11:40 PM, Udara Liyanage wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> As already mentioned, OSGi dependencies will not work in a
>>>>>> distributed setup. Instead, I prefer an event-based mechanism.
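One way to picture an event-based mechanism is a gate that the waiting component blocks on until a "component started" event arrives, with a timeout as a guard against a lost event. The sketch below is an assumption for illustration only: ReadinessGate, onComponentStarted and awaitReady are hypothetical names, and the background thread stands in for whatever message listener would deliver the event.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical readiness gate; not a Stratos API. */
final class ReadinessGate {

    // Opens once, the first time the "started" event is observed.
    private final CountDownLatch ready = new CountDownLatch(1);

    /** Called by the event listener when the "component started" event arrives. */
    void onComponentStarted() {
        ready.countDown();
    }

    /**
     * Blocks until the event has arrived or the timeout elapses.
     * Returns true if the event arrived in time, false on timeout.
     */
    boolean awaitReady(long timeout, TimeUnit unit) throws InterruptedException {
        return ready.await(timeout, unit);
    }

    public static void main(String[] args) throws InterruptedException {
        ReadinessGate ccGate = new ReadinessGate();

        // Stand-in for the listener thread that receives the event from CC.
        Thread listener = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            ccGate.onComponentStarted();
        });
        listener.start();

        // AS waits for CC before creating monitors, but never forever.
        boolean ccReady = ccGate.awaitReady(2, TimeUnit.SECONDS);
        System.out.println("CC ready: " + ccReady);
        listener.join();
    }
}
```

When awaitReady returns false, the caller still has to decide what to do, for example log the timeout and retry, since the timeout only prevents waiting forever.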
>>>>>>
>>>>>> On Tue, Mar 10, 2015 at 12:06 PM, Rajkumar Rajaratnam <rajkumarr@wso2.com> wrote:
>>>>>>
>>>>>>> s/same machine/single JVM
>>>>>>>
>>>>>>> On Tue, Mar 10, 2015 at 11:51 AM, Rajkumar Rajaratnam <rajkumarr@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Isuru,
>>>>>>>>
>>>>>>>> Yes, this is happening when we are creating monitors by reading
>>>>>>>> the application topology. And I guess enforcing OSGi dependencies
>>>>>>>> among components will completely break the distributed setup.
>>>>>>>> Since components are not running in the same machine, AS will be
>>>>>>>> waiting forever for the CC service to become active.
>>>>>>>>
>>>>>>>> As you said, it is better to go with an event-based solution.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Tue, Mar 10, 2015 at 11:45 AM, Isuru Haththotuwa <isuruh@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hi Raj,
>>>>>>>>>
>>>>>>>>> On Tue, Mar 10, 2015 at 11:31 AM, Rajkumar Rajaratnam <rajkumarr@wso2.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Devs,
>>>>>>>>>>
>>>>>>>>>> I have found issues in Stratos server restart.
>>>>>>>>>>
>>>>>>>>>> As you know, we don't persist monitors. We read the topology
>>>>>>>>>> and create monitors when we restart Stratos. While creating
>>>>>>>>>> monitors, we need to communicate with the cloud controller
>>>>>>>>>> service in order to do things like getting the deployment
>>>>>>>>>> policy and network partitions, validating them, and so on. In
>>>>>>>>>> the single-machine setup, the AS component starts before CC.
>>>>>>>>>> So when AS tries to communicate with CC, it fails, and
>>>>>>>>>> ultimately monitor creation fails.
>>>>>>>>>
>>>>>>>>> Does this issue come in when we are creating Application
>>>>>>>>> Monitors? AS starts before CC -> AS tries to restore the
>>>>>>>>> Application Monitors from the local Applications Topology ->
>>>>>>>>> tries to communicate with CC -> leads to the issue?
>>>>>>>>>
>>>>>>>>>> What would be the solution here? Is there any way to enforce
>>>>>>>>>> a startup order between components? I know we can use OSGi
>>>>>>>>>> dependencies to enforce such an order. We can make the AS
>>>>>>>>>> component wait until the CC component is activated. But will
>>>>>>>>>> that solve the problem in a distributed setup?
>>>>>>>>>
>>>>>>>>> AFAIK enforcing the bundle startup order will not solve this in
>>>>>>>>> a distributed setup. How about an event-based solution, with CC
>>>>>>>>> (or any other related component) sending an event to say that
>>>>>>>>> it has started? To avoid a deadlock if the event is lost, maybe
>>>>>>>>> we can add a timeout as well.
>>>>>>>>>
>>>>>>>>>> Please share your thoughts on this.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Rajkumar Rajaratnam
>>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>> Software Engineer, WSO2
>>>>>>>>>>
>>>>>>>>>> Mobile : +94777568639
>>>>>>>>>> Blog : rajkumarr.com
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>
>>>>>>>>>> Isuru H.
>>>>>>>>>> +94 716 358 048
>>>>>>
>>>>>> --
>>>>>> Udara Liyanage
>>>>>> Software Engineer
>>>>>> WSO2, Inc.: http://wso2.com
>>>>>> lean. enterprise. middleware
>>>>>>
>>>>>> web: http://udaraliyanage.wordpress.com
>>>>>> phone: +94 71 443 6897
>>>>>
>>>>> --
>>>>> Reka Thirunavukkarasu
>>>>> Senior Software Engineer,
>>>>> WSO2, Inc.: http://wso2.com,
>>>>> Mobile: +94776442007

--
Imesh Gunaratne
Technical Lead, WSO2
Committer & PMC Member, Apache Stratos

--001a11470c8a36a4aa0510f38306
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--001a11470c8a36a4aa0510f38306--