stratos-dev mailing list archives

From Imesh Gunaratne <im...@apache.org>
Subject Re: [Discuss] Stratos restart issues
Date Tue, 10 Mar 2015 18:42:07 GMT
IMO we could introduce a singleton Map that holds the state of each
component. When a component becomes active, it updates this map, and when
one component needs to talk to another, it checks that component's state
first.

The components would include: SM, CC, AS, CEP and REST API. In a
distributed setup this Map would be distributed.
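
A minimal sketch of such a state map, assuming a single JVM. The class,
enum, and method names (`ComponentStateRegistry`, `markActive`, `isActive`)
are hypothetical and not existing Stratos code; a distributed setup would
need a distributed map in place of the `ConcurrentHashMap`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical singleton registry of component states (illustration only). */
final class ComponentStateRegistry {

    /** The components named in this thread. */
    enum Component { SM, CC, AS, CEP, REST_API }

    /** States assumed for illustration; a real design may need more. */
    enum State { INACTIVE, ACTIVE }

    private static final ComponentStateRegistry INSTANCE = new ComponentStateRegistry();

    // Thread-safe within one JVM only; not shared across nodes.
    private final Map<Component, State> states = new ConcurrentHashMap<>();

    private ComponentStateRegistry() { }

    static ComponentStateRegistry getInstance() {
        return INSTANCE;
    }

    /** Called by a component once it has finished activating. */
    void markActive(Component component) {
        states.put(component, State.ACTIVE);
    }

    /** Called by a component before it talks to another one. */
    boolean isActive(Component component) {
        // A component with no entry yet is treated as inactive.
        return states.get(component) == State.ACTIVE;
    }
}
```

With this shape, AS would call `isActive(Component.CC)` before its first
call to CC and retry later if it returns false.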

To handle the case where the transport is not yet active, each component
could keep trying to open a socket to its own service port until the port
becomes available, and only then update the above Map. WDYT?
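
A rough sketch of the port check, with hypothetical names (`PortProbe`,
`waitForPort`) that are not part of Stratos. It retries a plain TCP connect
until the port accepts a connection or a deadline passes. An open port only
shows that the transport is listening; the 404 in the quoted log suggests
the service itself may still be undeployed at that point, so this check
alone may not be sufficient.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that waits for a TCP port to accept connections. */
final class PortProbe {

    private PortProbe() { }

    /**
     * Returns true as soon as a connection to host:port succeeds,
     * or false if the timeout elapses first.
     */
    static boolean waitForPort(String host, int port, long timeoutMillis, long retryIntervalMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (System.nanoTime() < deadline) {
            try (Socket socket = new Socket()) {
                // Per-attempt connect timeout of one second (arbitrary choice).
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true;
            } catch (IOException e) {
                // Port not available yet; wait and try again.
                Thread.sleep(retryIntervalMillis);
            }
        }
        return false;
    }
}
```

A component could call, for example, `PortProbe.waitForPort("localhost",
9443, 60000, 500)` after activation and update the Map only when it returns
true. The timeout keeps a component from waiting forever.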

Thanks

On Tue, Mar 10, 2015 at 11:43 PM, Rajkumar Rajaratnam <rajkumarr@wso2.com>
wrote:

>
>
> On Tue, Mar 10, 2015 at 11:39 PM, Reka Thirunavukkarasu <reka@wso2.com>
> wrote:
>
>> Hi Raj,
>>
>> I think the problem here is that the transport might not be ready even
>> though the CC is activated. Until the transport is ready, AS should not
>> invoke CC. If we called CC via an OSGi service, this problem would not
>> occur, but that is not possible in a distributed setup. I observed
>> earlier that the ntask component usually schedules its task after the
>> transport is ready. In that case, CC can send the CompleteTopology only
>> after the ntask component schedules the task. AS will then receive the
>> Topology and start the monitors, so I hope there won't be any issues.
>> However, I am not sure whether ntask has a dependency on the transport.
>>
>> So, are we receiving the CompleteTopology very early, before the
>> transport is started?
>>
>
> Not too early. If I put a 2-3 second sleep before making the first call
> to CC, everything works fine, because the service is ready by that time.
>
>
>> Thanks,
>> Reka
>>
>> On Tue, Mar 10, 2015 at 1:26 AM, Rajkumar Rajaratnam <rajkumarr@wso2.com>
>> wrote:
>>
>>> Hi Reka,
>>>
>>> I checked the flow and it is exactly as you described: AS makes its
>>> first call to CC after the CC component is activated.
>>> Still, AS fails to communicate with CC. The error message is below. It
>>> means that the target is not available, but the target URL is correct,
>>> so the CC service is not ready yet. I will look into it.
>>>
>>>
>>> [2015-03-10 13:34:19,967]  INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} -  WSO2 Carbon started in 33 sec
>>> [2015-03-10 13:34:20,075]  INFO {org.apache.axis2.transport.http.HTTPSender} -  Unable to sendViaPost to url[https://localhost:9443/services/CloudControllerService/]
>>> org.apache.axis2.AxisFault: Transport error: 404 Error: Not Found
>>>     at org.apache.axis2.transport.http.HTTPSender.handleResponse(HTTPSender.java:311)
>>>     at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.java:194)
>>>     at org.apache.axis2.transport.http.HTTPSender.send(HTTPSender.java:75)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.writeMessageWithCommons(CommonsHTTPTransportSender.java:451)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.invoke(CommonsHTTPTransportSender.java:278)
>>>     at org.apache.axis2.engine.AxisEngine.send(AxisEngine.java:442)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.send(OutInAxisOperation.java:430)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.executeImpl(OutInAxisOperation.java:225)
>>>     at org.apache.axis2.client.OperationClient.execute(OperationClient.java:149)
>>>     at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.getDeploymentPolicy(CloudControllerServiceStub.java:9172)
>>>     at org.apache.stratos.common.client.CloudControllerServiceClient.getDeploymentPolicy(CloudControllerServiceClient.java:227)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getClusterMonitor(MonitorFactory.java:258)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getMonitor(MonitorFactory.java:83)
>>>     at org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor$MonitorAdder.run(ParentComponentMonitor.java:789)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> [2015-03-10 13:34:20,076] ERROR {org.apache.stratos.autoscaler.monitor.MonitorFactory} -  Error while getting deployment policy from cloud controller [deployment-policy-id] deployment-policy-1
>>> org.apache.axis2.AxisFault: Transport error: 404 Error: Not Found
>>>     at org.apache.axis2.transport.http.HTTPSender.handleResponse(HTTPSender.java:311)
>>>     at org.apache.axis2.transport.http.HTTPSender.sendViaPost(HTTPSender.java:194)
>>>     at org.apache.axis2.transport.http.HTTPSender.send(HTTPSender.java:75)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.writeMessageWithCommons(CommonsHTTPTransportSender.java:451)
>>>     at org.apache.axis2.transport.http.CommonsHTTPTransportSender.invoke(CommonsHTTPTransportSender.java:278)
>>>     at org.apache.axis2.engine.AxisEngine.send(AxisEngine.java:442)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.send(OutInAxisOperation.java:430)
>>>     at org.apache.axis2.description.OutInAxisOperationClient.executeImpl(OutInAxisOperation.java:225)
>>>     at org.apache.axis2.client.OperationClient.execute(OperationClient.java:149)
>>>     at org.apache.stratos.cloud.controller.stub.CloudControllerServiceStub.getDeploymentPolicy(CloudControllerServiceStub.java:9172)
>>>     at org.apache.stratos.common.client.CloudControllerServiceClient.getDeploymentPolicy(CloudControllerServiceClient.java:227)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getClusterMonitor(MonitorFactory.java:258)
>>>     at org.apache.stratos.autoscaler.monitor.MonitorFactory.getMonitor(MonitorFactory.java:83)
>>>     at org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor$MonitorAdder.run(ParentComponentMonitor.java:789)
>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:745)
>>>
>>> Thanks.
>>>
>>> On Tue, Mar 10, 2015 at 12:42 PM, Rajkumar Rajaratnam <
>>> rajkumarr@wso2.com> wrote:
>>>
>>>>
>>>>
>>>> On Tue, Mar 10, 2015 at 12:17 PM, Reka Thirunavukkarasu <reka@wso2.com>
>>>> wrote:
>>>>
>>>>> According to my understanding, the autoscaler depended only on the
>>>>> Topology to trigger anything on restart (such as starting the monitors
>>>>> for already deployed applications). In that case, even if the
>>>>> autoscaler component starts first, it has to wait until the
>>>>> CompleteTopologyEvent is received. By the time the autoscaler receives
>>>>> the CompleteTopologyEvent, we can assume that the CC is ready to
>>>>> process requests.
>>>>>
>>>>> It seems that the flow has now changed and the autoscaler no longer
>>>>> depends only on the CompleteTopology to trigger the startup of monitors
>>>>> on restart. Can't we make the autoscaler depend only on the
>>>>> CompleteTopology rather than directly on the CC?
>>>>>
>>>>
>>>> Thanks Reka. That will solve the issue. I will have a look at the flow
>>>> and fix it.
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Reka
>>>>>
>>>>> On Mon, Mar 9, 2015 at 11:40 PM, Udara Liyanage <udara@wso2.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> As already mentioned, OSGi dependencies will not work in a distributed
>>>>>> setup. Instead, I prefer an event-based mechanism.
>>>>>>
>>>>>> On Tue, Mar 10, 2015 at 12:06 PM, Rajkumar Rajaratnam <
>>>>>> rajkumarr@wso2.com> wrote:
>>>>>>
>>>>>>> s/same machine/single JVM
>>>>>>>
>>>>>>> On Tue, Mar 10, 2015 at 11:51 AM, Rajkumar Rajaratnam <
>>>>>>> rajkumarr@wso2.com> wrote:
>>>>>>>
>>>>>>>> Hi Isuru,
>>>>>>>>
>>>>>>>> Yes, this is happening when we create monitors by reading the
>>>>>>>> application topology. And I guess enforcing OSGi dependencies among
>>>>>>>> components will completely break the distributed setup. Since the
>>>>>>>> components are not running in the same machine, AS will wait forever
>>>>>>>> for the CC service to become active.
>>>>>>>>
>>>>>>>> As you said, it is better to go with an event-based solution.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> On Tue, Mar 10, 2015 at 11:45 AM, Isuru Haththotuwa <
>>>>>>>> isuruh@apache.org> wrote:
>>>>>>>>
>>>>>>>>> Hi Raj,
>>>>>>>>>
>>>>>>>>> On Tue, Mar 10, 2015 at 11:31 AM, Rajkumar Rajaratnam <
>>>>>>>>> rajkumarr@wso2.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Devs,
>>>>>>>>>>
>>>>>>>>>> I have found issues in Stratos server restart.
>>>>>>>>>>
>>>>>>>>>> As you know, we don't persist monitors. We read the topology and
>>>>>>>>>> create monitors when we restart Stratos. While creating monitors,
>>>>>>>>>> we need to communicate with the cloud controller service in order
>>>>>>>>>> to do things like getting the deployment policy and network
>>>>>>>>>> partitions, validating them and so on. In the single machine
>>>>>>>>>> setup, the AS component starts before CC. So when AS tries to
>>>>>>>>>> communicate with CC, the call fails and ultimately monitor
>>>>>>>>>> creation fails.
>>>>>>>>>>
>>>>>>>>> Does this issue occur when we are creating Application Monitors?
>>>>>>>>> AS starts before CC -> AS tries to restore the Application Monitors
>>>>>>>>> from the local Applications Topology -> tries to communicate with
>>>>>>>>> CC -> leads to the issue?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What would be the solution here? Is there any way to enforce
>>>>>>>>>> startup order between components? I know we can use OSGi
>>>>>>>>>> dependencies to enforce such an order: we can make the AS
>>>>>>>>>> component wait until the CC component is activated. But will that
>>>>>>>>>> solve the problem in a distributed setup?
>>>>>>>>>>
>>>>>>>>> AFAIK enforcing the bundle startup order will not solve this in a
>>>>>>>>> distributed setup. How about an event-based solution, where CC (or
>>>>>>>>> any other related component) sends an event to say that it has
>>>>>>>>> started? To avoid a deadlock if the event is lost, maybe we can add
>>>>>>>>> a timeout as well.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Please share your thoughts on this.
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Rajkumar Rajaratnam
>>>>>>>>>> Committer & PMC Member, Apache Stratos
>>>>>>>>>> Software Engineer, WSO2
>>>>>>>>>>
>>>>>>>>>> Mobile : +94777568639
>>>>>>>>>> Blog : rajkumarr.com
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks and Regards,
>>>>>>>>>>
>>>>>>>>>> Isuru H.
>>>>>>>>>> +94 716 358 048
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>> --
>>>>>>
>>>>>> Udara Liyanage
>>>>>> Software Engineer
>>>>>> WSO2, Inc.: http://wso2.com
>>>>>> lean. enterprise. middleware
>>>>>>
>>>>>> web: http://udaraliyanage.wordpress.com
>>>>>> phone: +94 71 443 6897
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Reka Thirunavukkarasu
>>>>> Senior Software Engineer,
>>>>> WSO2, Inc.:http://wso2.com,
>>>>> Mobile: +94776442007
>>>>>



-- 
Imesh Gunaratne

Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
