stratos-dev mailing list archives

From: Imesh Gunaratne <im...@apache.org>
Subject: Re: Testing Stratos 4.1 : nested grouping scenario with startup and termination issues (?)
Date: Fri, 01 May 2015 05:09:51 GMT
In addition, we have not added a try/catch block in the MonitorAdder.run()
method to cover its full scope. Therefore, if an exception is raised in the
middle of the method, it can also cause the problem above.

I have now fixed this in commit revision:
9ec061f44a3189ccd8b509ef4da980687dfbcf62
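
For reference, this is roughly the shape of the change (a sketch of the idea
only, not the exact commit):

// Sketch: wrap the full body of MonitorAdder.run() so that any exception
// is logged instead of silently killing the scheduled task before it can
// log "Monitor started successfully".
public void run() {
    try {
        // ... existing logic that creates and starts the monitor ...
    } catch (Exception e) {
        log.error(String.format("An error occurred while starting monitor: " +
                "[type] %s [component] %s", monitorTypeStr, context.getId()), e);
    }
}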

Martin: I would appreciate it if you could take this fix and retest.

Thanks

On Fri, May 1, 2015 at 10:32 AM, Imesh Gunaratne <imesh@apache.org> wrote:

> Hi Reka,
>
> It looks like MonitorAdder.run() has started executing; that's why we
> see the following log:
>
> TID: [0] [STRATOS] [2015-04-30 16:48:57,712]  INFO {org.apache.stratos.
> autoscaler.monitor.component.ParentComponentMonitor} -  Starting monitor:
> [type] cluster [component] sub-G1-G2-G3-1-G4.c4-1x1.c4.domain
>
> However, the thread has not reached its last line:
>
> log.info(String.format("Monitor started successfully: [type] %s [component] %s [dependents] %s " +
>         "[startup-time] %d seconds", monitorTypeStr, context.getId(),
>
>
> As we discussed offline, this may have been caused by a deadlock while
> trying to acquire the following topology lock:
>
> public static ClusterMonitor getClusterMonitor(ParentComponentMonitor parentMonitor,
>                                                ClusterChildContext context,
>                                                List<String> parentInstanceIds)
>     ...
>     //acquire read lock for the service and cluster
>     TopologyManager.acquireReadLockForCluster(serviceName, clusterId);
>
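> A minimal sketch of the locking pattern I would expect around that call
> (the release method name and the topology lookups are written from memory,
> so please treat them as assumptions):
>
> // Sketch only: whichever path leaves the method must release the lock.
> // If a release is ever skipped (early return or exception), later lock
> // requests on the same cluster can block indefinitely, which would match
> // a MonitorAdder thread stuck at acquireReadLockForCluster.
> TopologyManager.acquireReadLockForCluster(serviceName, clusterId);
> try {
>     Service service = TopologyManager.getTopology().getService(serviceName);
>     Cluster cluster = service.getCluster(clusterId);
>     // ... build the ClusterMonitor from the service and cluster data ...
> } finally {
>     TopologyManager.releaseReadLockForCluster(serviceName, clusterId);
> }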
>
> Martin: Would you be able to do another test run with the deadlock
> detection logic enabled? You can do this by setting the following system
> property to true in the stratos.sh file:
>
> read.write.lock.monitor.enabled=true
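>
> (I assume the flag ends up as a -Dread.write.lock.monitor.enabled=true JVM
> argument via stratos.sh and is then read with the standard system-property
> pattern below; I have not checked the exact class that consumes it, so
> treat this as an assumption.)
>
> // Boolean.getBoolean returns true only if the system property exists
> // and has the value "true".
> boolean lockMonitorEnabled = Boolean.getBoolean("read.write.lock.monitor.enabled");
> if (lockMonitorEnabled) {
>     // enable the read/write lock monitoring / deadlock detection logic
> }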
>
> Thanks
>
>
> On Fri, May 1, 2015 at 7:40 AM, Reka Thirunavukkarasu <reka@wso2.com>
> wrote:
>
>> Hi Martin,
>>
>> Thanks, Martin, for the detailed information; it helped to isolate the issue.
>>
>> As I went through the logs, it seems to be a threading issue. I can see
>> the logs below for c4-1x1 and c3-1x1. Both c3 and c4 got scheduled to
>> start their respective ClusterMonitors, but after that only c3's
>> ClusterMonitor started successfully, not c4's. So the scheduler for c4
>> did not actually start a thread for the MonitorAdder to create the
>> ClusterMonitor.
>>
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,712]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Starting dependent monitor: [application] sub-G1-G2-G3-1-G4 [component]
>> sub-G1-G2-G3-1-G4.c4-1x1.c4.domain
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,712]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Monitor scheduled: [type] cluster [component] sub-G1-G2-G3-1-G4.c4-1x1.c4.domain
>>
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,712]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Starting monitor: [type] cluster [component]
>> sub-G1-G2-G3-1-G4.c4-1x1.c4.domain
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,713]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Starting dependent monitor: [application] sub-G1-G2-G3-1-G4 [component]
>> sub-G1-G2-G3-1-G4.c3-1x1.c3.domain
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,713]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Monitor scheduled: [type] cluster [component] sub-G1-G2-G3-1-G4.c3-1x1.c3.domain
>>
>> TID: [0] [STRATOS] [2015-04-30 16:48:57,713]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Starting monitor: [type] cluster [component]
>> sub-G1-G2-G3-1-G4.c3-1x1.c3.domain
>>
>> I found the log below for c3, which indicates that the c3 monitor
>> started successfully. But there is no such log for c4.
>>
>> TID: [0] [STRATOS] [2015-04-30 16:49:00,760]  INFO
>> {org.apache.stratos.autoscaler.monitor.component.ParentComponentMonitor} -
>> Monitor started successfully: [type] cluster [component]
>> sub-G1-G2-G3-1-G4.c3-1x1.c3.domain [dependents] none [startup-time] 3
>> seconds
>>
>> @Gayan/Imesh, do you have any input here? Would increasing the thread
>> pool solve this issue, or is it related to something else?
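>>
>> On the thread pool question, here is a small illustration (plain Java,
>> not Stratos code) of why a bigger pool may only hide the symptom if the
>> current workers are blocked:
>>
>> import java.util.concurrent.ExecutorService;
>> import java.util.concurrent.Executors;
>> import java.util.concurrent.TimeUnit;
>>
>> // With a fixed-size pool, a submitted task starts only when a worker
>> // becomes free. If the current workers are blocked (for example, waiting
>> // on a lock), a newly scheduled MonitorAdder-style task just sits in the
>> // queue, which looks like "scheduled but never started" in the logs.
>> public class PoolStarvationDemo {
>>     public static void main(String[] args) throws InterruptedException {
>>         ExecutorService pool = Executors.newFixedThreadPool(1);
>>         pool.submit(() -> {
>>             try {
>>                 TimeUnit.HOURS.sleep(1); // stands in for a blocked worker
>>             } catch (InterruptedException ignored) {
>>             }
>>         });
>>         pool.submit(() -> System.out.println("second task started"));
>>         TimeUnit.SECONDS.sleep(2); // nothing is printed in this window
>>         pool.shutdownNow();        // interrupt the blocked worker and exit
>>     }
>> }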
>>
>> Thanks,
>> Reka
>>
>>
>>
>> On Thu, Apr 30, 2015 at 10:54 PM, Martin Eppel (meppel) <meppel@cisco.com
>> > wrote:
>>
>>>  Hi Reka,
>>>
>>>
>>>
>>> I re-ran the scenario, making sure the application alias and group alias
>>> are as suggested and that debug logs are turned on (see config below):
>>>
>>>
>>>
>>> log4j.logger.org.apache.stratos.manager=DEBUG
>>>
>>> log4j.logger.org.apache.stratos.autoscaler=DEBUG
>>>
>>> log4j.logger.org.apache.stratos.messaging=INFO
>>>
>>> log4j.logger.org.apache.stratos.cloud.controller=DEBUG
>>>
>>> log4j.logger.org.wso2.andes.client=ERROR
>>>
>>>
>>>
>>> This is the scenario:
>>>
>>>
>>>
>>> 1. Deployed the application (see screenshot A; debug logs in
>>> wso2carbon-debug.log). Only 3 instances spin up.
>>>
>>> 2. Removed the application.
>>>
>>> 3. Re-deployed the application (see screenshot B; debug logs in
>>> wso2carbon-debug-2.log, after the line “TID: [0] [STRATOS] [2015-04-30
>>> 17:05:23,837] DEBUG
>>> {org.apache.stratos.autoscaler.applications.ApplicationHolder} -  Read lock
>>> released”). The second time the application is deployed, all instances
>>> spin up and go active.
>>>
>>>
>>>
>>>
>>>
>>> Please see attached artifacts and logs.
>>>
>>>
>>>
>>> A. Application status after deploying the application for the first time
>>> after Stratos startup:
>>>
>>> B. Application status after re-deploying the application (see log
>>> wso2carbon-debug-2.log after “TID: [0] [STRATOS] [2015-04-30
>>> 17:05:23,837] DEBUG
>>> {org.apache.stratos.autoscaler.applications.ApplicationHolder} -  Read lock
>>> released”):
>>>
>>>
>>> *From:* Reka Thirunavukkarasu [mailto:reka@wso2.com]
>>> *Sent:* Thursday, April 30, 2015 1:40 AM
>>>
>>> *To:* dev
>>> *Subject:* Re: Testing Stratos 4.1 : nested grouping scenario with
>>> startup and termination issues (?)
>>>
>>>
>>>
>>> If you keep getting this issue, can you please share the logs against
>>> master, since we improved some of the logging in master yesterday?
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>> On Thu, Apr 30, 2015 at 2:08 PM, Reka Thirunavukkarasu <reka@wso2.com>
>>> wrote:
>>>
>>> Hi Martin,
>>>
>>> I have deployed the attached samples as before on OpenStack with the
>>> latest master. All of the clusters got created with their members; please
>>> see the attached diagram. I'm unable to proceed further until my Puppet
>>> configuration is corrected so that the members become active, but I
>>> thought I would share this since all the clusters have members.
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>> On Thu, Apr 30, 2015 at 10:25 AM, Reka Thirunavukkarasu <reka@wso2.com>
>>> wrote:
>>>
>>> Hi Martin,
>>>
>>> Can you please confirm whether you are using a unique applicationId and
>>> group alias? I can see from the UI that the applicationId and the next
>>> group alias have the same value, sub-G1-G2-G3-1.
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>>
>>>
>>> On Thu, Apr 30, 2015 at 10:16 AM, Martin Eppel (meppel) <
>>> meppel@cisco.com> wrote:
>>>
>>> Hi Reka,
>>>
>>>
>>>
>>> I have upgraded from the beta to the latest Stratos code on master and
>>> retested the scenario from JIRA STRATOS-1345, but I still see the same
>>> issue (on OpenStack).
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Martin
>>>
>>>
>>>
>>>
>>>
>>> *From:* Martin Eppel (meppel)
>>> *Sent:* Wednesday, April 29, 2015 2:54 PM
>>> *To:* dev@stratos.apache.org
>>> *Subject:* RE: Testing Stratos 4.1 : nested grouping scenario with
>>> startup and termination issues (?)
>>>
>>>
>>>
>>> Hi Reka,
>>>
>>>
>>>
>>> I will upgrade my system to the latest master and re-test,
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>> Martin
>>>
>>>
>>>
>>> *From:* Reka Thirunavukkarasu [mailto:reka@wso2.com]
>>> *Sent:* Wednesday, April 29, 2015 11:55 AM
>>> *To:* dev
>>> *Subject:* Re: Testing Stratos 4.1 : nested grouping scenario with
>>> startup and termination issues (?)
>>>
>>>
>>>
>>> Hi Martin,
>>>
>>> While I was working on application update, I fixed a few issues with the
>>> termination behavior. However, there still seem to be small issues in the
>>> logic that have to be fixed. I have started to verify this in my local
>>> setup. Can you create a JIRA so that we can track it? I will update the
>>> progress in the JIRA.
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>> On Tue, Apr 28, 2015 at 10:11 PM, Martin Eppel (meppel) <
>>> meppel@cisco.com> wrote:
>>>
>>> Hi Reka,
>>>
>>>
>>>
>>> Thanks for following up - let me know if I should open a JIRA,
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Martin
>>>
>>>
>>>
>>> *From:* Reka Thirunavukkarasu [mailto:reka@wso2.com]
>>> *Sent:* Tuesday, April 28, 2015 5:37 AM
>>> *To:* dev
>>> *Subject:* Re: Testing Stratos 4.1 : nested grouping scenario with
>>> startup and termination issues (?)
>>>
>>>
>>>
>>> Hi Martin,
>>>
>>> Thanks for bringing this up. I fixed some issues in this flow while
>>> testing application update support with instance counts. I will go through
>>> your scenarios to reproduce the problem and update the thread with my
>>> progress.
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>> On Tue, Apr 28, 2015 at 7:08 AM, Martin Eppel (meppel) <meppel@cisco.com>
>>> wrote:
>>>
>>> I am testing a (nested grouping) scenario where a group defines the
>>> termination behavior “terminate-all”. When the instance (of cartridge
>>> type c3) is terminated, no new instance is started.
>>>
>>> My understanding is that a new instance should be started up.
>>>
>>>
>>>
>>> The scenario looks like this:
>>>
>>>
>>>
>>> Group ~G1 has a cartridge member c1 and group member ~G2
>>>
>>> Group ~G2 has a cartridge member c2 and group member ~G3
>>>
>>> Group ~G3 has a cartridge member c3
>>>
>>>
>>>
>>> Startup dependencies are: c1 depends on G2, c2 depends on G3
>>>
>>>
>>>
>>> ~G1 defines termination: none
>>>
>>> ~G2 defines termination: dependents
>>>
>>> ~G3 defines termination: all
>>>
>>>
>>>
>>> After startup, when all instances are active, instance c3 is terminated,
>>> which correctly also terminates instance c2 (since it depends on G3 / c3).
>>>
>>> *Issue 1:*
>>>
>>> However, no new instance of c3 is started up (and consequently no new
>>> instance of c2 is started up either); see log wso2carbon.log.
>>>
>>>
>>>
>>> The only instance which remains running is c1.
>>>
>>> *Issue 2:*
>>>
>>> When c1 is subsequently terminated manually, a new instance of c1 is
>>> started up (in contrast to Issue 1), which I think is incorrect since c1
>>> defines a startup dependency (c1 depends on G2) that is not fulfilled at
>>> the time (G2 should not be active since c2 is still terminated; see log
>>> wso2carbon-issue2.log, the same log as wso2carbon.log but at a later
>>> time).
>>>
>>>
>>>
>>> WDYT ?
>>>
>>>
>>>
>>> Please find attached artifacts and logs
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Martin
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>>
>>>
>>
>>
>>
>> --
>> Reka Thirunavukkarasu
>> Senior Software Engineer,
>> WSO2, Inc.:http://wso2.com,
>> Mobile: +94776442007
>>
>>
>>
>
>
> --
> Imesh Gunaratne
>
> Senior Technical Lead, WSO2
> Committer & PMC Member, Apache Stratos
>



-- 
Imesh Gunaratne

Senior Technical Lead, WSO2
Committer & PMC Member, Apache Stratos
