stratos-dev mailing list archives

From "Martin Eppel (meppel)" <mep...@cisco.com>
Subject RE: Testing Stratos 4.1: Application undeployment: application fails to undeploy (nested grouping, group scaling)
Date Tue, 23 Jun 2015 05:19:02 GMT
Thanks Reka

From: Reka Thirunavukkarasu [mailto:reka@wso2.com]
Sent: Monday, June 22, 2015 9:59 PM
To: Martin Eppel (meppel)
Cc: dev@stratos.apache.org; Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis (rdupless)
Subject: Re: Testing Stratos 4.1: Application undeployment: application fails to undeploy
(nested grouping, group scaling)

Hi Martin,
These are configurable parameters. In the Stratos code, these thread pool sizes default to 20.
To change them, pass them as system properties in stratos.sh. Because the code supplies the
defaults, we don't need to set them in the standalone pack. A complex application with more
groups and clusters uses more threads, so the default pool size of 20 might get exhausted.
It is therefore better to customize these properties according to the application structure.
When I used the application sample that you attached to this thread, I hit issues such as
event listeners not being triggered properly because the thread pool was exhausted. After I
increased the thread pool size to 50, I didn't see any issues.
I'm analyzing the thread usage in order to recommend a pool size relative to the application
structure, so that anyone can calculate the pool size their application requires and
configure these parameters accordingly.
I hope this helps clarify those parameters.
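
To illustrate the mechanism, here is a minimal Java sketch of reading a pool size from a JVM
system property and falling back to 20 when it is not set. The property names are the ones
mentioned in this thread; the class and method names (MonitorThreadPools, poolSize, newPool)
are hypothetical and are not taken from the Stratos code base.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper, not the actual Stratos implementation.
final class MonitorThreadPools {
    // Default pool size reported in this thread.
    static final int DEFAULT_POOL_SIZE = 20;

    // Returns the size passed via -D<propertyName>=<n>, or the default when the
    // property is missing, not a number, or not positive.
    static int poolSize(String propertyName) {
        int configured = Integer.getInteger(propertyName, DEFAULT_POOL_SIZE);
        return configured > 0 ? configured : DEFAULT_POOL_SIZE;
    }

    // Creates a fixed-size pool sized from the given system property.
    static ExecutorService newPool(String propertyName) {
        return Executors.newFixedThreadPool(poolSize(propertyName));
    }

    public static void main(String[] args) {
        String property = "application.monitor.thread.pool.size";
        ExecutorService pool = newPool(property);
        System.out.println(property + " = " + poolSize(property));
        pool.shutdown();
    }
}
```

Started with -Dapplication.monitor.thread.pool.size=50, this sketch would size the pool at 50;
without the flag it falls back to 20.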

Thanks,
Reka

On Mon, Jun 22, 2015 at 11:50 PM, Martin Eppel (meppel) <meppel@cisco.com> wrote:
Hi Reka,

I am not clear on the 2 properties you mention below. Are they supposed to be set in
stratos.sh? I just picked up the latest code from the Apache Stratos repo and don’t see them.

Btw, read.write.lock.monitor.enabled is set to false in our production code (I assume
it defaults to false if not specified). I only enable it to provide additional information.

Thanks

Martin

From: Reka Thirunavukkarasu [mailto:reka@wso2.com]
Sent: Monday, June 22, 2015 7:30 AM
To: Martin Eppel (meppel)
Cc: dev@stratos.apache.org; Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis (rdupless)
Subject: Re: Testing Stratos 4.1: Application undeployment: application fails to undeploy
(nested grouping, group scaling)

Hi Martin,
I have verified the fix with read.write.lock.monitor.enabled=true, and it worked fine.
Since we use concurrency and delegate some flows to threads, I had to set the thread pool
sizes to the values below in stratos.sh.

    -Dapplication.monitor.thread.pool.size=50 \
    -Dgroup.monitor.thread.pool.size=50 \
Please note that read.write.lock.monitor.enabled=false is recommended in production, as
enabling the monitor increases the footprint. This property was introduced for testing only.

We are analyzing the thread pool sizes and will come up with recommended values.
Also, I have fixed a small issue in the REST endpoint, which returned a default value whenever
the application runtime was not found. Now, if the runtime is not found, the message below is
returned.

{"status":"error","message":"Application runtime not found"}
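
As a rough illustration of the behaviour described above, the following Java sketch returns
the error message when no runtime exists for an application instead of a default value. The
class ApplicationRuntimeLookup, its methods, and the in-memory map are hypothetical; this is
not the Stratos REST API code, only the error JSON is taken from this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup, not the actual Stratos REST endpoint.
final class ApplicationRuntimeLookup {
    // Error body quoted in this thread.
    static final String NOT_FOUND =
            "{\"status\":\"error\",\"message\":\"Application runtime not found\"}";

    // application id -> JSON describing its runtime (assumed representation)
    private final Map<String, String> runtimesById = new HashMap<>();

    void register(String applicationId, String runtimeJson) {
        runtimesById.put(applicationId, runtimeJson);
    }

    // Returns the stored runtime JSON, or the error body when none exists.
    String runtimeResponse(String applicationId) {
        String runtimeJson = runtimesById.get(applicationId);
        return runtimeJson != null ? runtimeJson : NOT_FOUND;
    }
}
```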
I have also verified undeployment with group scaling and didn't find any issues with the
above fixes.
Please find the latest commit below:

0a969200d11228158606f011ca7e5e795f336d92
Please note that only the error below [1] was observed, and it is harmless for now. I have
verified that a workaround for it works fine, but I will check its severity and decide whether
to implement a proper fix or go with the workaround.

[1]. TID: [0] [STRATOS] [2015-06-22 14:22:01,872] ERROR {org.apache.stratos.common.concurrent.locks.ReadWriteLockMonitor}
-  System error, lock has not released for 30 seconds: [lock-name] topology [lock-type] Write
[thread-id] 117 [thread-name] pool-24-thread-2 [stack-trace]
java.lang.Thread.getStackTrace(Thread.java:1589)
org.apache.stratos.common.concurrent.locks.ReadWriteLock.acquireWriteLock(ReadWriteLock.java:123)
org.apache.stratos.messaging.message.processor.topology.updater.TopologyUpdater.acquireWriteLockForService(TopologyUpdater.java:123)
org.apache.stratos.messaging.message.processor.topology.ApplicationClustersCreatedMessageProcessor.doProcess(ApplicationClustersCreatedMessageProcessor.java:78)
org.apache.stratos.messaging.message.processor.topology.ApplicationClustersCreatedMessageProcessor.process(ApplicationClustersCreatedMessageProcessor.java:59)
org.apache.stratos.messaging.message.processor.topology.ServiceRemovedMessageProcessor.process(ServiceRemovedMessageProcessor.java:64)
org.apache.stratos.messaging.message.processor.topology.ServiceCreatedMessageProcessor.process(ServiceCreatedMessageProcessor.java:65)
org.apache.stratos.messaging.message.processor.topology.CompleteTopologyMessageProcessor.process(CompleteTopologyMessageProcessor.java:74)
org.apache.stratos.messaging.message.processor.MessageProcessorChain.process(MessageProcessorChain.java:61)
org.apache.stratos.messaging.message.receiver.topology.TopologyEventMessageDelegator.run(TopologyEventMessageDelegator.java:73)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
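
The log entry above reports a write lock on "topology" that was not released within 30
seconds. As a rough illustration of that kind of check, and not the actual ReadWriteLockMonitor
implementation, the following Java sketch records when a lock is acquired and lists the locks
held for at least a threshold. LockHoldTracker and its methods are hypothetical names, and
timestamps are passed in explicitly to keep the sketch deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real ReadWriteLockMonitor may work differently.
final class LockHoldTracker {
    private final long thresholdMillis;
    // lock name -> time (in ms) at which the lock was acquired
    private final Map<String, Long> acquiredAt = new ConcurrentHashMap<>();

    LockHoldTracker(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    void onAcquire(String lockName, long nowMillis) {
        acquiredAt.put(lockName, nowMillis);
    }

    void onRelease(String lockName) {
        acquiredAt.remove(lockName);
    }

    // Names of locks that have been held for at least the threshold as of nowMillis.
    List<String> overdue(long nowMillis) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Long> entry : acquiredAt.entrySet()) {
            if (nowMillis - entry.getValue() >= thresholdMillis) {
                names.add(entry.getKey());
            }
        }
        return names;
    }
}
```

A periodic task could call overdue(System.currentTimeMillis()) and log each returned name
together with the holder's stack trace, which is roughly the shape of the message shown above.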
Thanks,
Reka



On Mon, Jun 22, 2015 at 12:24 PM, Reka Thirunavukkarasu <reka@wso2.com> wrote:
Hi Martin,
I found the reason why we didn't encounter these locking issues: we were testing with the
default Stratos pack, which has read.write.lock.monitor.enabled=false. The locking warning
is raised only when read.write.lock.monitor.enabled=true, which is why only you were seeing
these locking issues, as your setup uses that configuration.
Since I'm able to reproduce the issue, I will test with the fix that I already pushed and
update the thread.
We will discuss making read.write.lock.monitor.enabled=true the default in Stratos, so that
we can find such issues early and fix them.

Thanks,
Reka

On Mon, Jun 22, 2015 at 12:16 AM, Reka Thirunavukkarasu <reka@wso2.com> wrote:
Sorry Martin, I had only fixed the issue locally. I have pushed it just now. Can you test
with 1c21daaeea7b27ad0a0141a70b32e9443e78e309 when you get a chance? I will also continue
testing with this fix.
Thanks,
Reka

On Mon, Jun 22, 2015 at 12:07 AM, Martin Eppel (meppel) <meppel@cisco.com> wrote:
Btw,

This is my last commit I picked up from the stratos master:

commit 58bea52be814269416f70391fef50859aa5ae0a1
Author: lasinducharith <lasinducharith@gmail.com>
Date:   Fri Jun 19 19:40:27 2015 +0530

From: Martin Eppel (meppel)
Sent: Sunday, June 21, 2015 10:28 AM
To: dev@stratos.apache.org; Reka Thirunavukkarasu
Cc: Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis (rdupless)
Subject: RE: Testing Stratos 4.1: Application undeployment: application fails to undeploy
(nested grouping, group scaling)

Hi Reka,

Here is another example that fails; see the application at [1.] and the attached log files
and JSONs. I ran a few scenarios; the one that fails is the application named “s-g-c1-c2-c3”
(last scenario). All members get removed but the application remains deployed:

“s-g-c1-c2-c3: applicationInstances 0, groupInstances 0, clusterInstances 0, members 0 ()”


Thanks


Martin




[cid:image001.png@01D0AD39.674B17A0]




From: Imesh Gunaratne [mailto:imesh@apache.org]
Sent: Sunday, June 21, 2015 1:32 AM
To: Reka Thirunavukkarasu
Cc: dev; Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis (rdupless)
Subject: Re: Testing Stratos 4.1: Application undeployment: application fails to undeploy
(nested grouping, group scaling)

Great! Thanks Reka!

On Sun, Jun 21, 2015 at 8:34 AM, Reka Thirunavukkarasu <reka@wso2.com> wrote:
Hi Martin/Imesh,
Sure, I will have a look at the logs. I will also go through the recent commits and try to
revert the fix that was added for nested group scaling, as it is not needed for this release
and, as Martin mentioned, there are more issues after the fixes. Otherwise, we will have
to go through another major testing effort.
I will update the thread on progress.

Thanks,
Reka

On Sun, Jun 21, 2015 at 8:14 AM, Imesh Gunaratne <imesh@apache.org> wrote:
Hi Martin,

Thanks for the quick response. Yes we will definitely go through the logs and investigate
this.

Thanks

On Sun, Jun 21, 2015 at 8:09 AM, Martin Eppel (meppel) <meppel@cisco.com> wrote:
Hi Isuru,

No, the issue does not seem to be resolved. With the latest code I see issues in test cases
that used to work before (beyond the latest example I posted the log files for, see below);
not sure yet what is going on. I will investigate further (making sure I am not mistaken)
and follow up with some examples after the weekend, but if you guys could take a look on
Monday at the log files I provided with the previous email, that would be great.

Thanks

Martin

From: Imesh Gunaratne [mailto:imesh@apache.org]
Sent: Saturday, June 20, 2015 7:29 PM
To: dev
Cc: Lasindu Charith (lasindu@wso2.com); Reka Thirunavukkarasu (reka@wso2.com); Ryan Du
Plessis (rdupless)
Subject: Re: Testing Stratos 4.1: Application undeployment: application fails to undeploy
(nested grouping, group scaling)

Hi All,

I'm sorry I could not follow the entire discussion.
Can someone explain the latest status please? Have we resolved the initial group scaling issue
and are we now seeing an application removal problem?

Thanks

On Sat, Jun 20, 2015 at 2:06 AM, Martin Eppel (meppel) <meppel@cisco.com> wrote:
Hi Lasindu, Reka,


Just ran into the issue with removing the application again (with the fix for the issue included):

Please see [1a., 1b.] for the application structure (group scaling defined at only one group
level). See also the respective artifacts and log file attached.

Please advise if we should reopen the JIRA

Thanks

Martin


Application [1a.]

[cid:image002.png@01D0AD39.674B17A0]

[1b.] application after “starting application remove”

[cid:image003.png@01D0AD39.674B17A0]









--
Imesh Gunaratne

Senior Technical Lead, WSO2
Committer & PMC Member, Apache Stratos




--
Reka Thirunavukkarasu
Senior Software Engineer,
WSO2, Inc.:http://wso2.com,
Mobile: +94776442007



