stratos-dev mailing list archives

From Reka Thirunavukkarasu <r...@wso2.com>
Subject Re: Testing Stratos 4.1: Application undeployment: application fails to undeploy (nested grouping, group scaling)
Date Mon, 22 Jun 2015 14:29:31 GMT
Hi Martin,

I have verified the fix with read.write.lock.monitor.enabled=true, and it
worked fine. Since we use concurrency and delegate some of the flow to
threads, I had to set the thread pool sizes in stratos.sh to the values
below.

    -Dapplication.monitor.thread.pool.size=50 \
    -Dgroup.monitor.thread.pool.size=50 \

Please note that *it is recommended to keep
read.write.lock.monitor.enabled=false in production, as enabling it
increases the footprint*. This property was introduced for testing
purposes only.

We are in the process of analyzing the thread pool sizes and will come up
with recommended values for them.
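
For context, a JVM flag of the form -Dname=value becomes a system property
that code can read when sizing a thread pool. The sketch below only
illustrates that mechanism and is not the Stratos implementation: the class
name, the fallback value of 20, and the use of a fixed-size pool are all
assumptions. Only the two property names come from the flags above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: sizes two thread pools from JVM system properties.
class MonitorThreadPoolsSketch {

    // Assumed fallback used when a property is missing; not a Stratos default.
    static final int ASSUMED_DEFAULT_POOL_SIZE = 20;

    // Reads an integer system property, falling back when it is missing,
    // unparsable, or not positive.
    static int poolSize(String propertyName) {
        int size = Integer.getInteger(propertyName, ASSUMED_DEFAULT_POOL_SIZE);
        return size > 0 ? size : ASSUMED_DEFAULT_POOL_SIZE;
    }

    public static void main(String[] args) {
        int appSize = poolSize("application.monitor.thread.pool.size");
        int groupSize = poolSize("group.monitor.thread.pool.size");

        ExecutorService appPool = Executors.newFixedThreadPool(appSize);
        ExecutorService groupPool = Executors.newFixedThreadPool(groupSize);
        System.out.println("application monitor pool size: " + appSize);
        System.out.println("group monitor pool size: " + groupSize);

        // Shut the pools down so the JVM can exit.
        appPool.shutdown();
        groupPool.shutdown();
    }
}
```

Run with the two -D flags shown above, both sizes resolve to 50; without
them, the assumed fallback is used.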

Also, I have fixed a small issue in the REST endpoint: it used to return a
default value whenever the application runtime was not found. Now, if the
runtime is not found, the message below is returned.

{"status":"error","message":"Application runtime not found"}
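
As a rough illustration of the behaviour described above (not the actual
Stratos REST code), a lookup can return the error body instead of a default
object when no runtime exists. The class, the method, and the map-based
store are hypothetical; only the JSON error body is taken from this message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: returns an explicit error body when no runtime is found.
class ApplicationRuntimeLookupSketch {

    // Error body quoted in this message.
    static final String NOT_FOUND_BODY =
            "{\"status\":\"error\",\"message\":\"Application runtime not found\"}";

    // Returns the stored runtime JSON, or the error body when the id is unknown.
    static String runtimeResponse(Map<String, String> runtimes, String applicationId) {
        String runtimeJson = runtimes.get(applicationId);
        return runtimeJson != null ? runtimeJson : NOT_FOUND_BODY;
    }

    public static void main(String[] args) {
        Map<String, String> runtimes = new HashMap<>();
        runtimes.put("app-1", "{\"id\":\"app-1\"}"); // made-up payload for the demo
        System.out.println(runtimeResponse(runtimes, "app-1"));
        System.out.println(runtimeResponse(runtimes, "s-g-c1-c2-c3"));
    }
}
```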

I have also verified undeployment with group scaling and didn't find any
issues with the above fixes.

Please find the latest commit below:

0a969200d11228158606f011ca7e5e795f336d92

Please note that only the error below [1] was observed, and it is harmless
for now. I have verified that things work fine with a workaround, but I
will check the severity and decide whether to make a proper fix or go with
the workaround.

[1]. TID: [0] [STRATOS] [2015-06-22 14:22:01,872] ERROR
{org.apache.stratos.common.concurrent.locks.ReadWriteLockMonitor} -  System
error, lock has not released for 30 seconds: [lock-name] topology
[lock-type] Write [thread-id] 117 [thread-name] pool-24-thread-2
[stack-trace]
java.lang.Thread.getStackTrace(Thread.java:1589)
org.apache.stratos.common.concurrent.locks.ReadWriteLock.acquireWriteLock(ReadWriteLock.java:123)
org.apache.stratos.messaging.message.processor.topology.updater.TopologyUpdater.acquireWriteLockForService(TopologyUpdater.java:123)
org.apache.stratos.messaging.message.processor.topology.ApplicationClustersCreatedMessageProcessor.doProcess(ApplicationClustersCreatedMessageProcessor.java:78)
org.apache.stratos.messaging.message.processor.topology.ApplicationClustersCreatedMessageProcessor.process(ApplicationClustersCreatedMessageProcessor.java:59)
org.apache.stratos.messaging.message.processor.topology.ServiceRemovedMessageProcessor.process(ServiceRemovedMessageProcessor.java:64)
org.apache.stratos.messaging.message.processor.topology.ServiceCreatedMessageProcessor.process(ServiceCreatedMessageProcessor.java:65)
org.apache.stratos.messaging.message.processor.topology.CompleteTopologyMessageProcessor.process(CompleteTopologyMessageProcessor.java:74)
org.apache.stratos.messaging.message.processor.MessageProcessorChain.process(MessageProcessorChain.java:61)
org.apache.stratos.messaging.message.receiver.topology.TopologyEventMessageDelegator.run(TopologyEventMessageDelegator.java:73)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:745)
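
To make the log entry above easier to read: a lock monitor of this kind
typically records when each lock was acquired and reports any lock held
longer than a timeout. The following is a minimal sketch of that idea only.
It is not the code of ReadWriteLockMonitor, and every class and method name
in it is hypothetical. The 30-second threshold mirrors the log message, and
the clock is passed in explicitly so the behaviour is easy to check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: tracks lock acquisitions and reports long-held locks.
class LockHoldMonitorSketch {

    // Snapshot taken when a thread acquires a lock.
    private static final class Held {
        final String lockName;
        final String threadName;
        final long acquiredAtMillis;
        final StackTraceElement[] stack;

        Held(String lockName, String threadName, long acquiredAtMillis,
             StackTraceElement[] stack) {
            this.lockName = lockName;
            this.threadName = threadName;
            this.acquiredAtMillis = acquiredAtMillis;
            this.stack = stack;
        }
    }

    private final Map<String, Held> held = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    LockHoldMonitorSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Assumes thread names are unique, as pool-generated names usually are.
    private static String key(String lockName, Thread thread) {
        return lockName + "#" + thread.getName();
    }

    // Call right after the current thread acquires the named lock.
    void onAcquire(String lockName, long nowMillis) {
        Thread t = Thread.currentThread();
        held.put(key(lockName, t),
                new Held(lockName, t.getName(), nowMillis, t.getStackTrace()));
    }

    // Call right after the current thread releases the named lock.
    void onRelease(String lockName) {
        held.remove(key(lockName, Thread.currentThread()));
    }

    // Returns one report line per lock held for at least the timeout.
    List<String> overdue(long nowMillis) {
        List<String> reports = new ArrayList<>();
        for (Held h : held.values()) {
            long heldMillis = nowMillis - h.acquiredAtMillis;
            if (heldMillis >= timeoutMillis) {
                reports.add("lock held for " + (heldMillis / 1000)
                        + " seconds: [lock-name] " + h.lockName
                        + " [thread-name] " + h.threadName
                        + " [stack-depth] " + h.stack.length);
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        LockHoldMonitorSketch monitor = new LockHoldMonitorSketch(30_000L);
        monitor.onAcquire("topology", 0L);
        // Simulated clock: 31 seconds later the lock is still held, so it is reported.
        monitor.overdue(31_000L).forEach(System.out::println);
        monitor.onRelease("topology");
    }
}
```

A monitor like this only reports; it does not release anything, which is
consistent with the error above being logged while the system keeps running.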

Thanks,
Reka



On Mon, Jun 22, 2015 at 12:24 PM, Reka Thirunavukkarasu <reka@wso2.com>
wrote:

> Hi Martin,
>
> Found the reason why we didn't encounter these locking issues: we were
> testing with the default Stratos pack, which has
> read.write.lock.monitor.enabled=false. The locking warning or issue is
> raised only when read.write.lock.monitor.enabled=true is used. That's why
> only you were facing these locking issues, as you use this configuration
> in your setup.
>
> Since I'm able to reproduce the issue, I will test with the fix that I
> already pushed and update the thread.
>
> We will discuss making read.write.lock.monitor.enabled=true the default
> in Stratos, so that we can find such issues early and fix them.
>
> Thanks,
> Reka
>
> On Mon, Jun 22, 2015 at 12:16 AM, Reka Thirunavukkarasu <reka@wso2.com>
> wrote:
>
>> Sorry Martin, I had only fixed the issue locally. I have pushed it just
>> now. Can you test with 1c21daaeea7b27ad0a0141a70b32e9443e78e309 when you
>> get a chance? I will also continue testing with this fix.
>>
>> Thanks,
>> Reka
>>
>> On Mon, Jun 22, 2015 at 12:07 AM, Martin Eppel (meppel) <meppel@cisco.com
>> > wrote:
>>
>>> Btw,
>>>
>>> This is the last commit I picked up from the Stratos master:
>>>
>>> commit 58bea52be814269416f70391fef50859aa5ae0a1
>>> Author: lasinducharith <lasinducharith@gmail.com>
>>> Date:   Fri Jun 19 19:40:27 2015 +0530
>>>
>>> *From:* Martin Eppel (meppel)
>>> *Sent:* Sunday, June 21, 2015 10:28 AM
>>> *To:* dev@stratos.apache.org; Reka Thirunavukkarasu
>>> *Cc:* Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis (rdupless)
>>> *Subject:* RE: Testing Stratos 4.1: Application undeployment:
>>> application fails to undeploy (nested grouping, group scaling)
>>>
>>> Hi Reka,
>>>
>>> Here is *another* example which fails; see the application at [1.] and
>>> the attached log files and JSONs. I ran a few scenarios; the one which is
>>> failing is the application named “s-g-c1-c2-c3” (last scenario). All
>>> members get removed but the application remains deployed:
>>>
>>> “s-g-c1-c2-c3: applicationInstances 0, groupInstances 0,
>>> clusterInstances 0, members 0 ()”
>>>
>>> Thanks
>>>
>>> Martin
>>>
>>> *From:* Imesh Gunaratne [mailto:imesh@apache.org <imesh@apache.org>]
>>> *Sent:* Sunday, June 21, 2015 1:32 AM
>>> *To:* Reka Thirunavukkarasu
>>> *Cc:* dev; Lasindu Charith (lasindu@wso2.com); Ryan Du Plessis
>>> (rdupless)
>>> *Subject:* Re: Testing Stratos 4.1: Application undeployment:
>>> application fails to undeploy (nested grouping, group scaling)
>>>
>>>
>>>
>>> Great! Thanks Reka!
>>>
>>>
>>>
>>> On Sun, Jun 21, 2015 at 8:34 AM, Reka Thirunavukkarasu <reka@wso2.com>
>>> wrote:
>>>
>>> Hi Martin/Imesh,
>>>
>>> Sure, I will have a look at the logs. I will also go through the recent
>>> commits and try to revert the fix that was added for nested group
>>> scaling, as it is not needed for this release. As Martin mentioned, there
>>> are more issues after the fixes; otherwise, we will have to go through
>>> another major testing effort.
>>>
>>> I will update the thread on the progress.
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Reka
>>>
>>>
>>>
>>> On Sun, Jun 21, 2015 at 8:14 AM, Imesh Gunaratne <imesh@apache.org>
>>> wrote:
>>>
>>> Hi Martin,
>>>
>>>
>>>
>>> Thanks for the quick response. Yes we will definitely go through the
>>> logs and investigate this.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Sun, Jun 21, 2015 at 8:09 AM, Martin Eppel (meppel) <meppel@cisco.com>
>>> wrote:
>>>
>>> Hi Isuru,
>>>
>>>
>>>
>>> No, the issue does not seem to be resolved. With the latest code I see
>>> issues in test cases which used to work before (beyond the latest example
>>> I posted the log files for, see below). I am not sure yet what is going
>>> on. I will investigate further (to make sure I am not mistaken) and follow
>>> up with some examples after the weekend, but if you could take a look on
>>> Monday at the log files I provided with the previous email, that would be
>>> great.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Martin
>>>
>>>
>>>
>>> *From:* Imesh Gunaratne [mailto:imesh@apache.org]
>>> *Sent:* Saturday, June 20, 2015 7:29 PM
>>> *To:* dev
>>> *Cc:* Lasindu Charith (lasindu@wso2.com); Reka Thirunavukkarasu (
>>> reka@wso2.com); Ryan Du Plessis (rdupless)
>>> *Subject:* Re: Testing Stratos 4.1: Application undeployment:
>>> application fails to undeploy (nested grouping, group scaling)
>>>
>>>
>>>
>>> Hi All,
>>>
>>>
>>>
>>> I'm sorry I could not follow the entire discussion.
>>>
>>> Can someone explain the latest status please? Have we resolved the
>>> initial group scaling issue, and are we now seeing an application removal
>>> problem?
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> On Sat, Jun 20, 2015 at 2:06 AM, Martin Eppel (meppel) <meppel@cisco.com>
>>> wrote:
>>>
>>> Hi Lasindu, Reka,
>>>
>>> Just ran into the issue with removing the application *again* (with
>>> the fix for the issue included).
>>>
>>>
>>>
>>> Please see [1a., 1b.] for the application structure (group scaling
>>> defined at only one group level). See also the respective artifacts and log
>>> file attached.
>>>
>>>
>>>
>>> Please advise if we should reopen the JIRA.
>>>
>>>
>>>
>>> Thanks
>>>
>>>
>>>
>>> Martin
>>>
>>> Application [1a.]
>>>
>>> [1b.] application after “starting application remove”
>>>
>>> --
>>>
>>> Imesh Gunaratne
>>>
>>>
>>>
>>> Senior Technical Lead, WSO2
>>>
>>> Committer & PMC Member, Apache Stratos
>>>
>>>   --
>>>
>>> Reka Thirunavukkarasu
>>> Senior Software Engineer,
>>> WSO2, Inc.:http://wso2.com,
>>>
>>> Mobile: +94776442007
>>>
>>


-- 
Reka Thirunavukkarasu
Senior Software Engineer,
WSO2, Inc.:http://wso2.com,
Mobile: +94776442007
