geronimo-user mailing list archives

From Jack Cai <greensi...@gmail.com>
Subject Re: 2.2 in production
Date Thu, 04 Mar 2010 15:50:38 GMT
+1

-Jack

On Thu, Mar 4, 2010 at 4:12 PM, Shawn Jiang <genspring@gmail.com> wrote:

> WADI is still at 2.1.1 in both the 2.2 branch and trunk. If there are no
> objections, I'll upgrade it to 2.1.2.
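[Editor's note: assuming the WADI version is centralized in the build's parent POM, the bump Shawn describes would be a one-line change like the sketch below. The property name `wadiVersion` is an assumption; check the actual Geronimo pom for the real key.]

```xml
<properties>
    <!-- hypothetical property name; the real pom may differ -->
    <wadiVersion>2.1.2</wadiVersion>
</properties>
```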
>
>
> On Sat, Nov 14, 2009 at 2:59 PM, Gianny Damour <
> gianny.damour@optusnet.com.au> wrote:
>
>> Hi Trygve,
>>
>> Thanks for your test.
>>
>> These warnings indicate that a session replication message was not
>> acknowledged within a 2000ms timeframe. The message has not been lost and
>> will be processed by back-up nodes as expected.
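[Editor's note: the behaviour Gianny describes, send a replication message and wait a bounded 2000ms for the correlated acknowledgement, can be sketched as below. All names (`ReplicationSender`, `sendUpdate`) are illustrative, not the actual WADI API; the point is that a timeout only produces a WARN, since the back-up node still processes the message when it arrives.]

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a replication send with a bounded ack wait.
public class ReplicationSender {
    static final long ACK_TIMEOUT_MS = 2000; // matches the [2000]ms in the log

    // Returns true if the correlated acknowledgement arrived in time.
    static boolean sendUpdate(Runnable transport, CountDownLatch ack)
            throws InterruptedException {
        transport.run(); // fire the replication message at the back-up node
        // Wait for the correlated reply; on timeout we only warn, because
        // the message is not lost and will still be processed remotely.
        return ack.await(ACK_TIMEOUT_MS, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        CountDownLatch ack = new CountDownLatch(1);
        // Simulate a peer that acknowledges immediately.
        boolean ok = sendUpdate(ack::countDown, ack);
        System.out.println(ok ? "acknowledged" : "WARN: no ack within 2000ms");
    }
}
```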
>>
>>
>> Kevan, I just deployed release 2.1.2 of WADI for geronimo 2.2.
>>
>>
>> Thanks,
>> Gianny
>>
>>
>>
>> On 09/11/2009, at 11:38 PM, Trygve Hardersen wrote:
>>
>>  Hello again
>>>
>>> The system has now been running for a few days and I've not seen the hang
>>> on startup, so I believe the issue has been resolved. Great!
>>>
>>> I am still seeing some warnings like this:
>>>
>>> 10:58:49,250 WARN  [UpdateReplicationCommand] Update has not been
>>> properly cascaded due to a communication failure. If a targeted node has
>>> been lost, state will be re-balanced automatically.
>>> org.codehaus.wadi.servicespace.ServiceInvocationException:
>>> org.codehaus.wadi.group.MessageExchangeException: No correlated messages
>>> received within [2000]ms
>>>        at
>>> org.codehaus.wadi.servicespace.basic.CGLIBServiceProxyFactory$ProxyMethodInterceptor.intercept(CGLIBServiceProxyFactory.java:209)
>>>        at
>>> org.codehaus.wadi.replication.storage.ReplicaStorage$$EnhancerByCGLIB$$4372d264.mergeUpdate(<generated>)
>>>        at
>>> org.codehaus.wadi.replication.manager.basic.UpdateReplicationCommand.cascadeUpdate(UpdateReplicationCommand.java:93)
>>>        at
>>> org.codehaus.wadi.replication.manager.basic.UpdateReplicationCommand.run(UpdateReplicationCommand.java:86)
>>>        at
>>> org.codehaus.wadi.replication.manager.basic.SyncReplicationManager.update(SyncReplicationManager.java:138)
>>>        at
>>> org.codehaus.wadi.replication.manager.basic.LoggingReplicationManager.update(LoggingReplicationManager.java:100)
>>>        at
>>> org.codehaus.wadi.core.session.AbstractReplicableSession.onEndProcessing(AbstractReplicableSession.java:49)
>>>        at
>>> org.codehaus.wadi.core.session.AtomicallyReplicableSession.onEndProcessing(AtomicallyReplicableSession.java:58)
>>>        at
>>> org.apache.geronimo.clustering.wadi.WADISessionAdaptor.onEndAccess(WADISessionAdaptor.java:77)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.ClusteredSessionManager.complete(ClusteredSessionManager.java:60)
>>>        at
>>> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:198)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.ClusteredSessionHandler.doHandle(ClusteredSessionHandler.java:59)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.ClusteredSessionHandler$ActualHandler.handle(ClusteredSessionHandler.java:66)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.AbstractClusteredPreHandler$WebClusteredInvocation.invokeLocally(AbstractClusteredPreHandler.java:71)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.wadi.WADIClusteredPreHandler$WADIWebClusteredInvocation.access$000(WADIClusteredPreHandler.java:52)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.wadi.WADIClusteredPreHandler$WADIWebClusteredInvocation$1.doFilter(WADIClusteredPreHandler.java:64)
>>>        at
>>> org.codehaus.wadi.web.impl.WebInvocation.invoke(WebInvocation.java:116)
>>>        at
>>> org.codehaus.wadi.core.contextualiser.MemoryContextualiser.handleLocally(MemoryContextualiser.java:71)
>>>        at
>>> org.codehaus.wadi.core.contextualiser.AbstractExclusiveContextualiser.handle(AbstractExclusiveContextualiser.java:94)
>>>        at
>>> org.codehaus.wadi.core.contextualiser.AbstractMotingContextualiser.contextualise(AbstractMotingContextualiser.java:37)
>>>        at
>>> org.codehaus.wadi.core.manager.StandardManager.processStateful(StandardManager.java:154)
>>>        at
>>> org.codehaus.wadi.core.manager.StandardManager.contextualise(StandardManager.java:146)
>>>        at
>>> org.codehaus.wadi.core.manager.ClusteredManager.contextualise(ClusteredManager.java:81)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.wadi.WADIClusteredPreHandler$WADIWebClusteredInvocation.invoke(WADIClusteredPreHandler.java:72)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.AbstractClusteredPreHandler.handle(AbstractClusteredPreHandler.java:39)
>>>        at
>>> org.apache.geronimo.jetty6.cluster.ClusteredSessionHandler.handle(ClusteredSessionHandler.java:51)
>>>        at
>>> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765)
>>>        at
>>> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
>>>        at
>>> org.apache.geronimo.jetty6.handler.TwistyWebAppContext.access$101(TwistyWebAppContext.java:41)
>>>        at
>>> org.apache.geronimo.jetty6.handler.TwistyWebAppContext$TwistyHandler.handle(TwistyWebAppContext.java:66)
>>>        at
>>> org.apache.geronimo.jetty6.handler.ThreadClassloaderHandler.handle(ThreadClassloaderHandler.java:46)
>>>        at
>>> org.apache.geronimo.jetty6.handler.InstanceContextHandler.handle(InstanceContextHandler.java:58)
>>>        at
>>> org.apache.geronimo.jetty6.handler.UserTransactionHandler.handle(UserTransactionHandler.java:48)
>>>        at
>>> org.apache.geronimo.jetty6.handler.ComponentContextHandler.handle(ComponentContextHandler.java:47)
>>>        at
>>> org.apache.geronimo.jetty6.handler.TwistyWebAppContext.handle(TwistyWebAppContext.java:60)
>>>        at
>>> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>>        at
>>> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>>        at
>>> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>>        at org.mortbay.jetty.Server.handle(Server.java:326)
>>>        at
>>> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
>>>        at
>>> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:879)
>>>        at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:747)
>>>        at
>>> org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218)
>>>        at
>>> org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
>>>        at
>>> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
>>>        at org.apache.geronimo.pool.ThreadPool$1.run(ThreadPool.java:214)
>>>        at
>>> org.apache.geronimo.pool.ThreadPool$ContextClassLoaderRunnable.run(ThreadPool.java:344)
>>>        at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>        at
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>        at java.lang.Thread.run(Thread.java:619)
>>> Caused by: org.codehaus.wadi.group.MessageExchangeException: No
>>> correlated messages received within [2000]ms
>>>        at
>>> org.codehaus.wadi.group.impl.AbstractDispatcher.attemptMultiRendezVous(AbstractDispatcher.java:174)
>>>        at
>>> org.codehaus.wadi.servicespace.basic.BasicServiceInvoker.invokeOnPeers(BasicServiceInvoker.java:90)
>>>        at
>>> org.codehaus.wadi.servicespace.basic.BasicServiceInvoker.invoke(BasicServiceInvoker.java:69)
>>>        at
>>> org.codehaus.wadi.servicespace.basic.CGLIBServiceProxyFactory$ProxyMethodInterceptor.intercept(CGLIBServiceProxyFactory.java:193)
>>>        ... 49 more
>>>
>>> Not sure if I should worry about that.
>>>
>>> Many thanks for your help!
>>>
>>> Trygve
>>>
>>>
>>> On Tue, Nov 3, 2009 at 4:35 PM, Trygve Hardersen <trygve@jotta.no>
>>> wrote:
>>>
>>>
>>> On Mon, Nov 2, 2009 at 5:39 PM, Kevan Miller <kevan.miller@gmail.com>
>>> wrote:
>>>
>>> Thanks Gianny. I'd like to see this included in the Geronimo 2.2 release.
>>> Can we look for a new WADI release soon, once we know the problem is fixed?
>>>
>>> Trygve, the sooner we get confirmation that your issue is resolved, the
>>> sooner we can start finalizing the 2.2 release.
>>>
>>> --kevan
>>>
>>> I've now built Geronimo using the 2.2-SNAPSHOT of WADI and installed it
>>> on our test environment. No obvious issues, so I'll go ahead and deploy it
>>> to production either later this evening (EU time) or tomorrow. It then needs
>>> to run for a few days before I can confirm whether the issue has really been
>>> resolved.
>>>
>>> BTW I got this on our test system:
>>>
>>> AS-000:
>>> 16:23:17,773 INFO  [TcpFailureDetector] Received
>>> memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{-64,
>>> -88, 1, 61}:4000,{-64, -88, 1, 61},4000, alive=1814258,id={50 18 86 10 111
>>> -47 79 83 -108 -4 82 -8 26 82 -79 -59 }, payload={-84 -19 0 5 115 114 0 50
>>> 111 ...(423)}, command={}, domain={74 79 84 84 65 95 87 65 68 ...(10)}, ]]
>>> message. Will verify.
>>> 16:23:17,897 INFO  [TcpFailureDetector] Verification complete. Member
>>> still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{-64,
>>> -88, 1, 61}:4000,{-64, -88, 1, 61},4000, alive=1814258,id={50 18 86 10 111
>>> -47 79 83 -108 -4 82 -8 26 82 -79 -59 }, payload={-84 -19 0 5 115 114 0 50
>>> 111 ...(423)}, command={}, domain={74 79 84 84 65 95 87 65 68 ...(10)}, ]]
>>>
>>> AS-001:
>>> 16:23:18,446 INFO  [TcpFailureDetector] Received
>>> memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{-64,
>>> -88, 1, 60}:4000,{-64, -88, 1, 60},4000, alive=2500759,id={107 -64 91 -23
>>> 109 93 75 116 -95 109 110 22 -85 53 -52 85 }, payload={-84 -19 0 5 115 114 0
>>> 50 111 ...(423)}, command={}, domain={74 79 84 84 65 95 87 65 68 ...(10)},
>>> ]] message. Will verify.
>>> 16:23:18,456 INFO  [TcpFailureDetector] Verification complete. Member
>>> still alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{-64,
>>> -88, 1, 60}:4000,{-64, -88, 1, 60},4000, alive=2500759,id={107 -64 91 -23
>>> 109 93 75 116 -95 109 110 22 -85 53 -52 85 }, payload={-84 -19 0 5 115 114 0
>>> 50 111 ...(423)}, command={}, domain={74 79 84 84 65 95 87 65 68 ...(10)},
>>> ]]
>>>
>>> And then:
>>>
>>> AS-000
>>> 16:30:02,576 INFO  [ChannelInterceptorBase] memberDisappeared:tcp://{-64,
>>> -88, 1, 61}:4000
>>> 16:30:02,577 INFO  [BasicPartitionBalancerSingletonService] Queueing
>>> partition rebalancing
>>> 16:30:02,600 INFO  [SimpleStateManager]
>>> =============================
>>> New Partition Balancing
>>> Partition Balancing
>>>    Size [24]
>>>    Partition[0] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[1] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[2] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[3] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[4] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[5] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[6] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[7] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[8] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[9] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[10] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[11] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[12] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[13] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[14] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[15] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[16] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[17] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[18] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[19] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[20] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[21] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[22] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>>    Partition[23] owned by [TribesPeer [AS-000; tcp://192.168.1.60:4000]];
>>> version [3]; mergeVersion [0]
>>> =============================
>>>
>>> 16:30:02,888 WARN  [TcpFailureDetector] Member added, even though we
>>> werent notified:org.apache.catalina.tribes.membership.MemberImpl[tcp://{-64,
>>> -88, 1, 61}:4000,{-64, -88, 1, 61},4000, alive=2221072,id={50 18 86 10 111
>>> -47 79 83 -108 -4 82 -8 26 82 -79 -59 }, payload={-84 -19 0 5 115 114 0 50
>>> 111 ...(423)}, command={}, domain={74 79 84 84 65 95 87 65 68 ...(10)}, ]
>>> 16:30:02,889 INFO  [ChannelInterceptorBase] memberAdded:tcp://{-64, -88,
>>> 1, 61}:4000
>>>
>>> AS-001
>>> Nothing....
>>>
>>> There is practically no load on this network. Anyway, I'll try this with
>>> load and see what happens.
>>>
>>> Many thanks again!
>>>
>>> Trygve
>>>
>>>
>>
>
>
> --
> Shawn
>
