openejb-dev mailing list archives

From Quintin Beukes <quin...@skywalk.co.za>
Subject Re: Connection Recovery
Date Tue, 24 Nov 2009 08:48:39 GMT
The original problem. I'm just trying to understand why recovery takes
so long for an already-running client, even after newly started clients
are working fine.

The "cross over server" sounds like a very good idea, though I would
still like to know how the recovery works.

Quintin Beukes
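
The "cross over server" idea quoted below relies on the client discovering
servers instead of being pinned to one address. Here is a minimal sketch of
the client-side JNDI properties for that setup. It assumes the OpenEJB remote
client factory class and a multicast provider URL; the group address and port
are assumptions and must match whatever the server's multicast service is
actually configured with. DiscoveryClientConfig and jndiProperties are
hypothetical names used only for illustration.

```java
import java.util.Objects;
import java.util.Properties;
import javax.naming.Context;

final class DiscoveryClientConfig {
    // Assumption: multicast group and port; check the server's multicast settings.
    static final String ASSUMED_MULTICAST_URL = "multicast://239.255.2.3:6142";

    // Assumption: the remote client factory shipped in the OpenEJB client jar.
    static final String REMOTE_FACTORY =
            "org.apache.openejb.client.RemoteInitialContextFactory";

    private DiscoveryClientConfig() {
    }

    // Builds the properties a client would hand to new InitialContext(props).
    static Properties jndiProperties(String providerUrl) {
        Objects.requireNonNull(providerUrl, "providerUrl");
        Properties props = new Properties();
        props.setProperty(Context.INITIAL_CONTEXT_FACTORY, REMOTE_FACTORY);
        props.setProperty(Context.PROVIDER_URL, providerUrl);
        return props;
    }
}
```

Passing the result to new InitialContext(...) needs the OpenEJB client jar on
the classpath; the sketch itself only builds the Properties object.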



On Tue, Nov 24, 2009 at 10:36 AM, David Blevins <david.blevins@visi.com> wrote:
>
> On Nov 24, 2009, at 12:14 AM, Quintin Beukes wrote:
>
>> But the problem is that after I redeploy, requests from a client that
>> was already running fail for up to 6 minutes, even while newly booted
>> clients are working fine. So I would do this:
>>
>> 1. Run ClientA
>> 2. Undeploy, deploy app - deploy completed successfully
>> 3. ClientA fails
>> 4. Run ClientB
>> 5. ClientB succeeds
>> 6. ClientA still fails
>> 7. 5-6 minutes later ClientA starts succeeding.
>>
>> Why is this? After the 5 minute period, what triggers the recovery to
>> start?
>
> Just so I make sure I'm on the same page, the above is a recap of the
> original problem?  Or are you now seeing 5-6 minute delays in the failover
> logic from one server to the next?
>
> -David
>
>> On Mon, Nov 23, 2009 at 10:32 PM, David Blevins <david.blevins@visi.com> wrote:
>>>
>>> This is because there is no concept of blocking connections for
>>> applications that are being redeployed.  In fact our (Geronimo and
>>> OpenEJB) redeploy is just an undeploy followed by a deploy, so there's
>>> no special concept propagated throughout the architecture that
>>> something is only temporarily gone and will be coming back.
>>>
>>> That said, if we added all that, you'd still have to wait the exact same
>>> amount of time for the undeploy/deploy process plus a small shave off of
>>> total throughput for the synchronization we'd have to add to make the
>>> wait-for-redeploy logic work.  The bigger your app gets, the longer the
>>> time your client spends waiting, which is probably going to drive you to
>>> splinter your app into tiny fragments which only complicates things more
>>> and adds even more overhead.
>>>
>>> This is one of those "change the problem" types of situations.  Instead,
>>> enable multicast discovery on your server and just let it run.  When you
>>> want to "redeploy", just boot a new server with the new version of the
>>> app.  When you get that server running to your liking -- almost everyone
>>> has some sort of initialization they like to do in their apps -- just
>>> shut down the first server and the client will immediately roll over to
>>> the second server.  No time spent waiting on redeploy and you actually
>>> get a chance to pound on your new server a bit before it goes live, so no
>>> more risk that the app doesn't quite work after the redeploy.
>>>
>>> You can do this all on the same machine with a shell script or ant script
>>> that is only slightly fancier than your standard start, deploy, run type
>>> of script.
>>>
>>> -David
>>>
>>> On Nov 23, 2009, at 6:20 AM, Quintin Beukes wrote:
>>>
>>>> Hey,
>>>>
>>>> One of the big motivations for shifting to OpenEJB 3.1.2 from GF
>>>> (Glassfish) was the connection recovery built into it. My inquiry was
>>>> specifically about remote EJB connections that fail because of a
>>>> server restart or a network hiccup. We've always had problems with
>>>> this, especially with networks running from surface to underground.
>>>>
>>>> This seems to be working. Great job!! Thanks a lot. I'm aiming for
>>>> complete recovery in the following situations.
>>>> 1. System exception (any failure on the part of OpenEJB, even if the
>>>> cause was my own)
>>>> 2. Dropped connections due to network
>>>> 3. Dropped connections from server restart
>>>> 4. Dropped connections from redeploy (basically the same as (3), I assume).
>>>>
>>>> I do have a question, though. Every time I redeploy, for example, I
>>>> have to manually go and restart all services, because it takes very
>>>> long for the OpenEJB client to notice something is wrong and recover.
>>>>
>>>> Here is an example: If I undeploy, I get something like the following
>>>> on a client.
>>>>
>>>> --------- SNIP -----------
>>>>
>>>> javax.ejb.EJBException: Container has suffered a SystemException
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:178)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> java.rmi.RemoteException: The server has encountered a fatal error: No such deployment java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean; nested exception is:
>>>>       java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.replyWithFatalError(EjbRequestHandler.java:425)
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.processRequest(EjbRequestHandler.java:81)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.processEjbRequest(EjbDaemon.java:196)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.service(EjbDaemon.java:149)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:71)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer$Session.service(KeepAliveServer.java:213)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer.service(KeepAliveServer.java:233)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:66)
>>>>  org.apache.openejb.server.ServicePool$2.run(ServicePool.java:91)
>>>>  org.apache.openejb.server.ServicePool$3.run(ServicePool.java:120)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>  java.lang.Thread.run(Thread.java:619)
>>>>
>>>> java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.getDeployment(EjbDaemon.java:191)
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.processRequest(EjbRequestHandler.java:79)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.processEjbRequest(EjbDaemon.java:196)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.service(EjbDaemon.java:149)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:71)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer$Session.service(KeepAliveServer.java:213)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer.service(KeepAliveServer.java:233)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:66)
>>>>  org.apache.openejb.server.ServicePool$2.run(ServicePool.java:91)
>>>>  org.apache.openejb.server.ServicePool$3.run(ServicePool.java:120)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>  java.lang.Thread.run(Thread.java:619)
>>>>
>>>> 23 Nov 2009 15:59:20,039 ERROR -- Exception: Service ConfigurationService failed for instance ConfigurationService[/10.0.0.25] (java.lang.Exception)
>>>> java.lang.Exception: ConfigurationService has an unknown failure while servicing a request.
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:84)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> javax.ejb.EJBException: Container has suffered a SystemException
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:178)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> java.rmi.RemoteException: The server has encountered a fatal error: No such deployment java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean; nested exception is:
>>>>       java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.replyWithFatalError(EjbRequestHandler.java:425)
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.processRequest(EjbRequestHandler.java:81)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.processEjbRequest(EjbDaemon.java:196)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.service(EjbDaemon.java:149)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:71)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer$Session.service(KeepAliveServer.java:213)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer.service(KeepAliveServer.java:233)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:66)
>>>>  org.apache.openejb.server.ServicePool$2.run(ServicePool.java:91)
>>>>  org.apache.openejb.server.ServicePool$3.run(ServicePool.java:120)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>  java.lang.Thread.run(Thread.java:619)
>>>>
>>>> java.rmi.RemoteException: No deployment: VDS-lamps-ejb-3.0.jar/LampCaptureBean
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.getDeployment(EjbDaemon.java:191)
>>>>  org.apache.openejb.server.ejbd.EjbRequestHandler.processRequest(EjbRequestHandler.java:79)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.processEjbRequest(EjbDaemon.java:196)
>>>>  org.apache.openejb.server.ejbd.EjbDaemon.service(EjbDaemon.java:149)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:71)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer$Session.service(KeepAliveServer.java:213)
>>>>  org.apache.openejb.server.ejbd.KeepAliveServer.service(KeepAliveServer.java:233)
>>>>  org.apache.openejb.server.ejbd.EjbServer.service(EjbServer.java:66)
>>>>  org.apache.openejb.server.ServicePool$2.run(ServicePool.java:91)
>>>>  org.apache.openejb.server.ServicePool$3.run(ServicePool.java:120)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>  java.lang.Thread.run(Thread.java:619)
>>>>
>>>> --------- SNIP -----------
>>>>
>>>> Then, for a while, even after the EJB service started up completely
>>>> and even though I'm creating new InitialContext instances I get the
>>>> following. This isn't too bad, though it does take a few minutes
>>>> (between 3 and 5) to recover.
>>>>
>>>> --------- SNIP -----------
>>>>
>>>> 23 Nov 2009 16:00:24,802 TRACE -- Received packet [R0010088220000000000000000]
>>>> 23 Nov 2009 16:00:24,804 ERROR -- Exception: Unknown failure servicing the request (javax.ejb.EJBException)
>>>> javax.ejb.EJBException: Unknown Container Exception: java.rmi.RemoteException: Received invalid response code from server: -1
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:184)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> java.rmi.RemoteException: Received invalid response code from server: -1
>>>>  org.apache.openejb.client.EJBObjectHandler.businessMethod(EJBObjectHandler.java:239)
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:157)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> 23 Nov 2009 16:00:24,804 ERROR -- Exception: Service ConfigurationService failed for instance ConfigurationService[/10.0.0.25] (java.lang.Exception)
>>>> java.lang.Exception: ConfigurationService has an unknown failure while servicing a request.
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:84)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> javax.ejb.EJBException: Unknown Container Exception: java.rmi.RemoteException: Received invalid response code from server: -1
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:184)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> java.rmi.RemoteException: Received invalid response code from server: -1
>>>>  org.apache.openejb.client.EJBObjectHandler.businessMethod(EJBObjectHandler.java:239)
>>>>  org.apache.openejb.client.EJBObjectHandler._invoke(EJBObjectHandler.java:157)
>>>>  org.apache.openejb.client.EJBInvocationHandler.invoke(EJBInvocationHandler.java:117)
>>>>  org.apache.openejb.client.proxy.Jdk13InvocationHandler.invoke(Jdk13InvocationHandler.java:52)
>>>>  $Proxy2.captureLamp(Unknown Source)
>>>>  net.kunye.vds.server.services.configuration.ConfigurationService.service(ConfigurationService.java:73)
>>>>  net.kunye.services.tcp.TCPServiceThread.run(TCPServiceThread.java:86)
>>>>
>>>> --------- SNIP -----------
>>>>
>>>> After it recovers from the above, all is back to normal.
>>>>
>>>> This is quite serious. How can I get around this, even if I have to
>>>> make some code changes?
>>>>
>>>> Thanks again. I'm so relieved this problem is finally over. Previously
>>>> I've had to do some serious recovery just because a client was left
>>>> running and Glassfish's client library opened so many file descriptors
>>>> for each failure that the server crashed with a No Space Left on
>>>> Device error.
>>>>
>>>> Quintin Beukes
>>>>
>>>
>>>
>>
>
>
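
On the question in the original message about getting around this with code
changes: one client-side option is to discard the bean reference and look it
up again whenever a call fails, then retry. The sketch below shows that
pattern with plain JDK types only. RelookupRetry and its parameters are
hypothetical names, not part of OpenEJB, and nothing in this thread
establishes that a fresh lookup shortens the 3 to 6 minute window described
above, so treat it as something to experiment with rather than a fix.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

final class RelookupRetry {
    private RelookupRetry() {
    }

    // Runs lookup + invocation up to maxAttempts times, pausing between tries.
    // lookup:     obtains a fresh bean reference (e.g. a new JNDI lookup).
    // invocation: the business call made on that reference.
    static <B, R> R call(Callable<B> lookup,
                         Function<B, R> invocation,
                         int maxAttempts,
                         long pauseMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                // Never reuse a reference from a failed attempt.
                B bean = lookup.call();
                return invocation.apply(bean);
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        // Every attempt failed: surface the most recent failure.
        throw last;
    }
}
```

In a real client the lookup argument would create a new InitialContext and
look the bean up again, and the invocation argument would be the remote
business call. A production version would retry only on failures known to be
transient and would cap the total wait.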
