river-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike McGrady <michael.mcgr...@gmail.com>
Subject Re: client hang in com.sun.jini.jeri.internal.mux.Mux.start()
Date Fri, 29 Apr 2011 21:38:17 GMT
Throwing my two cents in here just to state my opinion.  This is an effort that could pay dividends
if it were done with a view toward the future - absolutely no pun intended.  I do not know
the details of the code but if futures could be useful here, they would be welcomed by myself.

MG


On Apr 29, 2011, at 1:33 PM, Christopher Dolan wrote:

> Thanks, Tom.
> 
> I don't really understand your Future suggestion.  Are you suggesting
> changing the async handshake to a Future? If so, that sounds like a very
> involved change, touching a lot of code in Mux and its subclasses.
> 
> setDown() changes the value of the muxDown boolean to true, so it's a
> valid way out of the loop.
> 
> Chris
> 
> -----Original Message-----
> From: Tom Hobbs [mailto:tvhobbs@googlemail.com] 
> Sent: Friday, April 29, 2011 2:27 PM
> To: dev@river.apache.org
> Subject: Re: client hang in com.sun.jini.jeri.internal.mux.Mux.start()
> 
> The proposed code looks fine to me.  Two points leap out, more
> discussion
> points than anything else.
> 
> For some reason I've recently developed an aversion to writing my on
> timeout
> logic.  did you consider using something like a Future here or might
> that be
> serious overkill (it wouldn't surprise me if it was)?
> 
> Also is Setdown intended to break out of the while loop?  Because I
> can't
> see a way to escape it.  (I don't have the rest of the code in front of
> me)
> 
> Thanks for keep raising these issues - especially because you usually
> supply
> fixes for them!
> 
> Tom
> 
> On 29 Apr 2011 19:41, "Christopher Dolan" <christopher.dolan@avid.com>
> wrote:
>> I've experienced occasional cases where clients get stuck in the
>> following block of code in Mux.start. Has anyone experienced this
>> problem? I have a proposed solution below. Has anyone thought about a
>> similar solution already?
>> 
>> -- Current code --
>> 1 asyncSendClientConnectionHeader();
>> 2 synchronized (muxLock) {
>> 3 while (!muxDown && !clientConnectionReady) {
>> 4 try {
>> 5 muxLock.wait(); // REMIND: timeout?
>> 6 } catch (InterruptedException e) {
>> 7 ...
>> 8 }
>> 9 }
>> 10 if (muxDown) {
>> 11 IOException ioe = new IOException(muxDownMessage);
>> 12 ioe.initCause(muxDownCause);
>> 13 throw ioe;
>> 14 }
>> 15 }
>> 
>> -- Explanation of the code --
>> This code handles the initial client-server handshake that starts a
> JERI
>> connection. In line 1, the client sends its 8-byte greeting to the
>> server. Then in the loop on lines 3-9, it waits for the server's
>> response. If the reader thread gets a satisfactory response from the
>> server, it sets clientConnectionReady=true and calls
>> muxLock.notifyAll(). In all other cases (aborted connection,
> mismatched
>> protocol version, etc) the reader invokes Mux.setDown() which sets
>> muxDown=true and calls muxLock.notifyAll(). In lines 10-14, it throws
> if
>> the handshake was a failure.
>> 
>> In my scenario (which uses simple TCP sockets, nothing fancy), the
>> invoker thread sits on line 5 indefinitely. My problem hard to
>> reproduce, so I haven't found out what the server is doing in this
> case.
>> I hope to figure that out eventually, but presently I'm interested in
>> the "REMIND: timeout?" comment.
>> 
>> -- Timeout solution --
>> It seems obvious to me that there should be a timeout here. There are
>> lots of imaginable cases where the client could get stuck here:
>> server-side deadlock, abrupt server crash, logic error in client Mux
>> code. You'd expect that the server would either respond with its
> 8-byte
>> handshake very quickly or never, so a modest timeout (like 15 or 30
>> seconds) should be good. If that timeout is triggered, I would expect
>> that the code above would call Mux.setDown() and throw an IOException.
>> That exception would either cause a retry or be thrown up to the
> invoker
>> as a RemoteException.
>> 
>> -- Proposed code (untested) --
>> 3 long now = System.currentTimeMillis();
>> 4 long endTime = now + timeoutMillis;
>> 5 while (!muxDown && !clientConnectionReady) {
>> 6 if (now >= endTime) {
>> 7 setDown("timeout waiting for server to respond
>> to handshake", null);
>> 8 } else {
>> 9 try {
>> 10 muxLock.wait(endTime - now);
>> 11 now = System.currentTimeMillis();
>> 12 } catch (InterruptedException e) {
>> 13 setDown("interrupt waiting for connection
>> header", e);
>> 14 }
>> 15 }
>> 16 }
>> 
>> This code assumes a configurable timeoutMillis parameter has been set
>> earlier.
>> 
>> I can't think of any alternative solutions. Putting the timeout in the
>> Reader logic seems higher risk. There's incomplete code in JERI to
>> implement a ping packet (see Mux.asyncSendPing, never used), but that
>> would only be relevant after the initial handshake and wouldn't help
>> here.
>> 
>> Thanks,
>> Chris
>> 

Michael McGrady
Chief Systems Architect
Topia Technology, Inc.
Cel 1.253.720.3365
Work 1.253.572.9712 extension 2037
Skype ID: michael.mcgrady5
mmcgrady@topiatechnology.com




Mime
View raw message