Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Derby Development" <derby-dev@db.apache.org>
Received-SPF: pass (hermes.apache.org: domain of jonas.s.karlsson@gmail.com
 designates 64.233.170.200 as permitted sender)
Message-ID: <e820052e04101111081f1fff20@mail.gmail.com>
Date: Mon, 11 Oct 2004 11:08:14 -0700
From: Jonas S Karlsson <jonas.s.karlsson@gmail.com>
Reply-To: jsk@lysator.liu.se
To: Derby Development <derby-dev@db.apache.org>
Subject: Re: SPAM=**** Help detecting client disconnects for network server
In-Reply-To: <41672595.4040909@Sourcery.Org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
References: <41672595.4040909@Sourcery.Org>

My mail seems to have some delivery problems, so I'm using this
account. I wrote the following on Saturday evening; apologies for any
duplicates.

-----------------------------------------------------------------

I'm no expert on DRDA, but I've played with and written some network
applications, so I have some ideas...

Kathey Marsden wrote:
> I am working on an issue with Network Server where the server does not
> clean up connection threads properly for disconnected clients.  I get an
> IOException if the client program is killed or aborted with <ctrl> c
> but if the network cable is just unplugged or the machine turned off, we
> don't detect the disconnect properly.

There is no disconnection. The notion of a "network connection" is
not a very good analogy, because there isn't really any physical
connection. Closing a TCP connection is an active event initiated by
a program or the OS, and it requires communication. If nobody is
there to communicate, or the communication channel is broken, then
nobody is there to disconnect. The OS closes connections/files when a
program terminates; that's why you see the IOException in that case.
There is generally no way to verify that the other end is there
unless it tells you that it is. TCP keep-alive can eventually drop an
idle socket whose peer no longer answers, but this could take as long
as "hours" (OS/implementation dependent). It can be turned on/off
using Socket.setKeepAlive(), but you can't set the time there...
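
For reference, a minimal sketch of turning keep-alive on for a client
socket. The class and method names are made up for illustration and
are not part of the Network Server; only Socket.setKeepAlive() is the
real API call.

```java
import java.net.Socket;
import java.net.SocketException;

// Hypothetical helper, not part of Derby's Network Server.
final class KeepAliveSetup {
    private KeepAliveSetup() {}

    /**
     * Asks the OS to probe an otherwise idle connection (SO_KEEPALIVE).
     * The probe interval is chosen by the OS and may be very long, so
     * this alone will not release locks quickly.
     */
    static void enableKeepAlive(Socket clientSocket) throws SocketException {
        clientSocket.setKeepAlive(true);
    }
}
```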

> So, if I do this on the client machine
>
> create table t (i int);
> autocommit off;
> lock table t in exclusive mode;
> 
> Then disconnect the cable, the server will continue to block on the
> inputStream.read() and the connection will continue to hold the lock so
> no one else can select from the table.

...and if you connect the cable again, you're likely to be able to
communicate again (assuming a static IP address, etc.). This is a good
thing, because there may be seconds when packets don't reach their
target and are being resent or routed along different paths, and the
"connection" is then retained unless it times out. This is the normal
operation of the internet/networks: it hiccups, a router is reset, etc.

> Reading a little about this it seems the only way to really detect if
> the socket is active is to attempt a write. This of course is not an
> option since it will mess up the drda protocol.  Other things I have
> looked at are Socket.setSoTimeout, which seems no good because the
> client might in fact just be sitting there doing nothing for a long time.

A client that does nothing for a long time while inside a transaction
with the server seems to be exactly the case you want to catch, and I
feel that you just provided a solution yourself. While inside a
transaction (i.e., there are locked objects in the transaction), set
Socket.setSoTimeout to an acceptable value; when a timeout is
received, roll back the transaction. Typically, one would want the
timeout to be configurable on the server, or settable by the client, I
guess. After a commit/rollback, I'd expect no objects to be locked, so
the timeout can be set really long (0 means infinite).
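
A rough sketch of the timeout idea, under the assumption that the
connection thread knows whether a transaction is open. The names
TransactionIdleTimeout, apply, readOrRollback and TIMED_OUT are
invented for this example; the rollback action stands in for whatever
the server actually does to abort the transaction.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;
import java.net.SocketException;
import java.net.SocketTimeoutException;

// Hypothetical sketch; class and method names are illustrative, not Derby's.
final class TransactionIdleTimeout {
    /** Returned by readOrRollback when the read timed out. */
    static final int TIMED_OUT = -2;

    private TransactionIdleTimeout() {}

    /** Bounded read timeout while locks may be held; 0 (wait forever) otherwise. */
    static void apply(Socket socket, boolean inTransaction, int txnTimeoutMillis)
            throws SocketException {
        socket.setSoTimeout(inTransaction ? txnTimeoutMillis : 0);
    }

    /**
     * Reads the next byte from the client. If the read times out, runs the
     * rollback action and returns TIMED_OUT; otherwise behaves like read(),
     * including returning -1 at end of stream.
     */
    static int readOrRollback(Socket socket, Runnable rollback) throws IOException {
        InputStream in = socket.getInputStream();
        try {
            return in.read();
        } catch (SocketTimeoutException e) {
            rollback.run(); // release the locks held by the silent client
            return TIMED_OUT;
        }
    }
}
```

The server would call apply(socket, true, ...) when the first lock is
taken and apply(socket, false, ...) after commit/rollback.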

Another solution would be to require the client to "ping" the server
at regular intervals, kind of like a remote "watchdog" process. On any
communication from a client, the server marks that client as alive and
records the time it last received something. A server watchdog thread
can then, at a regular interval (x times longer than the client's ping
interval), check that the client is marked alive or that the time is
acceptable, and if not, "kill" the client connection; otherwise it
resets the mark, to be set again by the client's next ping. The
"ping" could be any "cheap" request that the server can easily respond
to, like a "status" function, "get variable value" or such; I don't
know DRDA well enough to suggest a fitting command. Sometimes a "good
command" is to send an illegal command/string, which will produce an
error from the server; that too communicates that the client is
alive, and should be "cheap" for the server. Logging excluded.
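
The server-side bookkeeping for such a watchdog might look roughly
like this. It is only a sketch: ClientWatchdog, touch and sweep are
made-up names, clients are identified by a plain string, and the time
is passed in explicitly so the logic can be exercised without a real
clock. It uses the "last time heard from" variant rather than a flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the server-side bookkeeping; not Derby code.
final class ClientWatchdog {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long maxSilenceMillis;

    ClientWatchdog(long maxSilenceMillis) {
        this.maxSilenceMillis = maxSilenceMillis;
    }

    /** Called by a connection thread whenever anything arrives from the client. */
    void touch(String clientId, long nowMillis) {
        lastSeenMillis.put(clientId, nowMillis);
    }

    /**
     * Called periodically by the watchdog thread. Forgets every client that
     * has been silent for longer than the limit and returns their ids, so
     * the caller can roll back their transactions and close their sockets.
     */
    List<String> sweep(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeenMillis.entrySet()) {
            // remove(key, value) only succeeds if no newer touch() raced in
            if (nowMillis - e.getValue() > maxSilenceMillis
                    && lastSeenMillis.remove(e.getKey(), e.getValue())) {
                expired.add(e.getKey());
            }
        }
        return expired;
    }
}
```

A watchdog thread could call sweep(System.currentTimeMillis()) at a
fixed interval and "kill" whatever connections it returns.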

Hope the ideas are of some use. I believe that one could do things at
lower levels, outside the TCP/IP connection, but that would require
much more work and be roughly equivalent.

/Jonas (jsk@yesco.org)