db-derby-dev mailing list archives

From derby-...@db.apache.org
Subject [jira] Created: (DERBY-35) DRDA Chaining in Network Server is incorrect
Date Fri, 08 Oct 2004 20:56:51 GMT

  A new issue has been created in JIRA.

Here is an overview of the issue:
        Key: DERBY-35
    Summary: DRDA Chaining in Network Server is incorrect
       Type: Bug

     Status: Open
   Priority: Major

    Project: Derby
             Network Server

   Assignee: A B
   Reporter: A B

    Created: Fri, 8 Oct 2004 1:54 PM
    Updated: Fri, 8 Oct 2004 1:54 PM
Environment: Network drivers/clients other than the IBM DB2 JDBC Universal driver, when run
against Network Server.

I have come across several instances where Network Server can break DRDA chaining protocol
and can thus cause (typically intermittent) connection/communication failures for non-IBM-JDBC-Universal
clients (in particular, the problems have been seen with DB2's CLI client and with the .NET
ODBC provider).

The problems don't appear to surface when using the IBM DB2 JDBC Universal driver (i.e. the
Java driver that is most typically used with Network Server)--I don't know the specifics of
why not, but it seems to be the case that the universal driver isn't as strict about enforcing
DRDA chaining protocol as other clients.

[ NOTE: "DSS" here is a DRDA term.  It stands for "Data Stream Structure" and is, in layman's
terms, a structured message that is passed between the client and the server. ]

Some background:

I -- The DDMReader recognizes chaining on requests from the client through use of the reader.isChainedWith<Same/Diff>ID()
method, which indicates whether or not the current DSS being read is chained to the _FOLLOWING_
DSS (the one to be read next).

II -- The DDMWriter enforces chaining on replies through use of a "chain bit" in the header
of the reply DSS.  If two replies AR and BR are chained, then the reply header of AR has to
indicate whether it is chained to BR, and if so, it has to indicate whether BR will have the
SAME correlation id or a DIFFERENT correlation id.  The chain bit must be set according to
the chaining of the request DSSes (as determined from the reader.isChainedWith<...>ID()
methods) to which we're responding.  DDMWriter currently sets the chaining bit based on a
"reuseCorrId" flag that it receives from DRDAConnThread at the time of the write.  That flag
indicates whether or not the current DSS (the one being written) should have the same
correlation id as the _PRECEDING_ DSS (the one we most recently wrote).
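
To make the background above concrete, here is a minimal sketch of how the chain bits of a reply could be derived from the chaining of the request it answers.  This is NOT the Network Server code: the class and method names are hypothetical, and the bit values (0x40 for "chained", 0x50 for "chained, next DSS has the same correlation id", DSS type 2 for a reply) are my assumptions about the DSS header format byte and should be checked against the DRDA specification.

```java
// Illustrative sketch only; names and constants are assumptions, not Derby's actual code.
final class ReplyChaining {
    // Assumed chaining flags in the DSS header format byte.
    static final int CHAIN_NONE = 0x00;     // not chained to a following DSS
    static final int CHAIN_DIFF_ID = 0x40;  // chained; the next DSS has a DIFFERENT correlation id
    static final int CHAIN_SAME_ID = 0x50;  // chained; the next DSS has the SAME correlation id

    /**
     * Chain bits for a reply DSS, mirroring how the corresponding request
     * was chained to the FOLLOWING request (the two booleans stand in for
     * the results of the reader.isChainedWith<Same/Diff>ID() calls).
     */
    static int chainBitsFor(boolean requestChainedSameId, boolean requestChainedDiffId) {
        if (requestChainedSameId) {
            return CHAIN_SAME_ID;
        }
        if (requestChainedDiffId) {
            return CHAIN_DIFF_ID;
        }
        return CHAIN_NONE;
    }

    /** Replaces the chaining flags in a format byte and leaves all other bits untouched. */
    static int withChainBits(int formatByte, int chainBits) {
        return (formatByte & ~CHAIN_SAME_ID) | chainBits;
    }
}
```

Under these assumptions, a request chain A (chained, same id) -> B (chained, different id) -> C (not chained) would call for reply format bytes 0x52, 0x42 and 0x02.  The point is that the decision looks FORWARD to the next DSS, whereas "reuseCorrId" looks BACK at the preceding one.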

That said, the intermittent connection/communication failures that are showing up with non-IBM-Universal
drivers are caused by two factors:

1) There are several places in the DRDAConnThread code where the "reuseCorrId" flag that is
passed to DDMWriter is incorrect (it doesn't take chaining of the requests into account).
This leads to incorrect chaining of the reply DSSes, which can then lead to problems for
the client when the client tries to process the reply (the client expects the replies to be
chained in a specific way, and if Network Server doesn't do it, the client can choke).

2) Currently, DDMWriter doesn't set the chaining bits for a reply DSS until the NEXT reply
DSS has begun (see "createDss<...>" methods in the DDMWriter class, for example).  At
the same time, there are a handful of calls to "send()" in the DRDAConnThread class, and those
calls tell DDMWriter to flush everything it has written to the client.  This is a problem:
if, for example, DDMWriter has written some reply DSS AR, the chaining bits for AR won't get
set until the next DSS is created--so if we up and call "send()", we'll end up sending AR
across the wire with its chaining bits UNSET.  This isn't a problem if AR is NOT supposed
to be chained to anything after it--because the default chaining is "none".  However, if DRDAConnThread
calls "send()" in the middle of a chain, the last reply written is going to have incorrect
chaining info, and that can cause problems on the client.
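
The interaction described in 2) can be reproduced with a toy writer.  The sketch below is a simplified model, not DDMWriter itself: the class, its methods and the 6-byte header layout (2-byte length, 0xD0, format byte, 2-byte correlation id) are assumptions made for illustration.  It contrasts the deferred style (the chain bits of a DSS are patched in when the NEXT DSS begins) with one conceivable alternative in which the bits are written when the DSS itself is created.

```java
// Toy model only; a hypothetical class, not Derby's DDMWriter.
final class DeferredChainWriter {
    private final byte[] buf = new byte[64];
    private int pos = 0;
    private int lastFormatPos = -1; // offset of the format byte of the most recent unsent DSS
    final java.io.ByteArrayOutputStream wire = new java.io.ByteArrayOutputStream();

    /** Deferred style: the PREVIOUS DSS only gets its chain bits once the next DSS starts. */
    void beginDssDeferred(int corrId, int chainBitsForPrevious) {
        if (lastFormatPos >= 0) {
            buf[lastFormatPos] |= (byte) chainBitsForPrevious;
        }
        writeHeader(corrId, 0x00);
    }

    /** Eager style: the chain bits are known and written when the DSS is created. */
    void beginDssEager(int corrId, int chainBits) {
        writeHeader(corrId, chainBits);
    }

    private void writeHeader(int corrId, int chainBits) {
        buf[pos++] = 0;
        buf[pos++] = 6;                         // length: header only, no payload in this toy
        buf[pos++] = (byte) 0xD0;               // assumed DSS magic byte
        lastFormatPos = pos;
        buf[pos++] = (byte) (0x02 | chainBits); // assumed type 2 = reply DSS
        buf[pos++] = (byte) (corrId >> 8);
        buf[pos++] = (byte) corrId;
    }

    /** Flushes everything written so far; nothing already flushed can be patched afterwards. */
    void send() {
        wire.write(buf, 0, pos);
        pos = 0;
        lastFormatPos = -1;
    }
}
```

With the deferred style, writing AR, calling send(), and then starting BR leaves AR's format byte on the wire as 0x02 (unchained), because there is nothing left in the buffer to patch.  Without the intermediate send(), or with the eager style, AR goes out with its chain bits set.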

Whether or not the client actually chokes on the incorrect chaining bits is intermittent:
the reason is that this all depends on how the network packets are buffered and on the relative
speed of the CPUs of the client and server.  That said, the problem is typically more reproducible
across machines (i.e. if the client and server are two different machines).
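
For illustration, this is roughly the kind of consistency check a strict client could apply to a sequence of reply headers.  It is a hypothetical sketch, not code from the CLI client or the .NET provider, and it uses the same assumed flag values as the DRDA format byte (0x40 = chained, 0x50 = chained with same correlation id).

```java
// Hypothetical client-side check; not taken from any real driver.
final class ChainCheck {
    static final int CHAIN_MASK = 0x50;
    static final int CHAIN_NONE = 0x00;
    static final int CHAIN_DIFF_ID = 0x40;
    static final int CHAIN_SAME_ID = 0x50;

    /**
     * headers[i] = { formatByte, correlationId } for the i-th reply DSS.
     * Returns the index of the first DSS whose chain bits disagree with
     * what actually follows it, or -1 if the whole sequence is consistent.
     */
    static int firstBadIndex(int[][] headers) {
        for (int i = 0; i < headers.length; i++) {
            int chain = headers[i][0] & CHAIN_MASK;
            boolean last = (i == headers.length - 1);
            if (last) {
                // The final DSS must not claim to be chained to anything.
                if (chain != CHAIN_NONE) {
                    return i;
                }
            } else {
                boolean sameId = headers[i][1] == headers[i + 1][1];
                if (chain == CHAIN_SAME_ID) {
                    if (!sameId) return i;
                } else if (chain == CHAIN_DIFF_ID) {
                    if (sameId) return i;
                } else {
                    // Chain bits unset (or malformed) although another DSS follows.
                    return i;
                }
            }
        }
        return -1;
    }
}
```

A sequence such as {0x52, 1}, {0x42, 1}, {0x02, 2} passes this check, while {0x02, 1}, {0x02, 1} (the shape produced by a "send()" in the middle of a chain) fails at index 0.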
