db-derby-dev mailing list archives

From Knut Anders Hatlen <Knut.Hat...@Sun.COM>
Subject Re: Possible CPU loop in testProtocol in cross-release scenario?
Date Sat, 17 Jun 2006 01:33:17 GMT
Knut Anders Hatlen <Knut.Hatlen@Sun.COM> writes:

> Bryan Pendleton <bpendleton@amberpoint.com> writes:
>
>> I was trying to verify that the changes in DERBY-920 hadn't
>> introduced any new compatibility problems (they shouldn't, because
>> we were changing an internal class, but I wanted to make sure).
>>
>> So I was trying to follow some old tips about how to run tests with
>> an old client against a new server, as documented in:
>> http://wiki.apache.org/db-derby/TestingOldClientNewServer
>>
>> However, when I did this, the test "testProtocol" did not terminate.
>> Instead, it entered a hard CPU loop consuming 100% of my machine,
>> until I killed the test process.
>>
>> Can somebody please try these steps (from the wiki page) against
>> the current trunk and see whether or not you get the same CPU loop
>> problem that I got?
>
> Hi Bryan,
>
> I also see this. However, the CPU usage is more like 0% than 100%. I
> think this is caused by the pre-fetching that was added to the network
> client in DERBY-822. If you put this at the end of values1.inc
>
>     readReplyDss
>     readLengthAndCodepoint QRYDTA
>     skipBytes
>
> or simply
>
>     skipDss
>
> testProtocol will terminate successfully. 
>
> This is not a compatibility issue, since the network client (also the
> 10.1 client) knows that a QRYDTA object may or may not arrive. The
> protocol test, on the other hand, is written with a specific version
> of the server in mind, with the expected server response
> hard-coded. Since the 10.1 version of the test doesn't expect QRYDTA
> from an OPNQRY, the actual server response and the expected response
> will be out of sync.
>
> I'm not sure what causes the hang, though. The test that is hanging is
> "Test for too large value for OUTEXP in EXCSQLSTT", and it happens in
> the first skipDss in connect.inc. When this test starts, there is at
> least a left-over QRYDTA from the previous test (and perhaps more
> since that one doesn't read any data sent from the server). But that
> should not be a problem since the endTest command is supposed to close
> the socket and streams and open new ones. We should probably look into
> why this is happening, at least so that we can rule out a problem in
> the network server code.

It seems the hang is in fact caused by a bug in
DDMReader. dssIsChainedWithDiffID and dssIsChainedWithSameID are not
reset in initialize(), so values from a previous session might be
used. In this particular case, the stale values made
DDMReader.getCurrChainState() return an incorrect value, which in turn
made calls to DDMWriter.finalizeChain() a no-op. The protocol test
therefore never sent the DRDA commands it thought it had sent, and
both sides ended up waiting for data from the other.

It is not clear to me whether this bug can actually be triggered
outside the protocol test, since the client driver doesn't use
DDMReader/DDMWriter. The way DRDAConnThread uses those classes should
be safe, as the server always reads a command before responding and
the fields seem to be set to reasonable values at each read.

Anyway, setting those two fields to false in DDMReader.initialize()
made the hang go away, and derbynetclientmats runs cleanly with that
change.
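
To illustrate the failure mode, here is a minimal sketch. It is not the
Derby source: SketchReader, SketchWriter and their methods are
hypothetical stand-ins, and the rule that a "chained" state turns the
flush into a no-op is an assumption made only to mirror the behaviour
described above. It shows how a flag that survives initialize() can
keep the next session from sending anything.

```java
import java.util.ArrayList;
import java.util.List;

public class StaleChainStateSketch {

    /** Hypothetical reader; only the two chaining flags are modelled. */
    static class SketchReader {
        private boolean chainedWithSameId;
        private boolean chainedWithDiffId;
        private final boolean resetOnInit;

        SketchReader(boolean resetOnInit) {
            this.resetOnInit = resetOnInit;
        }

        /** Called at the start of every session. */
        void initialize() {
            if (resetOnInit) {
                // The fix: clear per-session chaining state.
                chainedWithSameId = false;
                chainedWithDiffId = false;
            }
        }

        /** Simulates reading a DSS whose header says more DSSs follow. */
        void readChainedDss() {
            chainedWithDiffId = true;
        }

        boolean isChained() {
            return chainedWithSameId || chainedWithDiffId;
        }
    }

    /** Hypothetical writer; here a "chained" state suppresses the flush. */
    static class SketchWriter {
        private final List<String> pending = new ArrayList<>();
        final List<String> sent = new ArrayList<>();

        void queue(String command) {
            pending.add(command);
        }

        void finalizeChain(boolean readerSaysChained) {
            if (readerSaysChained) {
                return; // assumed no-op path: nothing reaches the wire
            }
            sent.addAll(pending);
            pending.clear();
        }
    }

    /** Runs two sessions on one reader; returns commands sent in the second. */
    static int commandsSentInSecondSession(boolean resetOnInit) {
        SketchReader reader = new SketchReader(resetOnInit);

        // Session 1 ends while the reader still believes it is mid-chain.
        reader.initialize();
        reader.readChainedDss();

        // Session 2 reuses the same reader object.
        reader.initialize();
        SketchWriter writer = new SketchWriter();
        writer.queue("EXCSAT");
        writer.finalizeChain(reader.isChained());
        return writer.sent.size();
    }

    public static void main(String[] args) {
        System.out.println("without reset: " + commandsSentInSecondSession(false));
        System.out.println("with reset:    " + commandsSentInSecondSession(true));
    }
}
```

Without the reset, the second session sends nothing and would then wait
for a reply that never comes; with the reset, the queued command goes out.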

-- 
Knut Anders
