db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-3909) Race condition in NetXAResource.removeXaresFromSameRMchain()
Date Wed, 15 Oct 2008 13:07:44 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-3909:
--------------------------------------

    Affects Version/s: 10.2.2.0

A SailFin user reported this problem against Derby 10.2. See https://sailfin.dev.java.net/issues/show_bug.cgi?id=1218.
He also said that the problem went away if he used 10.4.2.0 instead.

I see something similar with the repro I attached here. On 10.2 and 10.3, the repro ends up
in an infinite loop in initForReuse() after 10-20 seconds. On 10.4 and trunk, the repro just
slows down.

A binary search in the repository revealed that the behaviour changed with this check-in:

------------------------------------------------------------------------
r632334 | kristwaa | 2008-02-29 15:59:32 +0100 (Fri, 29 Feb 2008) | 3 lines

DERBY-3441: Determine and implement a proper procedure for resetting a prepared statement
for reuse in a statement pool.
Patch file: derby-3441-1c-statement_reset.diff

------------------------------------------------------------------------

This check-in appears to change the timing of establishing and reusing XA connections, most
likely because of the changes to client.am.Connection.completeReset(), which seem to change
how the transaction isolation level is reset. There's also a TODO comment in that code which
says we should investigate other ways to do it.

Although I don't see the infinite loop after that commit, the potential for the infinite loop
is still there, and I still see that the vector keeps growing.

> Race condition in NetXAResource.removeXaresFromSameRMchain()
> ------------------------------------------------------------
>
>                 Key: DERBY-3909
>                 URL: https://issues.apache.org/jira/browse/DERBY-3909
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Client
>    Affects Versions: 10.2.2.0, 10.5.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: d3909-remove.diff, d3909.diff, Derby3909.java
>
>
> NetXAResource.removeXaresFromSameRMchain() does the following to remove a
> NetXAResource from what's logically a singly-linked list:
> 1) Mark the NetXAResource to remove with a flag (a field called ignoreMe_)
> 2) Synchronize on an object that protects the linked list
> 3) Follow the chain of next pointers in the linked list and remove the
>    first flagged object
> 4) Release the synchronization lock obtained in (2)
> 5) Clear the flag set in (1)
> Now, say that two threads (T1 and T2) perform step 1 in parallel. T1 is
> granted the synchronization lock in (2), and T2 must wait. T1 traverses the
> linked list, finds the object flagged by T2 and removes it. T1 then releases
> the synchronization lock and clears the flag on the object it had flagged
> itself. This is not the same object that it removed, so when T2 is granted
> the synchronization lock, there is no object flagged for removal. As a
> result, only the object T2 attempted to remove was in fact removed. The
> object that T1 flagged for removal is still in the linked list.
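
For illustration, the interleaving in the quoted description can be replayed
deterministically in a single thread. This is only a sketch, not the Derby
code and not the attached patch: the Node class, the three-element chain and
the method names are all made up for the example, and removeByIdentity() is
just one possible way to avoid the problem (unlink the exact object passed in
instead of searching for a flag).

```java
import java.util.ArrayList;
import java.util.List;

public class FlagRemovalRace {
    // Hypothetical stand-in for NetXAResource: only the flag and the
    // next pointer matter for this illustration.
    static final class Node {
        final String name;
        boolean ignoreMe;
        Node next;
        Node(String name) { this.name = name; }
    }

    static Node head;
    static final Object lock = new Object();

    // Steps 2-4 of the description: under the lock, unlink the FIRST
    // flagged node, regardless of which caller flagged it.
    static void removeFirstFlagged() {
        synchronized (lock) {
            Node prev = null;
            for (Node n = head; n != null; prev = n, n = n.next) {
                if (n.ignoreMe) {
                    if (prev == null) head = n.next; else prev.next = n.next;
                    return;
                }
            }
        }
    }

    // Assumed alternative: unlink exactly the node the caller passes in.
    static void removeByIdentity(Node target) {
        synchronized (lock) {
            Node prev = null;
            for (Node n = head; n != null; prev = n, n = n.next) {
                if (n == target) {
                    if (prev == null) head = n.next; else prev.next = n.next;
                    return;
                }
            }
        }
    }

    // Builds the chain a -> b -> c and returns the nodes in that order.
    static Node[] build() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        a.next = b;
        b.next = c;
        head = a;
        return new Node[] { a, b, c };
    }

    // Names of the nodes still linked into the chain.
    static List<String> names() {
        List<String> out = new ArrayList<>();
        for (Node n = head; n != null; n = n.next) out.add(n.name);
        return out;
    }

    // Replays the problematic schedule: T1 wants to remove c, T2 wants
    // to remove a, and T2's node comes first in the chain.
    static List<String> flawedInterleaving() {
        Node[] n = build();
        Node t1Target = n[2], t2Target = n[0];
        t1Target.ignoreMe = true;   // T1, step 1
        t2Target.ignoreMe = true;   // T2, step 1
        removeFirstFlagged();       // T1, steps 2-4: unlinks T2's node
        t1Target.ignoreMe = false;  // T1, step 5
        removeFirstFlagged();       // T2, steps 2-4: nothing is flagged
        t2Target.ignoreMe = false;  // T2, step 5
        return names();
    }

    // Same two removals, but by identity; the order no longer matters.
    static List<String> identityInterleaving() {
        Node[] n = build();
        removeByIdentity(n[2]);     // T1
        removeByIdentity(n[0]);     // T2
        return names();
    }

    public static void main(String[] args) {
        System.out.println(flawedInterleaving());   // [b, c]
        System.out.println(identityInterleaving()); // [b]
    }
}
```

In the flawed schedule, c stays in the chain even though T1 asked for it to be
removed. With removal by identity, only b is left, which is what both callers
intended.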

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

