db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-3493) stress.multi times out waiting on testers with blocked testers waiting on the same statement
Date Fri, 07 Mar 2008 15:21:47 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-3493:
--------------------------------------

    Attachment: d3493-1a.diff

Attaching a patch which I believe solves the hang.

The patch makes ConcurrentCache.create() call ConcurrentHashMap.get() directly instead
of going through ConcurrentCache.getEntry(), which blocks until the identity has been
set. create() then fails immediately if the object already exists in the cache. Since this
introduces yet another difference between find() and create() in findOrCreateObject(), I also
followed Øystein's suggestion from his review of DERBY-2911 and split findOrCreateObject()
into a number of smaller methods, which I think makes the code easier to follow.
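
To illustrate the difference between a waiting lookup and a fail-fast create, here is a
minimal sketch. It is not the code in d3493-1a.diff: the class FailFastCacheSketch, its
Entry type and the method bodies are hypothetical, and the sketch uses
ConcurrentHashMap.putIfAbsent() as a stand-in for the get()-based check described above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class FailFastCacheSketch {

    /** Hypothetical entry type; it only models "identity not yet set". */
    public static final class Entry {
        private final CountDownLatch identitySet = new CountDownLatch(1);
        private volatile Object value;

        public void setIdentity(Object v) {
            value = v;
            identitySet.countDown();
        }

        Object awaitValue() throws InterruptedException {
            identitySet.await(); // blocks until setIdentity() has run
            return value;
        }
    }

    private final ConcurrentHashMap<Object, Entry> map = new ConcurrentHashMap<>();

    /** find-style lookup: waits for an entry that is still being initialized. */
    public Object find(Object key) {
        Entry e = map.get(key);
        if (e == null) {
            return null;
        }
        try {
            return e.awaitValue();
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ie);
        }
    }

    /** create-style insert: never waits; any existing entry is an error. */
    public Entry create(Object key) {
        Entry fresh = new Entry();
        if (map.putIfAbsent(key, fresh) != null) {
            // Assumption: the entry may still be uninitialized, but we fail anyway.
            throw new IllegalStateException("already in cache: " + key);
        }
        return fresh;
    }
}
```

The point of the sketch is only that create() has no code path that waits on another
thread's half-built entry, while find() still does.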

I have started the full regression suite (which seems to run fine) and will also have stress.multi
running in a loop for some time to verify that the hang really has been fixed.

The hang seems to have been caused by the two table descriptor caches in DataDictionaryImpl
(nameTdCache and OIDTdCache) trying to keep each other consistent: when you insert an object
into one of these caches, their setIdentity() methods try to automatically insert it into
the other one as well. What happened was that one thread inserted an object into one of
the caches while another thread inserted an object with the same identity into the other
cache. Each cache then tried to update the same object in the other cache, and the two
threads ended up waiting for each other to finish. Since creating an object that already
exists should fail, there's no reason to wait for a partially initialized object to become
fully initialized before failing. By failing as soon as such a situation is detected, the
two threads don't wait for each other to finish, and the deadlock is avoided.
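
The interleaving described above can be reproduced in a small model. This is only a sketch
under simplifying assumptions: MirroredCachesSketch, MirrorCache and runRace() are
hypothetical names, the two caches are plain maps rather than Derby's nameTdCache and
OIDTdCache, and a latch forces both threads to reserve their own entry before either one
mirrors it into the other cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public class MirroredCachesSketch {

    /** Hypothetical cache that mirrors every insert into a partner cache. */
    static final class MirrorCache {
        // Value is FALSE while the entry is being initialized, TRUE once it is done.
        final ConcurrentHashMap<String, Boolean> entries = new ConcurrentHashMap<>();
        MirrorCache other;

        boolean create(String id, CountDownLatch rendezvous) {
            if (entries.putIfAbsent(id, Boolean.FALSE) != null) {
                // Fail fast. A variant that waited here for the entry to become
                // initialized would leave both threads waiting on each other.
                return false;
            }
            if (rendezvous != null) {
                rendezvous.countDown();
                try {
                    rendezvous.await(); // both threads now hold an uninitialized entry
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                }
                other.create(id, null); // mirror insert; no further mirroring
            }
            entries.put(id, Boolean.TRUE); // initialization finished
            return true;
        }
    }

    /** Returns true if both inserting threads finish within the timeout. */
    public static boolean runRace() {
        MirrorCache byName = new MirrorCache();
        MirrorCache byId = new MirrorCache();
        byName.other = byId;
        byId.other = byName;
        CountDownLatch rendezvous = new CountDownLatch(2);
        Thread t1 = new Thread(() -> byName.create("T1", rendezvous));
        Thread t2 = new Thread(() -> byId.create("T1", rendezvous));
        t1.start();
        t2.start();
        try {
            t1.join(5000);
            t2.join(5000);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("both threads finished: " + runRace());
    }
}
```

In this model each thread's mirror insert finds the other thread's uninitialized entry
and returns at once, so neither thread ever waits on the other.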

> stress.multi times out waiting on testers with blocked testers waiting on the same statement
> --------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3493
>                 URL: https://issues.apache.org/jira/browse/DERBY-3493
>             Project: Derby
>          Issue Type: Bug
>          Components: Regression Test Failure, SQL, Test
>    Affects Versions: 10.4.0.0
>         Environment: IBM 1.5 Linux
>            Reporter: Kathey Marsden
>            Assignee: Knut Anders Hatlen
>         Attachments: d3493-1a.diff, threaddump-1204806990660.tdump
>
>
> The diff is:
> 7 del
> < ...running last checks via final.sql
> 7 add
>  > ...timed out trying to kill all testers,
>  >    skipping last scripts (if any).  NOTE: the
>  >    likely cause of the problem killing testers is
>  >    probably not enough VM memory OR test cases that
>  >    run for very long periods of time (so testers do not
>  >    have a chance to notice stop() requests
> Test Failed.
> The testers that are stuck are stuck on the same statement e.g.
> -- 
> update main2 set y = 'zzz' where x = 5;
> ERROR 08000: Connection closed by unknown interrupt.
> ERROR XJ001: Java exception: ': java.lang.InterruptedException'.
> The interrupt exception shows:
> java.lang.InterruptedException
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Object.wait(Object.java:199)
>         at org.apache.derby.impl.sql.GenericStatement.prepMinion(GenericStatement.java:195)
>         at org.apache.derby.impl.sql.GenericStatement.prepare(GenericStatement.java:88)
>         at org.apache.derby.impl.sql.conn.GenericLanguageConnectionContext.prepareInternalStatement(GenericLanguageConnectionContext.java:768)
>         at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:606)
>         at org.apache.derby.impl.jdbc.EmbedStatement.execute(EmbedStatement.java:555)
>         at org.apache.derby.impl.tools.ij.ij.executeImmediate(ij.java:329)
>         at org.apache.derby.impl.tools.ij.utilMain.doCatch(utilMain.java:508)
>         at org.apache.derby.impl.tools.ij.utilMain.runScriptGuts(utilMain.java:350)
> The code at line 195 of GenericStatement shows:
>           ....
>                 try {
>                     preparedStmt.wait();
>                 } catch (InterruptedException ie) {
>                     throw StandardException.interrupt(ie);
>                 }
> My first guess is that this is some type of Statement cache
> concurrency bug, but perhaps I am reading it wrong.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

