db-ojb-dev mailing list archives

From Armin Waibel <arm...@apache.org>
Subject Re: Lock Failure on Postgresql
Date Tue, 22 Jun 2004 16:31:47 GMT
Hi Martin, Robert,

the LockingMultithreadeTest was implemented to check locking in a 
multithreaded environment. I know it's risky to add such tests 
(they depend on the underlying hardware), but I don't want to keep these 
tests separate, because otherwise they will be ignored (by developers and 
users) ;-)

As a side note, this test runs on an AMD Athlon 1200 with 512MB RAM and 
sapDB/maxDB without problems, so it seems cygwin or PostgreSQL has 
problems in multithreaded environments ;-)

But how should we solve the LockingMultithreadeTest problem?
Change the test variables, remove the test from the test suite, ...?
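
One possible direction for "changing the test variables", only as a rough sketch: bound the wait by elapsed time rather than by a fixed number of attempts, so the limit depends less on how fast the machine gets through its retries. The code below is not the OJB test and does not use the OJB lock manager; it swaps in java.util.concurrent's ReentrantLock as a stand-in for the contended object, and the class name, method name and timeout value are made up for illustration.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

public class DeadlineRetrySketch {
    // Stand-in for the contended object lock (assumption: the real test
    // acquires it through tx.lock(...), not through a ReentrantLock).
    private static final ReentrantLock LOCK = new ReentrantLock();

    /** Starts the given number of threads and returns how many got the lock in time. */
    public static int runContention(int threads, long timeoutMillis) throws InterruptedException {
        AtomicInteger succeeded = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    // Wait up to a wall-clock limit instead of counting attempts.
                    if (LOCK.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                        try {
                            Thread.sleep(10); // simulated work while holding the lock
                            succeeded.incrementAndGet();
                        } finally {
                            LOCK.unlock();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        // Same order as described for the test: create all, start all, join all.
        for (Thread t : workers) t.start();
        for (Thread t : workers) t.join();
        return succeeded.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runContention(16, 5000) + " of 16 threads got the lock");
    }
}
```

A time limit is still a constant that someone has to pick, so this only moves the hardware dependency rather than removing it.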

By the way, it seems that we have some strange odmg-junit test failures. 
If I run the odmg-tests separately, all tests pass. I get two kinds of 
error messages:
1. LockNotGrantedException
2. KeyContraintViolation

The first one could be the result of a dirty test case, e.g. one that 
begins a tx and locks objects, but never commits or aborts the tx. Then 
the objects stay locked until the locks time out.
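
To illustrate what a "clean" test case would look like, here is a minimal sketch. It does not use the real ODMG classes: FakeTx is a hypothetical, single-threaded stand-in whose method names merely mirror begin/lock/commit/abort, and the other names are invented too. The point is only the try/finally shape, which ends the tx even when the test body fails.

```java
import java.util.HashSet;
import java.util.Set;

public class DirtyTestSketch {
    /** Hypothetical stand-in for a transaction that holds object locks until it ends. */
    static class FakeTx {
        static final Set<Object> LOCKED = new HashSet<>();
        private final Set<Object> mine = new HashSet<>();
        private boolean open;

        void begin() { open = true; }
        boolean isOpen() { return open; }
        void lock(Object o) {
            if (!LOCKED.add(o)) throw new IllegalStateException("lock not granted");
            mine.add(o);
        }
        void commit() { release(); }
        void abort() { release(); }
        private void release() { LOCKED.removeAll(mine); mine.clear(); open = false; }
    }

    /** Runs a test body and guarantees the transaction ends, even on failure. */
    static void runInTx(FakeTx tx, Object obj, Runnable body) {
        tx.begin();
        try {
            tx.lock(obj);
            body.run();
            tx.commit();
        } finally {
            if (tx.isOpen()) tx.abort(); // release locks if the body threw
        }
    }

    /** Returns the number of locks still held after a failing test body. */
    public static int leakedLocksAfterFailure() {
        Object obj = new Object();
        try {
            runInTx(new FakeTx(), obj, () -> { throw new RuntimeException("test failed"); });
        } catch (RuntimeException expected) {
            // the failure itself is expected; only leftover locks matter here
        }
        return FakeTx.LOCKED.size();
    }
}
```

Without the finally block, the lock taken by the failing body would stay in place and a later test touching the same object could fail with a lock error of its own.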

The second one is more serious. As listed in the release notes, the 
odmg-api has problems with m:n relations. I assume the 
KeyContraintViolation errors are a side effect of that problem, occurring 
when objects with m:n relations are involved in the test cases. But I'm 
not sure.
I am doing a little refactoring of the odmg-tests to keep them more 
separate from the PB-tests. Maybe this will help to find the real problem.

regards,
Armin

Martin Kalén wrote:

> 
> Robert S. Sfeir wrote:
> 
>> Martin, I can reproduce this on my machine consistently with Postgresql
>> 7.4.3.  I'm running OS X on a 1.5 Ghz powerbook with 1 gig of RAM.  When
>> running the unit tests that's the only thing running on my machine.
> 
> 
> I now tried this setup:
> Machine: Win XP Pro SP1, AMD Athlon XP 2700+, 1GB DDR266 RAM (~650MB used)
> RDBMS: PostgreSQL 7.4.1-3 (cygwin)
> JDBC: PostgreSQL Native Driver 7.4.2 JDBC2 with SSL (build 213)
> 
> Setting threads=16 on line 60 in LockingMultithreadeTest gives this (3 runs):
> 
> 1) [junit] Tests run: 196, Failures: 0, Errors: 10, Time elapsed: 15,937 sec
> 2) [junit] Tests run: 196, Failures: 0, Errors: 11, Time elapsed: 15,703 sec
> 3) [junit] Tests run: 196, Failures: 0, Errors: 11, Time elapsed: 15,718 sec
> 
> Setting threads=5 gives:
> 
> 1) [junit] Tests run: 196, Failures: 0, Errors: 0, Time elapsed: 15,515 sec
> 2) [junit] Tests run: 196, Failures: 0, Errors: 0, Time elapsed: 15,375 sec
> 3) [junit] Tests run: 196, Failures: 0, Errors: 0, Time elapsed: 15,516 sec
> 
> 
> 
> The reason it seems reproducible on your machine is, I believe, only 
> that your JVM is not able to get through all the synchronization 
> overhead within 100 tries of "try to acquire lock, then counter++ and 
> sleep(10ms)".
> 
> 
>> I've just downloaded the demo of jprofiler, which works on the Mac, 
>> and will
>> do automatic deadlock detection.
> 
> 
> Unfortunately this would be more of a livelock: all the threads are 
> alive, otherwise you would not see those "thread [x] waited [>=70% 
> threshold of max] times, max is 100" messages.
> 
> But I think of it as a race condition and not an X-lock; all the threads 
> are racing for the lock at full speed without synchronization on the 
> test-case condition (lock acquired within 100 tries).
> 
>> I have a funny feeling about this issue, and though it might well be the
>> test case, I want to find out for sure.  My hunch is there is a thread
>> locking issue, however subtle.
> 
> 
> The locking issue is the test case, I'm pretty sure. Look at line 148:
> private static final int maxAttemps = 100;
> 
> And then the semantics for grabbing a lock in a thread:
>             try
>             {
>                 tx.lock(obj, Transaction.WRITE);
>                 updateName(obj);
>                 updateName(obj.getReference());
>             }
>             catch (LockNotGrantedException e)
>             {
>                 if (counter < maxAttemps)
>                 {
>                     counter++;
>                     if (counter > nearMax)
>                         LoggerFactory.getDefaultLogger().warn("### thread " + threadNumber
>                                 + " waits " + counter + " times. Maximal attempts are " + maxAttemps);
>                     Thread.sleep(10);
>                     updateObject(tx, obj);
> 
> 
> JUnitExtensions.runTestCaseRunnables will first create all threads in 
> strict order,
> then start all threads in strict order, then join all threads in strict 
> order.
> 
> This means threads with lower IDs have a higher chance of getting the 
> lock, but there is nothing to synchronize/determine when other threads 
> have waited "too long".
> 
> Thread 1 will most certainly grab the lock and thread 2 will first fail 
> (LockNotGrantedException). The counter in thread 2 will be increased and 
> it will sleep up to 10ms and then do a recursive call and try again.
> 
> All this happens while other threads are being started and are trying 
> to grab the lock.
> 
> 
> What is the rationale for saying "if failing more than 100 times it 
> failed too many times"?
> I guess 100 represents "many" in some sense, but again, it depends on 
> hardware and current machine/RDBMS load. I've learnt to be suspicious of 
> constants other than 0, 1 or infinity; Murphy's law will always hit you 
> in those places.
> 
> 
> Of course I might be completely wrong - this is a 100% deterministic 
> feature of my postings. ;-)
> 
> It will be interesting to see what jprofiler gives!
> 
> Regards,
>  Martin
> 


