curator-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: distributed locking issue
Date Fri, 07 Apr 2017 17:00:36 GMT
A few things:

* What's the purpose of "numLocks"? It's always 1 (as it should be)

* On line 59 it should be: long end = System.nanoTime();

* System.nanoTime() "provides nanosecond precision, but not necessarily nanosecond resolution".
Your lock is immediately released and, thus, you cannot really rely on the clocks being right.
See: http://www.principiaprogramatica.com/?p=16

* I'm not sure this actually tests anything. The safest way to judge whether two threads hold
the lock at the same time is to introduce an AtomicBoolean in your code. E.g., on line 57 add something like:

	if ( !debugIsLocked.compareAndSet(false, true) ) {
		throw new IllegalStateException("another thread holds the lock");
	}

Then, set it to false in your finally block. 
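
As a rough, self-contained sketch of that check (not the code from the gist): the class name LockOverlapCheck and the method countOverlaps are made up for illustration, and a java.util.concurrent ReentrantLock stands in for the Curator lock so the example runs without ZooKeeper. It also counts overlaps instead of throwing, so one run reports every occurrence.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class LockOverlapCheck {

    // Hypothetical helper: starts `threads` workers that each take `lock`
    // `iterations` times, and returns how often a worker found the flag
    // already set, i.e. another thread was inside the critical section.
    static int countOverlaps(Lock lock, int threads, int iterations)
            throws InterruptedException {
        AtomicBoolean debugIsLocked = new AtomicBoolean(false);
        AtomicInteger overlaps = new AtomicInteger();
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    lock.lock(); // stand-in for acquiring the distributed lock
                    boolean marked = false;
                    try {
                        // Succeeds only if no other thread has set the flag.
                        marked = debugIsLocked.compareAndSet(false, true);
                        if (!marked) {
                            overlaps.incrementAndGet();
                        }
                    } finally {
                        // Clear the flag only if this thread set it.
                        if (marked) {
                            debugIsLocked.set(false);
                        }
                        lock.unlock();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return overlaps.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int overlaps = countOverlaps(new ReentrantLock(), 4, 10_000);
        System.out.println("overlapping holders observed: " + overlaps);
    }
}
```

Because the flag is flipped with compareAndSet, the result does not depend on comparing timestamps taken on different threads. To run it against the real lock, the lock()/unlock() calls would be swapped for the acquire and release calls used in the test case.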

-JZ

> On Apr 7, 2017, at 9:09 AM, Amit Dalal <dalal.amit@snapdeal.com> wrote:
> 
> Hi Jordan,
> 
> We have been using Apache Curator for distributed locking in a production environment for
> a couple of years. Recently, we found a case wherein the same lock was acquired by two threads
> at the same time. We tried reproducing the case via a main method and were able to do it.
> 
> Here's the testcase: https://gist.github.com/amdalal/bf993fa9d2e2770663959b0f940bdb8f
> 
> You might find the code familiar as you had answered a question around it on Stack Overflow
> (http://stackoverflow.com/questions/29852353/apachecurator-distributed-locking-performance).
> 
> Attached are the logs from the testcase. We noticed that around 0.4% of the time, the same lock
> was acquired by multiple threads, as there are multiple instances like the ones below in the logs.
> 
> T2:101:ACQUIRED:28518535141438:in 2 ms
> T1:101:ACQUIRED:28518535406490:in 1 ms
> T2:101:RELEASED:28518535421748:in 0 ms
> T1:101:RELEASED:28518535649433:in 0 ms
> 
> T1:101:ACQUIRED:28518703211491:in 1 ms
> T2:101:ACQUIRED:28518703461041:in 0 ms
> T1:101:RELEASED:28518703468657:in 0 ms
> T2:101:RELEASED:28518703690163:in 0 ms
> 
> Any idea what is going wrong? We would appreciate your time.
> 
> Thanks,
> Amit
> <zookeeper-test.log>

