lucene-dev mailing list archives

From Michael McCandless <>
Subject Re: help on Lock.obtain(lockWaitTimeout)
Date Fri, 22 Sep 2006 10:49:04 GMT
Yonik Seeley wrote:
> On 9/21/06, Michael McCandless <> wrote:
>> Anyway, my first reaction was to change this to use
>> System.currentTimeMillis() to measure elapsed time, but then I
>> remembered that this is a dangerous approach because whenever the clock on the
>> machine is updated (eg by a time-sync NTP client) it would mess up
>> this function, causing it to either take longer than was asked for (if
>> clock is moved backwards) or, to timeout in [much] less time than was
>> asked for (if clock was moved forwards).
> Um, wow... that's thorough design work!

Thanks :) I've hit just one too many bugs due to system time changing!
Time is always a sneaky thing to work with.  Basically you can't
really use system time as a reliable way to measure elapsed time.
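
One way to avoid the wall clock entirely is to approximate elapsed time by
counting fixed-length sleeps between lock attempts. The sketch below is only
an illustration in modern Java; the class CountedTimeout, its obtain method,
and its parameters are hypothetical names, not Lucene's Lock API.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Lucene. */
final class CountedTimeout {
    /**
     * Polls tryObtain until it succeeds or the sleep budget is used up.
     * The wait is bounded by the number of sleeps, never by reading the
     * system clock, so a clock adjustment cannot shorten or extend it.
     */
    static boolean obtain(BooleanSupplier tryObtain, long timeoutMillis, long pollMillis) {
        if (pollMillis <= 0) {
            throw new IllegalArgumentException("pollMillis must be positive");
        }
        long maxSleeps = timeoutMillis / pollMillis;
        long sleeps = 0;
        while (!tryObtain.getAsBoolean()) {
            if (sleeps++ >= maxSleeps) {
                return false; // budget spent: the caller reports the timeout
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

The total wait is only approximate, since each sleep and each attempt can run
long, but it never depends on the system date. On Java 5 and later,
System.nanoTime() is another option, as it is documented as unrelated to
wall-clock time.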

> In this case, I don't think it's something to worry about though.
> NTP corrections are likely to be very small, not on the scale of
> lock-obtain timeouts.
> If one can't obtain a lock, it's due to something else asynchronously
> happening, and that throws a lot bigger time variation into the
> equation anyway.

Yes, I hope so, in a well-behaved server environment that has already
converged its clock and is tracking well to "real time", has the right
command line options to ntp, and doesn't have an admin coming in and
making clock changes.  But on a more "chaotic" user's desktop, where the
user could update the clock at random times themselves, it would be
horrible to let such an event "falsely" throw a "Lock obtain timed out"
error in any desktop deployment of Lucene.

Even with lock-less commits we will still need to obtain the write
lock (eg for the interleaved add/delete case: until we can fix
IndexWriter to handle deletes, the write lock is being acquired fairly
"often").  Each of these obtains is then vulnerable if a [too large]
clock change is made during the call.

Lucene doesn't currently have this issue (it does not rely on
currentTimeMillis to measure elapsed time) so I'd hate to be the one to
introduce it.

Are there any objections to the "acquire a random test lock" approach?

If your locking is mis-configured, you will get an error on
creating the NativeFSLockFactory.  But if it is configured
properly, it will quickly get the lock (and release it) and move on.

Also, there is a single instance of NativeFSLockFactory per [canonical]
lock directory, so it would only be the first time (per JVM instance)
that the NativeFSLockFactory is created for the given directory that
this simple test would be performed.
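
To make the idea concrete, here is a rough sketch of such a check, assuming
java.nio file locks. TestedLockFactory, forDirectory, probe, and the probe
file name are all hypothetical; this is not the actual NativeFSLockFactory
code.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical factory for illustration; not Lucene's implementation. */
final class TestedLockFactory {
    // One factory per canonical lock directory, per JVM.
    private static final Map<String, TestedLockFactory> INSTANCES = new HashMap<>();
    static int probeCount = 0; // exposed only to show how often the test runs

    private final File lockDir;

    private TestedLockFactory(File lockDir) throws IOException {
        this.lockDir = lockDir;
        probe(); // fail at creation time if locking does not work here
    }

    static synchronized TestedLockFactory forDirectory(File dir) throws IOException {
        String key = dir.getCanonicalPath();
        TestedLockFactory factory = INSTANCES.get(key);
        if (factory == null) {
            factory = new TestedLockFactory(new File(key));
            INSTANCES.put(key, factory);
        }
        return factory;
    }

    /** Acquires and releases a lock on a randomly named file. */
    private void probe() throws IOException {
        File probeFile = new File(lockDir, "probe-" + UUID.randomUUID() + ".lock");
        try (RandomAccessFile raf = new RandomAccessFile(probeFile, "rw");
             FileChannel channel = raf.getChannel()) {
            FileLock lock = channel.tryLock();
            if (lock == null) {
                throw new IOException("could not acquire test lock in " + lockDir);
            }
            lock.release();
        } finally {
            probeFile.delete();
        }
        probeCount++;
    }
}
```

Because the probe file has a random name, it should not collide with a real
lock held by another process, and the cost is paid once per directory.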

