jackrabbit-dev mailing list archives

From "Thomas Mueller (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1605) RepositoryLock does not work on NFS sometimes
Date Fri, 16 May 2008 12:41:56 GMT

    [ https://issues.apache.org/jira/browse/JCR-1605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12597450#action_12597450 ]

Thomas Mueller commented on JCR-1605:
-------------------------------------

> Does this need to be configurable? 

Yes, unless we only want to support the new mechanism.

> Wouldn't it be sufficient to catch the Exception and then fall back to the new approach?

No. The file system may not always throw an exception. The message "No locks available" suggests
that only a limited number of locks exist, and that the exception occurs only once they are all
in use. This would mean that sometimes the new algorithm is used and sometimes the old one. This
wouldn't work correctly, because processes using different mechanisms would not exclude each other.
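
For context, here is a minimal sketch of a single lock attempt with java.nio FileLock. The class
and method names (NativeFileLockProbe, probe) are hypothetical and not taken from Jackrabbit. It
only shows where an IOException such as "No locks available" would surface. Nothing in it
guarantees that every attempt on a given file system fails the same way, which is why a
per-attempt fallback is unreliable.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.channels.FileLock;

/** Hypothetical helper, for illustration only; not Jackrabbit's RepositoryLock. */
final class NativeFileLockProbe {

    /** Tries once to lock the file and reports "locked", "busy" or "error: ...". */
    static String probe(File file) {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            FileLock lock = raf.getChannel().tryLock();
            if (lock == null) {
                return "busy";                       // held by another process
            }
            lock.release();
            return "locked";
        } catch (IOException e) {
            // On a file system without lock support, the failure shows up here.
            return "error: " + e.getMessage();
        }
    }
}
```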

The new mechanism I have in mind is a cooperative algorithm. This algorithm is already in
use in the H2 database. It goes like this:

*  When the lock file does not exist, it is created (using the atomic operation File.createNewFile).
Then the process waits a little (20 ms) and checks the file again. If the file was changed
during this time, the operation is aborted. This protects against a race condition in which one
process deletes the lock file just after another has created it, and a third process creates
the file again. This race cannot occur if there are only two writers.

* If the file can be created, a random number is written to it. Afterwards, a watchdog thread is
started that checks regularly (once every second by default) whether the file was deleted or
modified by another (challenger) thread / process. Whenever that occurs, the file is overwritten
with the old data. The watchdog thread runs with high priority so that a change to the lock file
does not go undetected even if the system is very busy. However, the watchdog thread
uses very few resources (CPU time), because it waits most of the time. Also, as long as nobody
challenges the lock, the watchdog only reads from the hard disk and does not write to it.

* If the lock file exists and was modified within the last 20 ms, the process waits for some time
and checks again (up to 10 times). If the file is still changing, an exception is thrown
("locked"). This is done to eliminate race conditions with many concurrent writers. Afterwards,
the file is overwritten with a new version (challenge). After that, the thread waits for 2 seconds.
If there is a watchdog thread protecting the file, it will overwrite the change and this process
will fail to lock. However, if there is no watchdog thread, the lock file will still be as written
by this thread. In this case, the file is deleted and atomically created again. The watchdog
thread is then started and the file is locked.
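
The three steps above could be sketched in Java roughly as follows. This is a simplified
illustration, not the H2 or Jackrabbit implementation. The class name CooperativeFileLock, its
constructor parameters, the 16-byte token and the choice to return false instead of throwing an
exception are all assumptions. The description corresponds to delays of 20 ms, 1000 ms and 2000 ms.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.security.SecureRandom;
import java.util.Arrays;

/** Hypothetical sketch of the cooperative protocol; names and structure are assumptions. */
final class CooperativeFileLock {
    private final File file;
    private final long settleMs, watchdogMs, challengeMs;
    private volatile byte[] token;      // random content that identifies the owner
    private volatile boolean held;
    private Thread watchdog;

    CooperativeFileLock(File file, long settleMs, long watchdogMs, long challengeMs) {
        this.file = file;
        this.settleMs = settleMs;
        this.watchdogMs = watchdogMs;
        this.challengeMs = challengeMs;
    }

    synchronized boolean tryLock() throws IOException, InterruptedException {
        byte[] mine = new byte[16];
        new SecureRandom().nextBytes(mine);
        if (!file.createNewFile()) {                    // file exists: challenge it
            if (!waitUntilQuiet()) return false;        // too many concurrent writers
            Files.write(file.toPath(), mine);           // the challenge
            Thread.sleep(challengeMs);
            if (!Arrays.equals(mine, readOrNull())) return false; // a watchdog restored it
            file.delete();                              // stale lock: take it over
            if (!file.createNewFile()) return false;
        }
        Files.write(file.toPath(), mine);
        long stamp = file.lastModified();
        Thread.sleep(settleMs);                         // detect the three-process race
        if (file.lastModified() != stamp || !Arrays.equals(mine, readOrNull())) return false;
        token = mine;
        held = true;
        watchdog = new Thread(this::watch, "lock-watchdog");
        watchdog.setDaemon(true);
        watchdog.setPriority(Thread.MAX_PRIORITY);
        watchdog.start();
        return true;
    }

    synchronized void release() throws InterruptedException {
        if (!held) return;
        held = false;
        watchdog.interrupt();
        watchdog.join();                                // no restore can follow the delete
        file.delete();
    }

    /** Waits until the file stops changing, checking at most 10 times. */
    private boolean waitUntilQuiet() throws InterruptedException {
        for (int i = 0; i < 10; i++) {
            long before = file.lastModified();
            Thread.sleep(settleMs);
            if (file.lastModified() == before) return true;
        }
        return false;
    }

    /** Watchdog loop: only reads, unless the file was deleted or modified. */
    private void watch() {
        while (held) {
            try {
                Thread.sleep(watchdogMs);
                if (held && !Arrays.equals(token, readOrNull())) {
                    Files.write(file.toPath(), token);  // restore the old data
                }
            } catch (InterruptedException e) {
                return;
            } catch (IOException e) {
                // transient I/O error: try again on the next tick
            }
        }
    }

    private byte[] readOrNull() {
        try {
            return Files.readAllBytes(file.toPath());
        } catch (IOException e) {
            return null;                                // missing or unreadable
        }
    }
}
```

The challenge delay has to be longer than the watchdog interval, otherwise a challenger could
read back its own data before the owner's watchdog has had a chance to restore the file.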



> RepositoryLock does not work on NFS sometimes
> ---------------------------------------------
>
>                 Key: JCR-1605
>                 URL: https://issues.apache.org/jira/browse/JCR-1605
>             Project: Jackrabbit
>          Issue Type: Bug
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>
> The RepositoryLock mechanism currently used in Jackrabbit uses FileLock. This doesn't
work on some NFS file systems. It looks like only NFS version 4 and newer supports locking.
Older implementations may throw an IOException "No locks available", which means the NFS does
not support byte-range locking.
> I propose to add a second locking mechanism, and add a configuration option to use it.
For example: <FileLocking class="acme" />. This second locking mechanism is a cooperative
locking protocol that uses a background (watchdog) thread and only uses regular file operations.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

