jackrabbit-dev mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-905) Clustering: race condition may cause duplicate entries in search index
Date Tue, 10 Jul 2007 11:57:05 GMT

    [ https://issues.apache.org/jira/browse/JCR-905?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12511401 ]

Marcel Reutegger commented on JCR-905:

This patch adds considerable overhead to the indexing process because, for each added node, the
index first has to check whether the node already exists. In Lucene terms this means that many
index readers and index writers are created and destroyed within a short period of time. The
current code relies on the events passed to the query handler reflecting a correct
state change on the workspace. E.g. if an event says that a node was added, the index assumes
that the node does not yet exist in the index.

I see two ways to fix this issue:

- The query handler does not automatically re-index the workspace, but rather re-plays the
cluster-journal to get a valid index.
- The query handler needs to associate a journal revision with the current index state. When
journal events are processed the query handler will ignore events from the 'past'.

I prefer option 2.
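
To illustrate option 2, here is a minimal sketch of the idea, not Jackrabbit code. The class
RevisionAwareIndex and all of its methods are hypothetical names made up for this example, and a
plain list stands in for the Lucene index. It assumes that the index stores the journal revision
its content reflects and that every journal event carries its own revision.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/**
 * Hypothetical sketch: an index that remembers which journal revision
 * its content reflects and skips events at or below that revision.
 */
class RevisionAwareIndex {

    // Stand-in for the Lucene index: one entry per indexed node UUID.
    private final List<String> entries = new ArrayList<String>();

    // Journal revision that the current index content corresponds to.
    private long indexedRevision;

    RevisionAwareIndex(long indexedRevision) {
        this.indexedRevision = indexedRevision;
    }

    /** Adds a node found while re-indexing the workspace (no journal event involved). */
    void addDuringReindex(String uuid) {
        entries.add(uuid);
    }

    /**
     * Handles a "node added" journal event. Events from the 'past' (revision not
     * newer than the index state) are ignored, so no existence check is needed.
     *
     * @return true if the event was applied, false if it was skipped
     */
    boolean onNodeAdded(long eventRevision, String uuid) {
        if (eventRevision <= indexedRevision) {
            return false; // already reflected in the index
        }
        entries.add(uuid);
        indexedRevision = eventRevision;
        return true;
    }

    List<String> entries() {
        return Collections.unmodifiableList(entries);
    }

    long indexedRevision() {
        return indexedRevision;
    }
}
```

In the scenario from the issue, clusternode 2 would re-index the workspace and tag the index with
the journal revision at that time. When the journal is then replayed, the "node added" event
written by clusternode 1 carries an older or equal revision and is skipped, so the UUID is indexed
only once, without opening an index reader for every added node.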

> Clustering: race condition may cause duplicate entries in search index
> ----------------------------------------------------------------------
>                 Key: JCR-905
>                 URL: https://issues.apache.org/jira/browse/JCR-905
>             Project: Jackrabbit
>          Issue Type: Bug
>          Components: clustering
>    Affects Versions: 1.3
>            Reporter: Martijn Hendriks
>         Attachments: JCR-905.patch, log1.txt, log2.txt
> There seems to be a race condition that may cause duplicate search index entries. It
> is reproducible as follows (Jackrabbit 1.3):
> 1) Start clusternode 1 that just adds a single node of node type clustering:test.
> 2) Shut down clusternode 1.
> 3) Start clusternode 2 with an empty search index.
> 4) Execute the query //element(*, clustering:test).
> 5) Print the result of the query (UUIDs of nodes in the result set).
> When I just run clusternode 2, then there is one node in the result set, as expected.
> However, when I debug clusternode 2 and have a breakpoint (i.e., a pause of a few seconds
> at line 306 of RepositoryImpl.java - just before the clusternode is started), then the result set
> contains two results, both with the same UUID.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
