jackrabbit-dev mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: Lucene and deleted nodes
Date Thu, 16 Jun 2005 08:41:37 GMT
Hi Fabrizio,

Fabrizio Giustina wrote:
> While accessing the results of a jcr search I often experience
> problems probably due to the presence, in the lucene index, of deleted
> nodes.

Are you able to reproduce this error in a single-session environment? 
That is, with no other session modifying the workspace at the same time. 
If you are, then this is clearly a bug, as you described. Could you then 
please open a Jira issue? Thanks.

It is also possible that nodes get deleted after the search has 
finished calculating the uuids of the result nodes.
The current implementation of the query result calculates the actual 
order of the result nodes when hasNext() is called for the first time. 
That means one of the result nodes might get deleted between the 
execution of the query and that first call to hasNext().

A possible solution is to fetch all the nodes in the query result 
right from the beginning, when it is created by the query.
But I have to think more thoroughly about whether this really solves 
the issue.
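
To make the timing difference concrete, here is a minimal sketch in 
plain Java. It is not Jackrabbit code: QueryResultSketch, the map that 
stands in for the workspace, and the IllegalStateException that stands 
in for ItemNotFoundException are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical model of a query result; not the Jackrabbit implementation.
class QueryResultSketch {
    private final Map<String, String> workspace; // stand-in: uuid -> node
    private final List<String> uuids;            // what the index returned
    private List<String> nodes;                  // null until resolved

    QueryResultSketch(Map<String, String> workspace, List<String> uuids, boolean eager) {
        this.workspace = workspace;
        this.uuids = new ArrayList<>(uuids);
        if (eager) {
            resolve(); // fetch every node while the result is being created
        }
    }

    private void resolve() {
        List<String> resolved = new ArrayList<>();
        for (String uuid : uuids) {
            String node = workspace.get(uuid);
            if (node == null) {
                // Plays the role of ItemNotFoundException in this sketch.
                throw new IllegalStateException("item not found: " + uuid);
            }
            resolved.add(node);
        }
        nodes = resolved;
    }

    List<String> getNodes() {
        if (nodes == null) {
            resolve(); // lazy path: first access pays for the lookup
        }
        return nodes;
    }
}
```

If a uuid is removed from the map after both results are created, the 
eager result still returns its nodes while the lazy one throws on first 
access. In the real repository an eagerly fetched node could still 
become invalid later, so this only narrows the window.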

> This is actually a big problem since a missing reference can block
> from accessing all the search results (the error is thrown in
> NodeIterator.hasNext, while iterating on the search result).
> The following is the stacktrace for the error:

Do you know if you get the same error when you simply execute the query 
again? That would indicate that the index is really out of sync with the 
workspace.

> ERROR  org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl 
> DocOrderNodeIteratorImpl.java(compare:186) 10.06.2005 17:28:54 
> Exception while sorting nodes in document order:
> javax.jcr.ItemNotFoundException: 2537b990-5387-4b5b-a15a-8db53d4353e1
> javax.jcr.ItemNotFoundException: 2537b990-5387-4b5b-a15a-8db53d4353e1
> 	at org.apache.jackrabbit.core.ItemManager.createItemInstance(ItemManager.java:518)
> 	at org.apache.jackrabbit.core.ItemManager.getItem(ItemManager.java:372)
> 	at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl$1.compare(DocOrderNodeIteratorImpl.java:142)
> 	at java.util.Arrays.mergeSort(Arrays.java:1307)
> 	at java.util.Arrays.mergeSort(Arrays.java:1296)
> 	at java.util.Arrays.sort(Arrays.java:1223)
> 	at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl.initOrderedIterator(DocOrderNodeIteratorImpl.java:136)
> 	at org.apache.jackrabbit.core.query.lucene.DocOrderNodeIteratorImpl.hasNext(DocOrderNodeIteratorImpl.java:95)
> 
> 
> while doing:
> NodeIterator nodeIterator = ((javax.jcr.query.QueryResult) result).getNodes();
> while (nodeIterator.hasNext()) {
> ...
> 
> I am trying to understand if the problem is due to a someway corrupted
> lucene index, or if this situation should be handled by jackrabbit (or
> by the user? but it sounds strange since the error is thrown from the
> iterator).

This is because the nodes are resolved lazily from the uuids returned 
by the query result.

> Did anyone see a similar problem? How deleted/missing nodes are
> supposed to be handled in the search index?

Well, the index and the workspace should actually be in sync at all 
times, so this shouldn't happen. If it does, then it's a bug.
But as mentioned above, it is possible that result nodes get deleted in 
the meantime, and because of the lazy loading you might run into this error.
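
One way an iterator could tolerate nodes that vanish in the meantime is 
to skip uuids that no longer resolve. The following is only a sketch in 
plain Java under that assumption: SkippingNodeIterator and the map used 
as a workspace are hypothetical names, not part of Jackrabbit.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

// Hypothetical lazy iterator that drops uuids whose node is gone.
class SkippingNodeIterator implements Iterator<String> {
    private final Map<String, String> workspace; // stand-in: uuid -> node
    private final Iterator<String> uuids;
    private String next; // looked-ahead node, or null if none is buffered

    SkippingNodeIterator(Map<String, String> workspace, List<String> uuids) {
        this.workspace = workspace;
        this.uuids = uuids.iterator();
    }

    @Override
    public boolean hasNext() {
        // Advance until a uuid still resolves; deleted ones are skipped.
        while (next == null && uuids.hasNext()) {
            next = workspace.get(uuids.next());
        }
        return next != null;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        String node = next;
        next = null;
        return node;
    }
}
```

This covers only the lookup step. The exception in the stack trace is 
thrown while sorting in document order, so the comparator would need 
similar care, and silently skipping nodes could also hide an index that 
is genuinely out of sync.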

regards
  marcel
