jackrabbit-dev mailing list archives

From "Ard Schrijvers (JIRA)" <j...@apache.org>
Subject [jira] Commented: (JCR-1213) UUIDDocId cache does not work properly because of weakReferences in combination with new instance for combined indexreader
Date Wed, 21 Nov 2007 11:34:43 GMT

    [ https://issues.apache.org/jira/browse/JCR-1213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544424 ]

Ard Schrijvers commented on JCR-1213:

"http://jackrabbit.apache.org/doc/arch/operate/index-readers.html"

This is really nice to read! Only now do I understand the deleted BitSet idea, and
that a SharedIndexReader lives for the entire lifetime of a PersistentIndex. Since documents
can only be deleted from 'older' (existing?) indexes, we keep the same SharedIndexReader even
if the index changes, and keep track of the deleted documents in the deleted BitSet. This
should give us a very good opportunity to cache the parent relationships of a node, even when
the underlying Lucene indexes change because a document was deleted. (I am mostly recapitulating
and adding nothing new, certainly nothing you don't already know, but see it as thinking
out loud :-) )
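
To make the thinking out loud a bit more concrete, here is a minimal sketch of the deleted BitSet idea as I understand it. It does not use the real Jackrabbit or Lucene classes: SharedReaderSketch and its fields are hypothetical stand-ins, and it assumes for simplicity that a node's parent lives in the same index.

```java
import java.util.BitSet;

/** Hypothetical stand-in for a reader that lives as long as its index. */
final class SharedReaderSketch {
    // Document number -> parent document number (-1 = no parent).
    // Document numbers never shift, so this cache stays valid after deletions.
    private final int[] parentDocs;
    // Deletions are only recorded here; the reader itself is not reopened.
    private final BitSet deleted = new BitSet();

    SharedReaderSketch(int[] parentDocs) {
        this.parentDocs = parentDocs.clone();
    }

    /** Deleting a document only flips a bit. */
    void delete(int doc) {
        deleted.set(doc);
    }

    boolean isDeleted(int doc) {
        return deleted.get(doc);
    }

    /** Returns the cached parent, or -1 if the document or its parent is gone. */
    int getParent(int doc) {
        if (deleted.get(doc)) {
            return -1;
        }
        int parent = parentDocs[doc];
        return (parent >= 0 && !deleted.get(parent)) ? parent : -1;
    }
}
```

The point of the sketch is only that a deletion touches the BitSet and nothing else, so whatever was cached per document number can be kept.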

At the end of the day I'll try to look at the suggestion at the end of your comment, and
also see if I can get a 'hacky' working version with the readersByBase. A nicer solution can
be made afterwards if it is a success. After reading your documentation I am really confident
we can make a very efficient cache for the docNumbers. I'll investigate that as well... :-)

> UUIDDocId cache does not work properly because of weakReferences in combination with new instance for combined indexreader
> ---------------------------------------------------------------------------------------------------------------------------
>                 Key: JCR-1213
>                 URL: https://issues.apache.org/jira/browse/JCR-1213
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: query
>    Affects Versions: 1.3.3
>            Reporter: Ard Schrijvers
>             Fix For: 1.4
> Queries that use ChildAxisQuery or DescendantSelfAxisQuery call getParent() to check
> whether the parents are correct and whether the result is allowed. getParent() is called
> recursively for every hit and can become very expensive. Hence, in DocId.UUIDDocId, the parents
> are cached.
> Currently, DocId.UUIDDocId's are cached by holding a WeakReference to the CombinedIndexReader,
> but this CombinedIndexReader is recreated all the time, which means a gc() is allowed to
> remove the 'expensive' cache.
> A much better solution is to hold the WeakReference not to the CombinedIndexReader but
> to each index reader segment. This means that in getParent(int n) in SearchIndex, the statement
>     return id.getDocumentNumber(this);
> needs to be replaced by
>     return id.getDocumentNumber(subReaders[i]);
> and something similar is needed in CachingMultiReader.
> That is all. Obviously, when a node/property is added/removed/changed, some parts of
> the cached DocId.UUIDDocId will become invalid, but it is mainly the small indexes that are
> updated frequently, and those are obviously less expensive to recompute.
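
The proposal in the description can be sketched roughly as follows. This is not the actual Jackrabbit code: SubReaderStub, CombinedReaderStub and CachedUuidDocId are hypothetical stand-ins, and only the method name getDocumentNumber is taken from the issue. The sketch assumes the sub readers are reused while the combined reader is recreated for every query.

```java
import java.lang.ref.WeakReference;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one long-lived index segment reader. */
final class SubReaderStub {
    private final List<String> uuids;
    int lookups = 0; // counts the 'expensive' lookups, for illustration only

    SubReaderStub(String... uuids) {
        this.uuids = Arrays.asList(uuids);
    }

    int find(String uuid) {
        lookups++;
        return uuids.indexOf(uuid);
    }
}

/** Hypothetical stand-in for the combined reader that is recreated all the time. */
final class CombinedReaderStub {
    final SubReaderStub[] subReaders;

    CombinedReaderStub(SubReaderStub... subReaders) {
        this.subReaders = subReaders;
    }
}

/** Caches the document number against the sub reader, not the combined reader. */
final class CachedUuidDocId {
    private final String uuid;
    private WeakReference<SubReaderStub> cachedReader = new WeakReference<>(null);
    private int cachedDoc = -1;

    CachedUuidDocId(String uuid) {
        this.uuid = uuid;
    }

    int getDocumentNumber(SubReaderStub reader) {
        if (cachedReader.get() == reader) {
            return cachedDoc; // same segment reader as last time: reuse the result
        }
        cachedDoc = reader.find(uuid);
        cachedReader = new WeakReference<>(reader);
        return cachedDoc;
    }
}
```

With this shape, two CombinedReaderStub instances built over the same SubReaderStub share the cached result, because the weak reference points at an object that stays strongly reachable for as long as the segment itself is in use.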

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
