jackrabbit-users mailing list archives

From Dennis van der Laan <d.g.van.der.l...@rug.nl>
Subject Re: Searching for a property
Date Thu, 17 Dec 2009 20:50:42 GMT
Hi Ard,
> Hello Dennis,
>
> On Fri, Dec 11, 2009 at 11:24 AM, Dennis van der Laan
> <d.g.van.der.laan@rug.nl> wrote:
>   
>> Hi Ard,
>>
>> Thanks! The performance went up by a factor of 10. Still not what I hoped
>> for, but I'm not sure the query itself is still the problem.
>>     
>
> So now it is 100 ms? That is still not very fast. What is your query?
> Furthermore, of course, index size matters as well.
>   
Triggered by your remark on index size, I created a new repository and
started filling it with nodes that have a virtual path property
(cms:virtualPath). At a certain point, I saw a significant degradation
in query performance. I took a thread dump to see what the VM was doing
and found this stack trace:

   java.lang.Thread.State: RUNNABLE
        at java.io.RandomAccessFile.readBytes(Native Method)
        at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal(FSDirectory.java:596)
        - locked <0x85523040> (a org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136)
        at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247)
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:116)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:92)
        at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:82)
        at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:127)
        at org.apache.lucene.index.SegmentMergeInfo.next(SegmentMergeInfo.java:65)
        at org.apache.lucene.index.MultiSegmentReader$MultiTermEnum.next(MultiSegmentReader.java:494)
        at org.apache.lucene.search.FilteredTermEnum.next(FilteredTermEnum.java:67)
        at org.apache.jackrabbit.core.query.lucene.CaseTermQuery$CaseTermEnum.<init>(CaseTermQuery.java:146)
        at org.apache.jackrabbit.core.query.lucene.CaseTermQuery.getEnum(CaseTermQuery.java:53)
        at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:55)
        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
        at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:383)
        at org.apache.jackrabbit.core.query.lucene.JackrabbitIndexSearcher.evaluate(JackrabbitIndexSearcher.java:99)
        at org.apache.jackrabbit.core.query.lucene.JackrabbitIndexSearcher.execute(JackrabbitIndexSearcher.java:84)
        at org.apache.jackrabbit.core.query.lucene.SearchIndex.executeQuery(SearchIndex.java:760)
        at org.apache.jackrabbit.core.query.lucene.SingleColumnQueryResult.executeQuery(SingleColumnQueryResult.java:66)
        at org.apache.jackrabbit.core.query.lucene.QueryResultImpl.getResults(QueryResultImpl.java:298)
        at org.apache.jackrabbit.core.query.lucene.SingleColumnQueryResult.<init>(SingleColumnQueryResult.java:58)
        at org.apache.jackrabbit.core.query.lucene.QueryImpl.execute(QueryImpl.java:131)
        at org.apache.jackrabbit.core.query.QueryImpl.execute(QueryImpl.java:177)
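
A comparable dump can also be produced from inside the JVM, which may help when sampling repeatedly during a slow query. This is only a minimal sketch using the standard library; the class and method names (ThreadDumpSketch, dumpAllThreads) are illustrative and not part of Jackrabbit or Lucene:

```java
import java.util.Map;

class ThreadDumpSketch {

    /** Renders the current stack of every live thread as plain text. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append("\"\n");
            out.append("   java.lang.Thread.State: ")
               .append(thread.getState()).append('\n');
            // One line per frame, innermost call first.
            for (StackTraceElement frame : entry.getValue()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

If several samples taken a few seconds apart all show the same frames under MultiTermQuery.rewrite, that would suggest the time is going into term enumeration rather than into collecting hits.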

Could this mean that there is not enough memory for the Lucene indexes,
so that they are being read from disk on every query?
Any idea how large the indexes will become? I have no idea what the
internals of Lucene look like. The virtual paths have an average string
length of about 50 characters, and we will end up with about 1 million of
these properties.
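
As a back-of-the-envelope figure for the numbers above, the raw property values alone can be sized as follows. This is a sketch under stated assumptions, not a prediction of the actual Lucene index size: it assumes one byte per character (ASCII paths) and ignores everything the index adds or saves on top of the raw values. The class and method names are made up for illustration:

```java
class TermSizeEstimate {

    /**
     * Raw bytes needed to hold every property value once,
     * with no compression and no index overhead.
     */
    static long rawValueBytes(long propertyCount, int avgChars, int bytesPerChar) {
        return propertyCount * avgChars * bytesPerChar;
    }

    public static void main(String[] args) {
        // Assumption: 1 million values, 50 characters each, 1 byte per character.
        long bytes = rawValueBytes(1_000_000L, 50, 1);
        System.out.printf("raw values: about %d MB%n", bytes / (1000 * 1000));
    }
}
```

Under these assumptions the values themselves come to roughly 50 MB, so the term data for this one property is not obviously too large for the operating system's file cache on a typical server.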

Thanks for any help!

Dennis
>   
>> A related question: could it be that when a query returns no results,
>> this is slower than when it does return a result? Might it have
>> something to do with Lucene not having an index for that particular
>> property value?
>>     
>
> No, an inverted index structure does not suffer from this
>
> Regards Ard
>
>   

