jackrabbit-dev mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: Re: Search performance : MultiIndex
Date Wed, 31 Oct 2007 09:55:16 GMT


> >>
> >> '/documents/en/news/2007/10/14/item.xml'
> >> '/documents/en/news/2007/10/14'
> >> '/documents/en/news/2007/10'
> >> '/documents/en/news/2007'
> >> '/documents/en/news'
> >> '/documents/en'
> >> '/documents'
> > 

> Christoph Kiehl wrote:
> My hope is that we find exactly that kind of clever cache ;) 
> I even think we should make this one persistent because you 
> have to read every node to build it and I think it might 
> consume too much memory for big repositories (at least if we 
> keep the paths human readable like above).

But wouldn't a persistent hierarchical cache again mean that moving a
node with a large subtree below it forces a large part of the
(persistent) cache to be reloaded and re-persisted, making a move
expensive again? Also, I still think all the info is needed within the
query/index, and I am not sure whether the hierarchical cache can help
us with that. Filtering hits afterwards will be very slow, but perhaps
I am missing something.
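
To make the move cost concrete, here is a minimal sketch in plain Java.
It is not Jackrabbit code: the class, the method names and the ids
n0..n6 are made up for illustration, and the seven paths are the ones
quoted above. Variant A caches full paths per node, so moving a subtree
rewrites one entry per node in it. Variant B, which I add only for
contrast, caches just the parent id, so a move touches one entry, but
every descendant check has to walk up the parent chain.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of two hierarchy caches; not Jackrabbit code. */
final class HierarchyCacheSketch {

    // The example hierarchy from this thread, root first; node i gets id "n" + i.
    static final String[] PATHS = {
        "/documents",
        "/documents/en",
        "/documents/en/news",
        "/documents/en/news/2007",
        "/documents/en/news/2007/10",
        "/documents/en/news/2007/10/14",
        "/documents/en/news/2007/10/14/item.xml"
    };

    /** Variant A: node id -> full human-readable path. */
    static final class PathCache {
        final Map<String, String> pathById = new HashMap<>();

        /** Moves a subtree; returns how many cache entries were rewritten. */
        int move(String oldPrefix, String newPrefix) {
            int rewritten = 0;
            for (Map.Entry<String, String> e : pathById.entrySet()) {
                String p = e.getValue();
                if (p.equals(oldPrefix) || p.startsWith(oldPrefix + "/")) {
                    e.setValue(newPrefix + p.substring(oldPrefix.length()));
                    rewritten++;
                }
            }
            return rewritten;
        }
    }

    /** Variant B: node id -> parent id (null for the root). */
    static final class ParentCache {
        final Map<String, String> parentById = new HashMap<>();

        /** Moves a subtree by re-pointing its root; touches one entry. */
        int move(String nodeId, String newParentId) {
            parentById.put(nodeId, newParentId);
            return 1;
        }

        /** Walks up the parent chain; cost grows with the node's depth. */
        boolean isDescendantOrSelf(String nodeId, String ancestorId) {
            for (String id = nodeId; id != null; id = parentById.get(id)) {
                if (id.equals(ancestorId)) {
                    return true;
                }
            }
            return false;
        }
    }

    static PathCache buildPathCache() {
        PathCache c = new PathCache();
        for (int i = 0; i < PATHS.length; i++) {
            c.pathById.put("n" + i, PATHS[i]);
        }
        return c;
    }

    static ParentCache buildParentCache() {
        ParentCache c = new ParentCache();
        for (int i = 0; i < PATHS.length; i++) {
            c.parentById.put("n" + i, i == 0 ? null : "n" + (i - 1));
        }
        return c;
    }

    public static void main(String[] args) {
        // Move the "2007" node (n3) directly under /documents/en (n1).
        int a = buildPathCache().move("/documents/en/news/2007", "/documents/en/2007");
        int b = buildParentCache().move("n3", "n1");
        System.out.println("path cache entries rewritten:   " + a);
        System.out.println("parent cache entries rewritten: " + b);
    }
}
```

In this toy hierarchy the path cache rewrites 4 entries and the parent
cache 1. With a subtree of a million nodes the first number becomes a
million, which is the expensive move I mean, while the second variant
pays instead on every hit that has to be checked against an ancestor.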

> 
> Maybe I have some time on Friday to think about it. But maybe 
> Ard is faster to come up with a solution ;))

Most likely not :-) I did some performance testing yesterday, and I
must say the current implementation is faster than I thought. I really
thought I had seen queries taking a couple of seconds for only 1,000
hits with DescendantSelfAxisQuery, but as I saw yesterday, it is a
couple of seconds (~5 sec on the modest PC I have) for >100,000 hits
(and repeating the same query afterwards takes a few hundred ms). So
apparently it is already quite a bit faster than I thought it was, but
a query like

//documents//*[@date] where there are millions of nodes will probably
still take more like a minute to complete.

Furthermore, I have been investigating the LuceneQueryBuilder method
visit(LocationStepQueryNode node, Object data) in combination with
DescendantSelfAxisQuery, and I must say that it is really sophisticated
IMO. ATM, I do not yet understand precisely how
DescendantSelfAxisQuery does the job... Given that I also have some
other tasks, don't count on me having a solution before Friday!! :-)

Ard

> 
> Cheers,
> Christoph
> 
> 
