jackrabbit-dev mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: Search performance : MultiIndex
Date Tue, 30 Oct 2007 10:37:09 GMT

> Marcel Reutegger wrote:
> we should definitely try to execute such queries in a 
> reasonable time. I think this is a very common use case. can 
> you please create a jira issue? I'm not sure if the hierarchy 
> is the issue here or just the fact that lots of nodes need to 
> be ordered. do you have more insight on this?

I must admit I looked at it a few weeks ago, so I might be a little off,
but off the top of my head I think there are some recursive filters for
DescendantSelfAxisWeight/ChildAxisQuery which check whether the
parents are correct. Sorting lots of nodes won't be the bottleneck: we
have Lucene queries in other projects with >>100,000 docs which return the
last 10 docs sorted on modification date within a couple of ms, and
consecutive sorted searches on the same IndexReaders will hardly
be a bottleneck at all.
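
To illustrate the kind of per-hit parent check meant here, a minimal
sketch in plain Java follows. It is not Jackrabbit's actual code: the
class ParentWalkSketch, its methods and the lookup counter are all
hypothetical, and it only assumes that each node stores a reference to its
parent, so that a descendant-or-self test has to walk upwards once for
every candidate hit.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each node knows only its parent id, so a
// descendant-or-self check walks up the hierarchy for every candidate hit.
final class ParentWalkSketch {
    // child id -> parent id; the root node has no entry
    private final Map<String, String> parentOf = new HashMap<>();
    // counts parent lookups, standing in for reads against the index
    long lookups = 0;

    void add(String id, String parentId) {
        parentOf.put(id, parentId);
    }

    // True if 'ancestor' is 'id' itself or one of its ancestors.
    boolean isDescendantOrSelf(String id, String ancestor) {
        String current = id;
        while (current != null) {
            if (current.equals(ancestor)) {
                return true;
            }
            lookups++;
            current = parentOf.get(current); // null once we pass the root
        }
        return false;
    }

    // Post-filters candidate hits: one upward walk per hit.
    List<String> filter(List<String> hits, String ancestor) {
        List<String> out = new ArrayList<>();
        for (String hit : hits) {
            if (isDescendantOrSelf(hit, ancestor)) {
                out.add(hit);
            }
        }
        return out;
    }
}
```

The number of lookups grows with the number of hits times the depth of
the hierarchy, independent of how cheap the sorting step is.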

I'll file a JIRA issue for it. Perhaps I have some time next week to
sort out the exact bottleneck. Even after finding the bottleneck, I am not
very sure whether the issue can be solved with both a cheap move operation
*and* a cheap DescendantSelfAxisWeight.
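
The tension between the two can be sketched as follows, again in plain
Java and under stated assumptions: the class PathIndexSketch and its
methods are hypothetical, and a sorted map keyed by full path stands in
for an index that stores the whole path with each node. The descendant
lookup becomes a single range scan, but a move has to rewrite every entry
in the moved subtree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: every node is keyed by its full path.
final class PathIndexSketch {
    // full path -> payload, kept sorted by path
    private final TreeMap<String, String> byPath = new TreeMap<>();

    void add(String path, String payload) {
        byPath.put(path, payload);
    }

    // Cheap descendant-or-self lookup: all keys starting with "<path>/"
    // lie in the range ["<path>/", "<path>0"), because '0' is the
    // character directly after '/'.
    List<String> descendantsOrSelf(String path) {
        List<String> out = new ArrayList<>();
        if (byPath.containsKey(path)) {
            out.add(path);
        }
        out.addAll(byPath.subMap(path + "/", true, path + "0", false).keySet());
        return out;
    }

    // Expensive move: every entry in the subtree gets a new key.
    // Returns the number of rewritten entries. Assumes newPath is not
    // inside the subtree being moved.
    int move(String oldPath, String newPath) {
        List<String> affected = descendantsOrSelf(oldPath);
        for (String path : affected) {
            String payload = byPath.remove(path);
            byPath.put(newPath + path.substring(oldPath.length()), payload);
        }
        return affected.size();
    }
}
```

With only a parent reference stored per node the costs are reversed: a
move touches one entry, while each descendant check walks up the
hierarchy.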

> maybe this approach can be turned into a clever hierarchical 
> cache? without the need to index the whole path with a node.

I think these problems can only be solved by having all criteria present
in the index. I do not directly see how a hierarchical cache could solve
the issue. Using this cache to filter afterwards will inherently again
result in slow queries, besides the fact that the cache might have to
become pretty big to be effective.

Anyway, I'll try to get back on the issue with some findings on the
current implementation next week.

Regards Ard

> regards
>   marcel
