jackrabbit-dev mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject RE: improving the scalability in searching
Date Tue, 21 Aug 2007 13:21:23 GMT

> Ard Schrijvers wrote:
> > So, WDOT about indexing properties in separate lucene Fields, and about
> > possibly indexing more information of one property.
>
> Marcel Reutegger wrote:
> Because the number of distinct property names in jackrabbit is unlimited (think
> of nt:unstructured nodes), this would lead to a great number of files created by
> lucene. For each field (this actually changed with version 2.1) lucene creates a
> separate file. That's basically the reason why I put them all into one field.
> See [1] and [2].

I suspected I was missing something... back in April 2005 I was not keeping track
of JR Jira issues :-) In [1], item 13 under API changes indeed states what you are saying.

> We should probably re-consider using a 1:1 mapping between jcr property names
> and lucene fields, since we also got rid of the norms with [3].

I think a 1:1 mapping gives us quite a bit more flexibility. I do not have a bird's-eye
overview of the possible complications, but I have already implemented parts of a 1:1
mapping locally. I think it would make some classes redundant and let us use more of the
default Lucene queries and classes (things like MatchAllQuery and SharedFieldSortComparator
might not be needed anymore). If others agree on these changes, I think they would justify
a new QueryHandler, because it is quite a big change AFAICS.
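
To make the two layouts concrete, here is a minimal sketch in plain Java with no Lucene dependency. The class name, the method names, the shared field name "properties" and the separator character are all hypothetical stand-ins, not Jackrabbit's actual code; a document is simply modelled as a map from field name to terms.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class FieldMappingSketch {
    // Hypothetical separator between property name and value in the shared-field layout.
    static final char SEP = '\uFFFF';
    // Hypothetical name of the single field that holds all properties.
    static final String SHARED_FIELD = "properties";

    /** Shared-field layout: one field, each term is name + SEP + value. */
    static Map<String, List<String>> sharedFieldTerms(Map<String, String> props) {
        List<String> terms = new ArrayList<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            terms.add(e.getKey() + SEP + e.getValue());
        }
        Map<String, List<String>> doc = new TreeMap<>();
        doc.put(SHARED_FIELD, terms);
        return doc;
    }

    /** 1:1 layout: one field per property name, the term is the plain value. */
    static Map<String, List<String>> perPropertyTerms(Map<String, String> props) {
        Map<String, List<String>> doc = new TreeMap<>();
        for (Map.Entry<String, String> e : props.entrySet()) {
            doc.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("title", "hello");
        props.put("author", "ard");
        // The shared layout always yields one field; the 1:1 layout yields
        // one field per distinct property name.
        System.out.println(sharedFieldTerms(props).keySet());
        System.out.println(perPropertyTerms(props).keySet());
    }
}
```

With the 1:1 layout a query on one property can target that field directly, at the cost of a field count that grows with the number of distinct property names.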

> 
> > My experience with lucene is that indexing tactically eases querying a lot
> > and gains you lots of performance. So, if you do agree on these changes, which
> > I can try to build in Jackrabbit, then I think these changes might validate a
> > new QueryHandler class to be built aside the old one. WDOT?
>
> I'm all for making the index better, however I'm a bit skeptical when it comes
> to virtual fields. This is not just an optimization but a new jackrabbit
> specific feature that we would introduce.

Yes, I understand your point, and it would indeed be a Jackrabbit-specific feature not backed
by the JSR... I just wanted to have it :-) Perhaps I'll try to add it in a custom
IndexingConfigurationImpl. That shouldn't be too hard.

Thanks for your explanations. I have some time to help create the 1:1 mapping if others
agree on the change and you need help (or don't have time for it, or don't want to do it :-) ).
 
regards Ard

[1] http://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_1_0/CHANGES.txt

> 
> regards
>   marcel
> 
> [1] http://lucene.apache.org/java/docs/fileformats.html#Normalization%20Factors
> [2] http://issues.apache.org/jira/browse/JCR-106
> [3] http://issues.apache.org/jira/browse/JCR-1042
