jackrabbit-dev mailing list archives

From Christoph Kiehl ...@sulu3000.de>
Subject Re: improving the scalability in searching
Date Wed, 15 Aug 2007 07:29:02 GMT
Marcel Reutegger wrote:
> Ard Schrijvers wrote:
>> IMO, we should index more (derived) data about a document's properties
>> (I'll return to this in a mail about IndexingConfiguration, to which I
>> think we can add some features that might tackle this) if we want to be
>> able to query fast.
>> For this specific problem, the solution would be very simple:
>> I suggest adding
>>
>> /**
>>  * Name of the field that contains all available properties that are
>>  * present for a certain node
>>  */
>> public static final String PROPERTIES_SET =
>>     "_:PROPERTIES_SET".intern();
>>
>> and when indexing a node, each property name of that node is added to its
>> index (a few lines of code in NodeIndexer):
>> Then, searching for all nodes that have a property is one single
>> docs.seek(terms); plus setting the docFilter. This approach scales to
>> millions of documents easily, with times close to 0 ms. WDYT? Of course,
>> I can implement this in the trunk.
> I agree with you that the current implementation is not optimized for 
> queries that check the existence of a property. Your proposed solution 
> seems reasonable; I would implement it the same way. There's just one
> minor obstacle: how do we implement this change in a backward-compatible
> way? An existing index without this additional field should still work.

We could use IndexReader.getFieldNames() at startup to check whether such a
field already exists. If it does, we have an index in the new format, and
MatchAllScorer can use this information to decide which implementation to
use.
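
A minimal sketch of that startup check, under some assumptions: the field names are passed in as a plain collection (in Jackrabbit they would presumably come from IndexReader.getFieldNames(), whose exact signature depends on the Lucene version), and the class name, the Strategy enum and the sample field names are hypothetical, chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;

final class IndexFormatCheck {

    // Field name taken from the proposal quoted above.
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    // Hypothetical names for the two code paths MatchAllScorer could take.
    enum Strategy { PROPERTIES_SET_LOOKUP, LEGACY_SCAN }

    // Decide once, at startup, which implementation to use:
    // an index that already contains the new field is in the new format.
    static Strategy chooseStrategy(Collection<String> fieldNames) {
        return fieldNames.contains(PROPERTIES_SET)
                ? Strategy.PROPERTIES_SET_LOOKUP
                : Strategy.LEGACY_SCAN;
    }

    public static void main(String[] args) {
        // Sample field names only; a real index would report its own.
        Collection<String> oldIndex =
                new HashSet<>(Arrays.asList("_:UUID", "_:PARENT"));
        Collection<String> newIndex =
                new HashSet<>(Arrays.asList("_:UUID", "_:PARENT", PROPERTIES_SET));

        System.out.println("old index: " + chooseStrategy(oldIndex));
        System.out.println("new index: " + chooseStrategy(newIndex));
    }
}
```

Making the decision once at startup keeps the per-query path free of format checks; an old index simply keeps using the existing implementation until it is re-indexed.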

