jackrabbit-dev mailing list archives

From Marcel Reutegger <marcel.reuteg...@gmx.net>
Subject Re: improving the scalability in searching
Date Mon, 13 Aug 2007 12:58:07 GMT
Ard Schrijvers wrote:
> IMO, we should index more (derived) data about a document's properties (I'll
> return to this in a mail about IndexingConfiguration, to which I think we can
> add some features that might tackle this) if we want to be able to query fast.
> For this specific problem, the solution would be very simple:
> 
> I suggest to add
> 
> /**
>  * Name of the field that contains all available properties that are present
>  * for a certain node
>  */
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> 
> and when indexing a node, each property name of that node is added to its
> index (few lines of code in NodeIndexer):
> 
> Then, searching for all nodes that have a property is one single
> docs.seek(terms); plus setting the docFilter. This approach scales to millions
> of documents easily with times close to 0 ms. WDYT? Of course, I can implement
> this in the trunk.
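
To make the quoted proposal concrete, here is a minimal sketch of the idea. It uses a toy in-memory inverted index as a stand-in for Lucene, so the class PropertySetIndexSketch and its methods addNode and nodesWithProperty are hypothetical names, not Jackrabbit or Lucene API. Only the field name PROPERTIES_SET comes from the proposal.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory inverted index. It stands in for Lucene and is NOT the Jackrabbit API.
class PropertySetIndexSketch {
    // Field name taken from the quoted proposal.
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    // Postings: "field:term" -> document numbers that contain the term.
    private final Map<String, BitSet> postings = new HashMap<>();
    private int nextDoc = 0;

    // Index one node: every property name becomes a term in the PROPERTIES_SET field.
    int addNode(List<String> propertyNames) {
        int doc = nextDoc++;
        for (String name : propertyNames) {
            postings.computeIfAbsent(PROPERTIES_SET + ":" + name, k -> new BitSet()).set(doc);
        }
        return doc;
    }

    // Existence query: one postings lookup instead of a scan over property values.
    BitSet nodesWithProperty(String propertyName) {
        BitSet hits = postings.get(PROPERTIES_SET + ":" + propertyName);
        return hits == null ? new BitSet() : (BitSet) hits.clone();
    }

    public static void main(String[] args) {
        PropertySetIndexSketch index = new PropertySetIndexSketch();
        index.addNode(List.of("jcr:title", "jcr:created")); // doc 0
        index.addNode(List.of("jcr:created"));              // doc 1
        index.addNode(List.of("jcr:title"));                // doc 2
        System.out.println(index.nodesWithProperty("jcr:title")); // prints {0, 2}
    }
}
```

The point of the sketch is that the cost of the existence check depends on one term lookup, not on how many distinct values the property has across the index.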

I agree with you that the current implementation is not optimized for queries
that check the existence of a property. Your proposed solution seems reasonable;
I would implement it the same way. There's just one minor obstacle: how do we
implement this change in a backward compatible way? An existing index without
this additional field should still work.
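
One conceivable way to handle this, sketched below under stated assumptions, is to use the new field only when an index is known to contain it and to fall back to the existing slower path otherwise. The class ExistenceQuerySketch, the hasPropertySetField flag and the scan in the fallback branch are all hypothetical; this message does not say how an index would record which format it has, and the scan is only a placeholder for the current implementation.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of one index. NOT the Jackrabbit API; all names are hypothetical.
class ExistenceQuerySketch {
    // Property names per document, standing in for the data an old index already has.
    private final List<Set<String>> docs = new ArrayList<>();
    // Postings for the additional field; null models an index written before the change.
    private final Map<String, BitSet> propertySet;

    ExistenceQuerySketch(boolean hasPropertySetField) {
        this.propertySet = hasPropertySetField ? new HashMap<>() : null;
    }

    void addNode(Set<String> propertyNames) {
        int doc = docs.size();
        docs.add(propertyNames);
        if (propertySet != null) {
            for (String name : propertyNames) {
                propertySet.computeIfAbsent(name, k -> new BitSet()).set(doc);
            }
        }
    }

    BitSet nodesWithProperty(String name) {
        if (propertySet != null) {
            // Fast path: the additional field exists, so one lookup is enough.
            BitSet hits = propertySet.get(name);
            return hits == null ? new BitSet() : (BitSet) hits.clone();
        }
        // Fallback: placeholder for the existing, slower implementation.
        BitSet hits = new BitSet();
        for (int doc = 0; doc < docs.size(); doc++) {
            if (docs.get(doc).contains(name)) {
                hits.set(doc);
            }
        }
        return hits;
    }
}
```

Both paths return the same result set, so an index without the additional field keeps working and only misses the speedup.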

regards
  marcel
