lucene-java-user mailing list archives

From Dennis Thrysøe <...@conscius.com>
Subject PrefixQuery and hierarchical queries problem
Date Fri, 19 Mar 2004 10:39:16 GMT
Hi,

I'm looking for any advice that could help me solve a problem I've
run into while using Lucene.

I'm integrating Lucene as an alternative to other methods of indexing
and searching that already exist in our product. It would therefore be
best if the Lucene integration could meet the existing requirements.

The content indexed as Lucene documents is structured as a tree (just
like files in a filesystem), and the feature I am working on is
restricting a search to a certain part of this tree.

To implement this I used a PrefixQuery on the path of the folder to
search below. Since the PrefixQuery creates a boolean query with one
clause for each matching term, this becomes a problem when there are
more than 1024 subfolders (the default clause limit) below the selected
folder.
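
To make the failure mode concrete, here is a rough sketch in plain Java
(no Lucene classes involved). It only imitates the expansion described
above: one clause per indexed term that starts with the prefix, with an
error once a limit is exceeded. The class name, the expand() helper and
the sample paths are hypothetical and exist only for illustration; this
is not Lucene's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

class PrefixExpansionSketch {

    // Hypothetical helper: collects one "clause" per term that starts with
    // the prefix and fails once more than maxClauses terms have matched.
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        // tailSet starts at the first term >= prefix in sorted order.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // terms are sorted, so no later term can match
            }
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("too many clauses, limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // Made-up index contents: 2000 subfolders below /foo, one below /other.
        SortedSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 2000; i++) {
            terms.add("/foo/sub" + i);
        }
        terms.add("/other/x");

        System.out.println(expand(terms, "/other", 1024).size());
        try {
            expand(terms, "/foo/", 1024);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the number of clauses depends on how
many distinct paths share the prefix, not on how many documents match
the rest of the search.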

One way of getting around this would be if maxClauseCount could be set 
for a PrefixQuery, but there are problems with this.

Picking a number for this would be hard. To support very large
installations, a value of a million or so would have to be used, and
that would probably not perform very well.

The only alternative I can think of would be to store a
whitespace-separated list of all ancestors along with each document:

/foo /foo/bar /foo/bar/baz

But this has two drawbacks: the index storage space used, and the cost
of indexing (finding all ancestors).
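
As a rough sketch of the ancestor list described above: assuming the
full path of each document is available as a string at indexing time,
that paths are absolute, '/'-separated and without a trailing slash,
and that they contain no whitespace, the list can be derived from the
path string itself. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class AncestorPaths {

    // Returns every ancestor of an absolute, '/'-separated path,
    // ending with the path itself.
    static List<String> ancestors(String path) {
        List<String> result = new ArrayList<>();
        int slash = path.indexOf('/', 1); // skip the leading '/'
        while (slash != -1) {
            result.add(path.substring(0, slash));
            slash = path.indexOf('/', slash + 1);
        }
        result.add(path);
        return result;
    }

    // Joins the ancestors with single spaces, so a whitespace-based
    // tokenizer would see each ancestor as one term.
    static String asFieldValue(String path) {
        return String.join(" ", ancestors(path));
    }

    public static void main(String[] args) {
        // prints "/foo /foo/bar /foo/bar/baz"
        System.out.println(asFieldValue("/foo/bar/baz"));
    }
}
```

With such a field, restricting a search to a folder would presumably
become an exact match on a single term (the folder's path) instead of a
prefix expansion. The storage cost per document still grows with the
depth of the tree.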

So my question boils down to: are there any alternatives that solve
this scenario in an efficient way?


Thanks in advance,

Dennis Thrysøe


