lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: best practice: 1.4 billions documents
Date Fri, 26 Nov 2010 19:58:47 GMT
On Fri, Nov 26, 2010 at 12:49 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> This is the problem for Fuzzy: each searcher expands the fuzzy query to a
> different Boolean Query and so the scores are not comparable - MultiSearcher
> (but not Solr) tries to combine the resulting rewritten queries into one
> query, so every searcher has the same query.

The problem is not actually an issue with FuzzyQuery; it is
Query.combine() with any Boolean rewrite, including AUTO, as I
mentioned earlier in this thread!

AUTO starts out building a boolean rewrite. If certain magical
conditions are hit (it exceeds a certain number of terms, or a
certain DF), it switches over to a Filter.
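
A rough sketch of that decision, for illustration only. It does not call
Lucene and is not Lucene's actual code; the class name, the choose()
method, and both cutoff parameters are hypothetical stand-ins for the
"magical conditions", and the numbers passed in main() are made up.

```java
import java.util.Arrays;

// Illustrative only: mimics the switch described above, not Lucene's implementation.
class AutoRewriteSketch {
    enum Mode { BOOLEAN, FILTER }

    // termCutoff and docFreqCutoff are hypothetical parameters (assumptions).
    // termDocFreqs holds the document frequency of each term the query expands to.
    static Mode choose(int[] termDocFreqs, int termCutoff, long docFreqCutoff) {
        int terms = 0;
        long docFreqSum = 0;
        for (int df : termDocFreqs) {
            terms++;
            docFreqSum += df;
            // Give up on the boolean rewrite as soon as either limit is crossed.
            if (terms > termCutoff || docFreqSum > docFreqCutoff) {
                return Mode.FILTER;
            }
        }
        return Mode.BOOLEAN;
    }

    public static void main(String[] args) {
        int[] few = {3, 5, 2};        // few terms, low DF
        int[] many = new int[400];    // many terms, each with DF 1
        Arrays.fill(many, 1);
        System.out.println(choose(few, 350, 1000L));   // BOOLEAN
        System.out.println(choose(many, 350, 1000L));  // FILTER
    }
}
```

The point is that the choice depends on what each index contains, so two
searchers can reach different answers for the same query.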

So, for example, AUTO proclaims it will never hit the boolean maxclauses
exceeded exception, but it can (imagine a MultiSearcher with 5
searchers: the query expands to 250 terms on each, but after combine()
this can be up to 5 x 250 = 1250 clauses, which is > 1024).
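
The arithmetic can be sketched without Lucene at all. This is a toy
simulation, assuming the worst case where the five searchers expand to
completely disjoint terms; the class, the method names, and the term
strings are hypothetical, and combine() here is just a set union standing
in for what Query.combine() roughly does with the rewritten clauses.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy simulation of the clause count; it does not use the Lucene API.
class CombineClauseSketch {
    // The boolean clause limit cited in the message.
    static final int MAX_CLAUSES = 1024;

    // Terms that one searcher expands the query to (made-up term names,
    // deliberately different for every searcher: the worst case).
    static Set<String> expand(int searcherId, int termsPerSearcher) {
        Set<String> terms = new LinkedHashSet<>();
        for (int i = 0; i < termsPerSearcher; i++) {
            terms.add("s" + searcherId + "_term" + i);
        }
        return terms;
    }

    // Union of all per-searcher expansions, duplicates removed.
    static Set<String> combine(int searchers, int termsPerSearcher) {
        Set<String> all = new LinkedHashSet<>();
        for (int s = 0; s < searchers; s++) {
            all.addAll(expand(s, termsPerSearcher));
        }
        return all;
    }

    public static void main(String[] args) {
        int perSearcher = expand(0, 250).size();
        int combined = combine(5, 250).size();
        // Each searcher alone stays under the limit...
        System.out.println(perSearcher + " > " + MAX_CLAUSES + "? " + (perSearcher > MAX_CLAUSES)); // 250 > 1024? false
        // ...but the combined query does not.
        System.out.println(combined + " > " + MAX_CLAUSES + "? " + (combined > MAX_CLAUSES));       // 1250 > 1024? true
    }
}
```

If the searchers' expansions overlap, the union is smaller, so 1250 is an
upper bound for this example rather than a guaranteed count.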

In my opinion, Query.combine() is completely broken, and I don't see
how it can really be fixed to work with arbitrary query structures,
since a query might rewrite() differently on the different searchers.
