lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yonik Seeley <yo...@lucidimagination.com>
Subject Re: best practice: 1.4 billions documents
Date Mon, 22 Nov 2010 17:28:34 GMT
On Mon, Nov 22, 2010 at 12:17 PM, Uwe Schindler <uwe@thetaphi.de> wrote:
> The latest discussion was more about MultiReader vs. MultiSearcher.
>
> But you are right, 1.4 B documents is not easy to go, especially when you
> index grows and you get to the 2.1 B marker, then no MultiSearcher or
> whatever helps.
>
> On the other hand even distributed Solr has the same problems like
> MultiSearcher: scoring MultiTermQueries (Fuzzy) doesn't work correctly

Are you referring to the idf being local to the shard instead of
global to the whole colleciton?
Andrzej has a patch in the works, but it's not committed yet.

> negative MTQ clauses may produce wrong results if the query rewriting is
> done like in MultiSearcher (which is unsolveable broken and broken and
> broken and again broken for some queries as Boolean clauses - see DeMorgan
> laws).

I don't think this is a problem for Solr.  Queries are executed on
each shard as normal (no difference from a non-distributed query).

-Yonik
http://www.lucidimagination.com

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message