lucene-java-user mailing list archives

From Yonik Seeley <>
Subject Re: best practice: 1.4 billions documents
Date Mon, 22 Nov 2010 17:28:34 GMT
On Mon, Nov 22, 2010 at 12:17 PM, Uwe Schindler <> wrote:
> The latest discussion was more about MultiReader vs. MultiSearcher.
> But you are right, 1.4 B documents is not easy to handle, especially when your
> index grows and you reach the 2.1 B mark; then no MultiSearcher or
> anything else helps.
> On the other hand, even distributed Solr has the same problems as
> MultiSearcher: scoring MultiTermQueries (Fuzzy) doesn't work correctly

Are you referring to the idf being local to the shard instead of
global to the whole collection?
Andrzej has a patch in the works, but it's not committed yet.
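
To make the local-vs-global idf point concrete, here is a minimal sketch. It
assumes the classic Lucene-style formula idf = 1 + ln(numDocs / (docFreq + 1));
the class name, method names, and shard statistics are made up for
illustration and are not Solr or Lucene API.

```java
public class ShardIdfSketch {

    // Assumed formula (classic Lucene-style): 1 + ln(numDocs / (docFreq + 1)).
    public static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    // Global idf: add up docFreq and numDocs over all shards first,
    // then apply the same formula once to the totals.
    public static double globalIdf(long[] docFreqs, long[] numDocs) {
        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            totalDocFreq += docFreqs[i];
            totalDocs += numDocs[i];
        }
        return idf(totalDocFreq, totalDocs);
    }

    public static void main(String[] args) {
        // Hypothetical statistics: the term is rare on shard 0, common on shard 1.
        long[] docFreqs = {5, 5000};
        long[] numDocs = {1_000_000, 1_000_000};

        for (int i = 0; i < docFreqs.length; i++) {
            System.out.println("shard " + i + " local idf = " + idf(docFreqs[i], numDocs[i]));
        }
        System.out.println("global idf = " + globalIdf(docFreqs, numDocs));
    }
}
```

With shard-local idf the same term gets a different weight on each shard, so
scores from different shards are not directly comparable when the results
are merged. Using the summed statistics gives every shard the same weight.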

> negative MTQ clauses may produce wrong results if the query rewriting is
> done like in MultiSearcher (which is unsolvably broken for some queries
> used as Boolean clauses - see De Morgan's laws).

I don't think this is a problem for Solr.  Queries are executed on
each shard as normal (no difference from a non-distributed query).
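
A toy sketch of why per-shard execution is safe for negative clauses, assuming
every document lives on exactly one shard. This is not Solr code: the class,
the in-memory "shards", and the query "must AND NOT prefix*" are hypothetical
stand-ins.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class ShardedNegationSketch {

    // Evaluate "must AND NOT notPrefix*" against one set of documents.
    // Each document is a doc id mapped to its set of terms.
    public static SortedSet<Integer> search(Map<Integer, Set<String>> docs,
                                            String must, String notPrefix) {
        SortedSet<Integer> hits = new TreeSet<>();
        for (Map.Entry<Integer, Set<String>> doc : docs.entrySet()) {
            Set<String> terms = doc.getValue();
            boolean excluded = false;
            for (String term : terms) {
                if (term.startsWith(notPrefix)) {
                    excluded = true;
                    break;
                }
            }
            if (terms.contains(must) && !excluded) {
                hits.add(doc.getKey());
            }
        }
        return hits;
    }

    // Run the unmodified query on every shard and union the hits.
    public static SortedSet<Integer> searchShards(List<Map<Integer, Set<String>>> shards,
                                                  String must, String notPrefix) {
        SortedSet<Integer> merged = new TreeSet<>();
        for (Map<Integer, Set<String>> shard : shards) {
            merged.addAll(search(shard, must, notPrefix));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<Integer, Set<String>> shardA = Map.of(
                1, Set.of("lucene", "solr"),
                2, Set.of("lucene", "search"));
        Map<Integer, Set<String>> shardB = Map.of(
                3, Set.of("lucene", "sorting"),
                4, Set.of("lucene"));

        System.out.println(searchShards(List.of(shardA, shardB), "lucene", "so"));
    }
}
```

Because each document is matched against the whole query on its own shard,
the merged hits are the same set a single index holding all the documents
would return. Only the scores can differ, because of the shard-local idf
mentioned above.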

