lucene-java-user mailing list archives

Subject Re: Statically store sub-collections for search (faceted search?)
Date Mon, 15 Apr 2013 19:42:52 GMT
Hi Uwe,

Thanks for the info; I was under the impression that it didn't. I got this idea (that filters
don't have a limit because they are not scoring) from a document like the one below. I can't
say this is the exact document, though, because it's been a while since I saw it.

As a response to this performance pitfall on very large indices (and the infamous TooManyClauses
exception), new queries were developed that relied on a new Query class called ConstantScoreQuery.
A ConstantScoreQuery accepts a filter of matching documents and then scores with a constant value
equal to the boost. Depending on the qualities of your index, this method can be faster than
the Boolean expansion method and, more importantly, does not suffer from TooManyClauses exceptions.
Rather than matching and scoring n BooleanQuery clauses (potentially thousands of clauses),
a single filter is enumerated and then traversed for scoring. On the other hand, constructing
and scoring with a BooleanQuery containing a few clauses is likely to be much faster than
constructing and traversing a Filter.
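
To make the contrast in that passage concrete, here is a minimal sketch in plain Java. It is a toy model, not Lucene code: the class TermSetFilterSketch, its postings map, and all method names are made up for illustration, and the only Lucene fact borrowed is the default clause limit of 1024. It shows one clause per term (with a limit and per-clause scoring) next to a single bit set that is filled once and then walked to assign a constant score.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: JDK collections stand in for an index; none of this is Lucene API. */
final class TermSetFilterSketch {
    /** Mirrors Lucene's default BooleanQuery max clause count. */
    static final int MAX_CLAUSES = 1024;

    /** Hypothetical postings: term -> ids of the documents that contain it. */
    private final Map<String, int[]> postings = new HashMap<>();
    private final int maxDoc;

    TermSetFilterSketch(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    void addPosting(String term, int... docIds) {
        postings.put(term, docIds);
    }

    /** Boolean-expansion style: one clause per term, rejected above the limit. */
    Map<Integer, Float> scoreByExpansion(List<String> terms) {
        if (terms.size() > MAX_CLAUSES) {
            throw new IllegalStateException("too many clauses: " + terms.size());
        }
        Map<Integer, Float> scores = new HashMap<>();
        for (String term : terms) {
            for (int doc : postings.getOrDefault(term, new int[0])) {
                scores.merge(doc, 1.0f, Float::sum); // toy score: +1 per matching clause
            }
        }
        return scores;
    }

    /** Filter style: mark matching documents in one bit set; no clause limit, no scoring. */
    BitSet filterBits(List<String> terms) {
        BitSet bits = new BitSet(maxDoc);
        for (String term : terms) {
            for (int doc : postings.getOrDefault(term, new int[0])) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Constant-score style: every document in the bit set gets the same score (the boost). */
    Map<Integer, Float> scoreConstant(BitSet bits, float boost) {
        Map<Integer, Float> scores = new HashMap<>();
        for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
            scores.put(doc, boost);
        }
        return scores;
    }
}
```

With a few thousand terms, scoreByExpansion refuses to run while filterBits followed by scoreConstant still works. Note that in this toy the limit is avoided because no clauses are built at all, which is closer to a dedicated term-set filter than to a BooleanQuery wrapped in a filter.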


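On Carsten's open question in the quoted message below (getting a Bits object out of a filter to pass as acceptDocs), the general idea can be sketched without Lucene on the classpath. This is only an illustration under assumptions: the nested Bits interface redeclares the two methods of Lucene's org.apache.lucene.util.Bits, while AcceptDocsSketch, toBits and countAccepted are hypothetical names. As far as I recall, in Lucene 4.x the doc ids would come from Filter#getDocIdSet(context, acceptDocs), and DocIdSet#bits() may already hand back a Bits (or null, in which case the iterator has to be materialized as below).

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;

/** Toy model only: shows how iterator-style filter output becomes random-access bits. */
final class AcceptDocsSketch {
    /** Same two methods as Lucene's Bits interface, redeclared so no Lucene jar is needed. */
    interface Bits {
        boolean get(int index);
        int length();
    }

    /** Materializes a doc-id iterator (hypothetical stand-in for a filter's result) into Bits. */
    static Bits toBits(PrimitiveIterator.OfInt docIds, final int maxDoc) {
        final BitSet set = new BitSet(maxDoc);
        while (docIds.hasNext()) {
            set.set(docIds.nextInt());
        }
        return new Bits() {
            @Override
            public boolean get(int index) {
                return set.get(index);
            }

            @Override
            public int length() {
                return maxDoc;
            }
        };
    }

    /** Counts candidate documents that pass acceptDocs; the others would be skipped. */
    static int countAccepted(int[] candidateDocs, Bits acceptDocs) {
        int accepted = 0;
        for (int doc : candidateDocs) {
            if (acceptDocs.get(doc)) {
                accepted++;
            }
        }
        return accepted;
    }
}
```

The resulting object plays the role of the acceptDocs argument: anything that enumerates matches per document only needs get(doc) to decide whether to skip it.
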
On Apr 15, 2013, at 1:04 AM, Uwe Schindler wrote:

> The limit also applies to filters. If you have a list of terms ORed together, the fastest
> way is not to use a BooleanQuery at all, but instead a TermsFilter (which has no limits).
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> eMail:
>> -----Original Message-----
>> From: Carsten Schnober []
>> Sent: Monday, April 15, 2013 9:53 AM
>> To:
>> Subject: Re: Statically store sub-collections for search (faceted search?)
>> On 12.04.2013 20:08, SUJIT PAL wrote:
>>> Hi Carsten,
>>> Why not use your idea of the BooleanQuery but wrap it in a Filter instead?
>>> Since you are not doing any scoring (only filtering), the max boolean clauses
>>> limit should not apply to a filter.
>> Hi Sujit,
>> thanks for your suggestion! I wasn't aware that the max clause limit does not
>> apply to a BooleanQuery wrapped in a filter. I suppose the ideal way would
>> be to use a BooleanFilter rather than a QueryWrapperFilter, right?
>> However, I am also not sure how to apply a filter in my use case because I
>> perform a SpanQuery. Although SpanQuery#getSpans() does take a Bits
>> object as an argument (acceptDocs), I haven't been able to figure out how to
>> generate this Bits object correctly from a Filter object.
>> Best,
>> Carsten
>> --
>> Institut für Deutsche Sprache |
>> Projekt KorAP                 |
>> Tel. +49-(0)621-43740789      |
>> Korpusanalyseplattform der nächsten Generation Next Generation Corpus
>> Analysis Platform
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail:
>> For additional commands, e-mail:
