lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: finds all documents without a value for field
Date Thu, 20 Jul 2017 19:21:55 GMT
One other possibility is to create a second boolean field, "has_terms"
or something similar, and just add an fq clause like "&fq=has_terms:false".
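
A rough sketch of that idea in Python, in case it helps. The source field
name "terms" and both helper functions are made up for illustration; only
"has_terms" and the fq clause come from the suggestion above. It assumes the
flag is computed on the client before the document is sent to Solr.

```python
from urllib.parse import urlencode


def add_has_terms(doc, field="terms"):
    """Return a copy of doc with a boolean has_terms flag.

    'field' is a hypothetical source field name; the flag is False when
    the field is absent, None, an empty string, or an empty list.
    """
    value = doc.get(field)
    flagged = dict(doc)
    flagged["has_terms"] = value not in (None, "", [])
    return flagged


def missing_value_params(q="*:*"):
    """Build URL-encoded request parameters that keep only documents
    whose has_terms flag is false."""
    return urlencode({"q": q, "fq": "has_terms:false"})
```

The filter then becomes a plain term lookup on a boolean field, at the cost
of reindexing so that every document carries the flag.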

On Thu, Jul 20, 2017 at 12:00 PM, Shawn Heisey <apache@elyograg.org> wrote:
> On 7/20/2017 7:20 AM, Hendrik Haddorp wrote:
>> the Solr 6.6 ref guide states that the following "finds all documents
>> without a value for field":
>> -field:[* TO *]
>>
>> While this is true, I'm wondering why it is recommended to use a range
>> query instead of simply:
>> -field:*
>
> Performance.
>
> A wildcard is expanded to all possible term values for that field.  If
> the field has millions of possible terms, then the query object created
> at the Lucene level will quite literally have millions of terms in it.
> No matter how you approach a query with those characteristics, it's
> going to be slow, for both getting the terms list and executing the query.
>
> A full range query might be somewhat slow when there are many possible
> values, but it's a lot faster than a wildcard in those cases.
>
> If the field is only used by a handful of documents and has very few
> possible values, then a wildcard might be faster than a range query ...
> but this is not common, so the recommended way to do this is with a
> range query.
>
> Thanks,
> Shawn
>
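
For reference, the two query forms discussed above can be generated like
this. It is only a sketch: the helper names and the example field "title"
are hypothetical, and the code just builds the query strings; it says
nothing about how a given Solr version executes them.

```python
from urllib.parse import urlencode


def missing_field_filter(field, use_range=True):
    """Return a filter matching documents with no value for 'field'.

    use_range=True gives the range form recommended in the thread;
    use_range=False gives the wildcard form that was asked about.
    """
    if use_range:
        return f"-{field}:[* TO *]"
    return f"-{field}:*"


def select_params(field, use_range=True):
    """Build URL-encoded parameters: match everything, then filter."""
    return urlencode({"q": "*:*", "fq": missing_field_filter(field, use_range)})


# Example with a hypothetical field name:
range_filter = missing_field_filter("title")
wildcard_filter = missing_field_filter("title", use_range=False)
```

Defaulting to the range form follows the recommendation quoted above; the
wildcard form is kept only so the two can be compared on real data.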
