lucene-solr-user mailing list archives

From Hendrik Haddorp <hendrik.hadd...@gmx.net>
Subject Re: finds all documents without a value for field
Date Thu, 20 Jul 2017 21:27:09 GMT
If the range query is so much better, shouldn't the Solr query parser 
create a range query for a term query that contains only the wildcard? 
It already has a special code path for the *:* case.

On 20.07.2017 21:00, Shawn Heisey wrote:
> On 7/20/2017 7:20 AM, Hendrik Haddorp wrote:
>> the Solr 6.6 ref guide states that to "finds all documents without a
>> value for field" you can use:
>> -field:[* TO *]
>>
>> While this is true I'm wondering why it is recommended to use a range
>> query instead of simply:
>> -field:*
> Performance.
>
> A wildcard is expanded to all possible term values for that field.  If
> the field has millions of possible terms, then the query object created
> at the Lucene level will quite literally have millions of terms in it.
> No matter how you approach a query with those characteristics, it's
> going to be slow, for both getting the terms list and executing the query.
>
> A full range query might be somewhat slow when there are many possible
> values, but it's a lot faster than a wildcard in those cases.
>
> If the field is only used by a handful of documents and has very few
> possible values, then a wildcard might be faster than a range query ... but
> this is not common, so the recommended way to do this is with a range query.
>
> Thanks,
> Shawn
>
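
For reference, the two query forms discussed in this thread can be sent to
Solr as a negative filter query. The following is a minimal Python sketch
that only builds the request URLs; it does not contact a server. The base
URL, the collection name "mycollection", and the helper names are
hypothetical, and "field" stands in for the real field name as in the ref
guide example.

```python
from urllib.parse import urlencode


def missing_value_params(field, use_range=True):
    """Build Solr select parameters for documents with no value in `field`.

    use_range=True  -> -field:[* TO *]  (the form the ref guide recommends)
    use_range=False -> -field:*         (the wildcard form asked about)
    """
    clause = f"{field}:[* TO *]" if use_range else f"{field}:*"
    # q=*:* matches everything; the negative fq removes documents that
    # have any value in the field. rows=0 asks only for the hit count.
    return {"q": "*:*", "fq": f"-{clause}", "rows": 0}


def select_url(base_url, params):
    """Append URL-encoded parameters to the collection's /select handler."""
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and collection name, for illustration only.
    base = "http://localhost:8983/solr/mycollection"
    print(select_url(base, missing_value_params("field")))
    print(select_url(base, missing_value_params("field", use_range=False)))
```

Comparing the QTime reported for each URL against a real index is one way to
check whether the difference described above matters for a given field.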

