lucene-solr-user mailing list archives

From Salman Ansari <salman.rah...@gmail.com>
Subject Re: Negating multiple array fields
Date Wed, 17 Feb 2016 07:34:14 GMT
Thanks, Shawn, for explaining in detail.
Regarding the performance issue you mentioned, there are two points:

1) "The [* TO *] syntax is an all-inclusive range query, which will usually be
much faster than a wildcard query."

I will take your statement as given and leave room for others to
comment on the details behind this.

2) "Behind the scenes, Solr will interpret this as "all possible values for
field" -- which sounds like it would be exactly what you're looking for,
except that if there are ten million possible values in the field you're
searching, the constructed Lucene query will quite literally include all
ten million values."

Does that mean that the [* TO *] syntax does not return all results?

Regards,

Salman
On Feb 17, 2016 6:29 AM, "Binoy Dalal" <binoydalal93@gmail.com> wrote:

> Hi Shawn,
> Please correct me if I'm wrong here, but don't the all-inclusive range
> query [* TO *] and a wildcard-only query like the one above essentially do
> the same thing from a black-box perspective?
> In such a case, wouldn't it be better to default a wildcard-only query to
> an all-inclusive range query?
>
> On Wed, 17 Feb 2016, 06:47 Shawn Heisey <apache@elyograg.org> wrote:
>
> > On 2/15/2016 9:22 AM, Jack Krupansky wrote:
> > > I should also have noted that your full query:
> > >
> > > (-persons:*)AND(-places:*)AND(-orgs:*)
> > >
> > > can be written as:
> > >
> > > -persons:* -places:* -orgs:*
> > >
> > > Which may work as is, or can also be written as:
> > >
> > > *:* -persons:* -places:* -orgs:*
> >
> > Salman,
> >
> > One fact of Lucene operation is that purely negative queries do not
> > work.  A negative query clause is like a subtraction.  If you make a
> > query that only says "subtract these values", then you aren't going to
> > get anything, because you did not start with anything.
> >
> > Adding the "*:*" clause at the beginning of the query says "start with
> > everything."
> >
> > You might ask why a query of -field:value works, when I just said that
> > it *won't* work.  This is because Solr has detected the problem and
> > fixed it.  When the query is very simple (a single negated clause), Solr
> > is able to detect the unworkable situation and implicitly add the "*:*"
> > starting point, producing the expected results.  With more complex
> > queries, like the one you are trying, this detection fails, and the
> > query is executed as-is.
> >
> > Jack is an awesome member of this community.  I do not want to disparage
> > him at all when I tell you that the rewritten query he provided will
> > work, but is not optimal.  It can be optimized as the following:
> >
> > *:* -persons:[* TO *] -places:[* TO *] -orgs:[* TO *]
> >
> > A query clause of the format "field:*" is a wildcard query.  Behind the
> > scenes, Solr will interpret this as "all possible values for field" --
> > which sounds like it would be exactly what you're looking for, except
> > that if there are ten million possible values in the field you're
> > searching, the constructed Lucene query will quite literally include all
> > ten million values.  Wildcard queries tend to use a lot of memory and
> > run slowly.
> >
> > The [* TO *] syntax is an all-inclusive range query, which will usually
> > be much faster than a wildcard query.
> >
> > Thanks,
> > Shawn
> >
> > --
> Regards,
> Binoy Dalal
>
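
For anyone assembling this query in client code, the following is a minimal
Python sketch of the pattern described in the quoted message: start with the
match-all clause "*:*" and subtract each field with an all-inclusive range
clause. The function name build_all_missing_query is hypothetical, and the
sketch only builds the query string; it does not contact Solr.

```python
def build_all_missing_query(fields):
    """Return a Solr query string for documents with no value in any of `fields`."""
    # Start from every document; a purely negative query has nothing to subtract from.
    clauses = ["*:*"]
    # Subtract documents that have any value in each field, using the
    # all-inclusive range form (field:[* TO *]) rather than the wildcard form (field:*).
    clauses.extend(f"-{field}:[* TO *]" for field in fields)
    return " ".join(clauses)


query = build_all_missing_query(["persons", "places", "orgs"])
print(query)
```

The resulting string would be passed as the q parameter of a select request,
URL-encoded by whatever HTTP client is in use.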
