lucene-solr-user mailing list archives

From Dennis Gearon <gear...@sbcglobal.net>
Subject Re: Searching for negative numbers very slow
Date Thu, 17 Feb 2011 07:55:24 GMT
Is it my imagination or has this exact email been on the list already?

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.




________________________________
From: Chris Hostetter <hossman_lucene@fucit.org>
To: solr-user@lucene.apache.org
Cc: yonik@lucidimagination.com
Sent: Wed, February 16, 2011 6:20:28 PM
Subject: Re: Searching for negative numbers very slow


: This was my first thought but -1 is relatively common but we have other 
: numbers just as common. 

i assume that when you say that you mean "...we have other numbers 
(that are not negative) just as common, (but searching for them is much 
faster)" ?

I don't have any insight into why your negative numbers are slower, but 
FWIW...

: Interestingly enough
: 
: fq=uid:-1
: fq=foo:bar
: fq=alpha:omega
: 
: is much (4x) slower than
: 
: q="uid:-1 AND foo:bar AND alpha:omega"

...this is (in and of itself) not that surprising for any three arbitrary 
disjoint queries.  when a BooleanQuery is a full conjunction like this (all 
clauses required) it can efficiently skip scoring a lot of documents by 
looping over the clauses, asking each one for the "next" doc they 
match, and then leap frogging the other clauses to that doc.  in the case 
of the three "fq" params, each query is executed in isolation, and *all* of 
the matches of each are accounted for.
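
to make the leap frogging concrete, here is a minimal sketch in Python. it is not Lucene's actual code: the function name leapfrog_intersect and the plain sorted lists of doc ids are assumptions made purely for illustration. each clause is only ever asked for its first match at or after the current candidate doc, so docs that cannot match every clause are skipped without being visited.

```python
from bisect import bisect_left


def leapfrog_intersect(postings):
    """Intersect sorted, duplicate-free lists of integer doc ids.

    Hypothetical helper for illustration only; not a Lucene API.
    """
    if not postings or any(not p for p in postings):
        return []
    positions = [0] * len(postings)  # current cursor in each clause
    result = []
    candidate = postings[0][0]  # doc id every clause must agree on
    agreed = 0                  # clauses in a row that matched candidate
    i = 0
    while True:
        plist = postings[i]
        # ask clause i for its first doc >= candidate, skipping the rest
        pos = bisect_left(plist, candidate, positions[i])
        if pos == len(plist):
            return result  # one clause is exhausted, no more matches
        positions[i] = pos
        if plist[pos] == candidate:
            agreed += 1
            if agreed == len(postings):
                result.append(candidate)
                candidate += 1  # move on past the doc just matched
                agreed = 0
        else:
            # this clause jumped ahead, so its doc is the new candidate
            candidate = plist[pos]
            agreed = 1
        i = (i + 1) % len(postings)


matches = leapfrog_intersect([[1, 3, 5, 7, 9], [3, 4, 5, 9], [5, 9, 10]])
```

running each clause in isolation, as the three "fq" params do, would instead walk all twelve postings before intersecting them.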

the speed of using distinct "fq" params in situations like this comes from 
the reuse after they are in the filterCache -- you can change fq=foo:bar 
to fq=foo:baz on the next query, and still reuse 2/3 of the work that was 
done on the first query. likewise if the next query is 
fq=uid:-1&fq=foo:bar&fq=alpha:omegabeta then 2/3 of the work is already 
done again, and if a following query is 
fq=uid:-1&fq=foo:baz&fq=alpha:omegabeta then all of the work is already 
done and cached even though that particular request has never been seen by 
solr.
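
the reuse pattern above can be sketched with a toy cache in Python. this is only an illustration, not how solr's filterCache is implemented: the class FilterCacheSketch, the toy index contents, and the miss counter are all made up for the example. each "fq" string is computed once into a doc set and reused by any later request that contains it.

```python
class FilterCacheSketch:
    """Toy per-fq doc set cache; hypothetical, not solr's filterCache."""

    def __init__(self, index):
        self.index = index  # maps "field:value" -> set of doc ids
        self.cache = {}     # maps fq string -> cached frozenset of doc ids
        self.misses = 0     # how many fq doc sets had to be computed

    def doc_set(self, fq):
        if fq not in self.cache:
            self.misses += 1  # the expensive full evaluation happens here
            self.cache[fq] = frozenset(self.index.get(fq, ()))
        return self.cache[fq]

    def search(self, fqs):
        sets = [self.doc_set(fq) for fq in fqs]
        if not sets:
            return frozenset()
        return frozenset.intersection(*sets)


# made-up index contents, purely for illustration
index = {
    "uid:-1": {1, 2, 3, 4},
    "foo:bar": {2, 3, 5},
    "foo:baz": {1, 4, 6},
    "alpha:omega": {2, 4, 7},
    "alpha:omegabeta": {1, 3, 4},
}
cache = FilterCacheSketch(index)
requests = [
    ["uid:-1", "foo:bar", "alpha:omega"],
    ["uid:-1", "foo:baz", "alpha:omega"],
    ["uid:-1", "foo:bar", "alpha:omegabeta"],
    ["uid:-1", "foo:baz", "alpha:omegabeta"],
]
misses_per_request = []
for fqs in requests:
    before = cache.misses
    cache.search(fqs)
    misses_per_request.append(cache.misses - before)
```

in this sketch the first request computes three doc sets, the second and third compute one each, and the fourth computes none even though that exact combination was never requested before.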


-Hoss
