lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: Function query matching
Date Sat, 07 Dec 2013 06:45:10 GMT

I had to do a double take when i read this sentence...

: Even with any improvements to 'scale', all function queries will add a
: linear increase to the Qtime as index size increases, since they match all
: docs.

...because that smelled like either a bug in your methodology, or a bug in 
Solr.  To convince myself there wasn't a bug in Solr, i wrote a test case 
(i'll commit tomorrow, bunch of churn in svn right now making "ant 
precommit" unhappy) to prove that when wrapping boost functions around 
queries, Solr will only evaluate the functions for docs matching the 
wrapped query -- so there is no linear increase as the index size 
increases, just the (necessary) linear increase as the number of 
*matching* docs grows. (for most functions anyway -- as mentioned "scale" 
is special).

BUT! ... then i remembered how this thread started, and your goal of 
"scaling" the scores from a wrapped query.

I want to be clear for 99% of the people reading this: if you find 
yourself writing a query structure like this...

  q={!func}..functions involving wrapping $qq ...
 qq={!edismax ...lots of stuff but still only matching subset of the index...}
 fq={!query v=$qq}

...try to restructure the math you want to do into the form of a 
multiplier...

  q={!boost b=$b v=$qq}
  b=...functions producing a score multiplier...
 qq={!edismax ...lots of stuff but still only matching subset of the index...}

Because the latter case is much more efficient and Solr will only compute 
the function values for the docs it needs to (that match the wrapped $qq 
query).
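
For anyone who wants to try both request shapes side by side, here is a 
rough Python sketch that just builds the two parameter sets. The field 
names (title, last_modified), the query text, and the recip(...) multiplier 
are made-up placeholders, not anything from this thread, and the helper 
names are hypothetical. It assumes the function in the first shape is a 
simple product of query($qq) and the multiplier, which is the case that 
translates most directly to {!boost}.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: swap in your own fields, query text, and function.
BASE_QUERY = "{!edismax qf=title}function query"
MULTIPLIER = "recip(ms(NOW,last_modified),3.16e-11,1,1)"


def func_with_filter_params(base_query, multiplier):
    """First shape: a function query as q, restricted by a filter on $qq."""
    return {
        "q": "{!func}product(query($qq)," + multiplier + ")",
        "qq": base_query,
        "fq": "{!query v=$qq}",
    }


def boost_params(base_query, multiplier):
    """Second shape: $qq wrapped in {!boost}, with the multiplier in $b."""
    return {
        "q": "{!boost b=$b v=$qq}",
        "b": multiplier,
        "qq": base_query,
    }


def to_query_string(params):
    """URL-encode the parameters for a /select request."""
    return urlencode(params)


if __name__ == "__main__":
    print(to_query_string(func_with_filter_params(BASE_QUERY, MULTIPLIER)))
    print(to_query_string(boost_params(BASE_QUERY, MULTIPLIER)))
```

Comparing QTime for the two query strings against the same index is a 
quick way to check whether the restructuring helps in your setup.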

But for your specific goal Peter: Yes, if the whole point of a function 
you have is to generate a "scaled" score of your base $qq, then the 
function (wrapping the scale(), wrapping the query()) is going to have to 
be evaluated for every doc -- that will definitely be linear based on the 
size of the index.
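
To make the cost difference concrete, here is a toy Python model (not 
Solr code, and not how Solr is implemented internally) that just counts 
how many per-doc evaluations each approach needs. It assumes that a 
min/max style scale has to look at every doc in the index to find the 
range, and that non-matching docs contribute a value of 0.0. The function 
names and the sample numbers are invented for illustration.

```python
def boosted_scores(base_scores, multiplier):
    """Multiply each matching doc's score; only matching docs are visited.

    base_scores maps doc id -> score for the docs matching the wrapped query.
    """
    evaluations = 0
    out = {}
    for doc_id, score in base_scores.items():
        out[doc_id] = score * multiplier(doc_id)
        evaluations += 1
    return out, evaluations


def scaled_scores(base_scores, index_size, lo=0.0, hi=1.0):
    """Scale scores into [lo, hi] using the min and max over the whole index.

    Assumption: non-matching docs count as 0.0 when finding the range.
    """
    evaluations = 0
    values = []
    for doc_id in range(index_size):  # every doc, matching or not
        values.append(base_scores.get(doc_id, 0.0))
        evaluations += 1
    vmin, vmax = min(values), max(values)
    span = (vmax - vmin) or 1.0  # avoid dividing by zero when all values are equal
    out = {d: lo + (s - vmin) * (hi - lo) / span for d, s in base_scores.items()}
    return out, evaluations


if __name__ == "__main__":
    matches = {3: 2.0, 10: 4.0}  # two matching docs in a 1000-doc index
    _, boost_evals = boosted_scores(matches, lambda doc_id: 2.0)
    _, scale_evals = scaled_scores(matches, index_size=1000)
    print(boost_evals, scale_evals)
```

In this model the boost work grows with the number of matching docs, 
while the scale work grows with index_size no matter how few docs match.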



-Hoss
http://www.lucidworks.com/
