lucene-solr-user mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: minpercentage vs. mincount
Date Wed, 02 Jun 2010 18:27:58 GMT

: Obviously I could implement this in userland (just like mincount, for 
: that matter), but I wonder if anyone else sees use in being able to 
: define that a facet must match a minimum percentage of all documents in 
: the result set, rather than a hardcoded value? The idea being that while 
: I might not be interested in a facet that only covers 3 documents in the 
: result set if there are, let's say, 1000 documents in the result set, the 
: situation would be a lot different if I only have 10 documents in the 
: result set.

Typically people deal with this type of situation by using facet.limit to 
ensure they only get the "top" N constraints back -- and they set 
facet.mincount to something low just to save bandwidth when the 
counts are "too low to care about no matter how few results there are" 
(i.e. a count of 0).

: I did not yet see such a feature, would it make sense to file it as a 
: feature request or should stuff like this rather be done in userland (I 
: have noticed for example that Solr prefers to have users normalize the 
: scores in userland too)?

Feel free to file a feature request -- truthfully this is kind of a hard 
problem to solve in userland: you'd either have to do two queries (the 
first to get the numFound, the second with facet.mincount set as an 
integer relative to numFound) or you'd have to do a single query but ask 
for a "big" value for facet.limit and hope that you get enough to prune 
your list.
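
A rough sketch of both userland options, in Python. The helper names 
(mincount_for, facet_params, prune_facets) are made up for illustration 
and are not part of Solr; the only Solr pieces assumed are the 
facet.field and facet.mincount request parameters and the default flat 
[term, count, term, count, ...] layout of facet counts in the JSON 
response. Sending the requests is left out.

```python
import math


def mincount_for(num_found, min_percentage):
    """Smallest integer count that covers min_percentage of num_found (at least 1)."""
    return max(1, math.ceil(num_found * min_percentage / 100.0))


def facet_params(query, field, num_found, min_percentage):
    """Two-query option: parameters for the second request, built from the
    numFound returned by the first request."""
    return {
        "q": query,
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.mincount": mincount_for(num_found, min_percentage),
    }


def prune_facets(flat_counts, num_found, min_percentage):
    """Single-query option: drop constraints below the threshold client-side.

    flat_counts is assumed to be the flat [term, count, term, count, ...]
    list from a response that was requested with a "big" facet.limit.
    """
    threshold = mincount_for(num_found, min_percentage)
    pairs = zip(flat_counts[0::2], flat_counts[1::2])
    return [(term, count) for term, count in pairs if count >= threshold]
```

With a 5% threshold this gives a mincount of 50 for 1000 results and 1 
for 10 results, which is the behaviour asked for above. The single-query 
version only works if facet.limit was large enough to return every 
constraint above the threshold.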

Off the top of my head though: I can't really think of a sane way to do 
this on the server side that would work with distributed search either -- 
but go ahead and open an issue and let's see what the folks who are really 
smart about the distributed searching stuff have to say.


-Hoss

