lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (SOLR-9501) Collapse filter should sometimes be cacheable instead of never
Date Sun, 11 Sep 2016 20:33:20 GMT


David Smiley commented on SOLR-9501:

I don't think a marker interface is the right design.  Only a PostFilter can use the score
(right?), so perhaps a method should be added to PostFilter.  PostFilter has one method,
getFilterCollector(IndexSearcher), which returns a Collector.  Lucene's Collector.needsScores()
already exists, so perhaps we need nothing new after all?  Then again, constructing the collector
is not necessarily trivial, and it would be wasteful to build one only to call needsScores()
and throw the result away.
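
To make the "method instead of marker interface" idea concrete, here is a minimal sketch. It
does not use the real Solr or Lucene types: ScoreAwarePostFilter, CollapseFilter, the
groupHeadSelector field, and anyNeedsScores are all hypothetical names invented for
illustration, and the rule "needs scores only when selecting by score" is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins only; these are NOT the real Solr/Lucene classes.
final class PostFilterSketch {

    /** Simplified stand-in for a post filter; needsScores() is the hypothetical addition. */
    interface ScoreAwarePostFilter {
        /** True only if this filter actually reads document scores. */
        boolean needsScores();
    }

    /** Hypothetical collapse filter: assumed to need scores only when picking group heads by score. */
    static final class CollapseFilter implements ScoreAwarePostFilter {
        private final String groupHeadSelector; // "score", or a field name used for min/max/sort

        CollapseFilter(String groupHeadSelector) {
            this.groupHeadSelector = groupHeadSelector;
        }

        @Override
        public boolean needsScores() {
            return "score".equals(groupHeadSelector);
        }
    }

    /** Asks each filter instead of testing for a marker interface with instanceof. */
    static boolean anyNeedsScores(List<? extends ScoreAwarePostFilter> filters) {
        for (ScoreAwarePostFilter f : filters) {
            if (f.needsScores()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Collapsing on a field value: the cheaper unscored path would be eligible.
        System.out.println(anyNeedsScores(Arrays.asList(new CollapseFilter("price"))));
        // Collapsing by score: the scoring path is still required.
        System.out.println(anyNeedsScores(Arrays.asList(new CollapseFilter("score"))));
    }
}
```

With a per-instance answer like this, the decision can be made without constructing a
collector first, which is the cost the Collector.needsScores() route would incur.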

Perhaps the related SolrIndexSearcher logic to support this should actually be in processFilter()?
 getDocSet would then need to work with a Scorer somehow... (even if sometimes it's a dummy
one) so it can call setScorer on each segment when it collects.

Related to this, CollapsingPostFilter.getCache() is hard-coded to return false.  Perhaps a
non-cacheable query

Anyway, I'm punting on this for now.

> Collapse filter should sometimes be cacheable instead of never
> --------------------------------------------------------------
>                 Key: SOLR-9501
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: David Smiley
> When SolrIndexSearcher.getDocSet(List<Query> queries) is called, it first checks
> if any implement the marker interface ScoreFilter, and if so it calls out to getDocSetScore [1]
> instead of continuing.  CollapsingPostFilter is the only Query implementing ScoreFilter.
> There is a presumption here that any CollapsingPostFilter needs the score.  But this just
> isn't true; you can collapse with a min/max/sort on something that doesn't need the score.
> So there is a needless performance hit here.
>
> [1] I don't like that getDocSetScore presumes the first query in the list is the scoring
> one -- it's a poor API contract relationship; at a minimum the javadocs should be updated.
> This holds for getDocSet as well since it passes through. Perhaps getDocSet could be modified
> to take a nullable scoring Query first arg.
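
The API-contract point in footnote [1] can be illustrated with a small sketch. This is not
Solr code: describeDocSetPath is a hypothetical method, plain strings stand in for Query
objects, and the returned label only names which path would be taken rather than computing
a real DocSet. The only idea shown is that an explicit, nullable first argument replaces
the convention "the first query in the list is the scoring one".

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration; strings stand in for Lucene Query objects.
final class DocSetApiSketch {

    /**
     * @param scoringQuery the query that supplies scores, or null if no filter needs them
     * @param filters      the filter queries; none of them is implicitly "the scoring one"
     * @return a label naming the path that would be taken (not a real doc set)
     */
    static String describeDocSetPath(String scoringQuery, List<String> filters) {
        Objects.requireNonNull(filters, "filters");
        if (scoringQuery == null) {
            return "unscored: " + filters.size() + " filter(s)";
        }
        return "scored by [" + scoringQuery + "]: " + filters.size() + " filter(s)";
    }

    public static void main(String[] args) {
        List<String> filters = Arrays.asList("inStock:true", "{!collapse field=group}");
        System.out.println(describeDocSetPath(null, filters));
        System.out.println(describeDocSetPath("title:solr", filters));
    }
}
```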
