lucene-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-11769) Sorting performance degrades when useFilterForSortedQuery is enabled and there is no filter query specified
Date Wed, 28 Feb 2018 17:39:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380719#comment-16380719
] 

ASF subversion and git services commented on SOLR-11769:
--------------------------------------------------------

Commit ef989124f345af46a905d1196bc589ef37b221c9 in lucene-solr's branch refs/heads/master
from [~dsmiley]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=ef98912 ]

SOLR-11769: optimize useFilterForSortedQuery=true when no filter queries


> Sorting performance degrades when useFilterForSortedQuery is enabled and there is no filter query specified
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11769
>                 URL: https://issues.apache.org/jira/browse/SOLR-11769
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>    Affects Versions: 4.10.4
>         Environment: OS: macOS Sierra (version 10.12.4)
> Memory: 16GB
> CPU: 2.9 GHz Intel Core i7
> Java Version: 1.8
>            Reporter: Betim Deva
>            Assignee: David Smiley
>            Priority: Major
>              Labels: performance
>         Attachments: SOLR-11769_Optimize_MatchAllDocsQuery_more.patch
>
>
> Sorting performance degrades significantly when {{useFilterForSortedQuery}} is enabled and no filter query is specified.
> *Steps to Reproduce:*
> 1. Set {{useFilterForSortedQuery=true}} in {{solrconfig.xml}}
> 2. Run a query that matches and returns a single document, with a sort added
> - Example: {{/select?q=foo:123&sort=bar+desc}}
> With a large index (> 10 million documents), this yields a slow response (a few hundred milliseconds on average) even when the result set consists of a single document.
> *Observation 1:*
> - Disabling {{useFilterForSortedQuery}} improves the performance to < 1ms
> *Observation 2:*
> - Removing the {{sort}} improves the performance to < 1ms
> *Observation 3:*
> - Keeping the {{sort}} and adding any filter query (such as {{fq=\*:\*}}) improves the performance to < 1ms
> After profiling [SolrIndexSearcher.java|https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;a=blob;f=solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java;h=9ee5199bdf7511c70f2cc616c123292c97d36b5b;hb=HEAD#l1400], I found that the bottleneck is
> {{DocSet bigFilt = getDocSet(cmd.getFilterList());}}
> when {{cmd.getFilterList()}} is {{null}}. This makes {{getDocSet()}} collect document IDs every single time it is called, without any caching.
> {code:java}
> 1394     if (useFilterCache) {
> 1395       // now actually use the filter cache.
> 1396       // for large filters that match few documents, this may be
> 1397       // slower than simply re-executing the query.
> 1398       if (out.docSet == null) {
> 1399         out.docSet = getDocSet(cmd.getQuery(), cmd.getFilter());
> 1400         DocSet bigFilt = getDocSet(cmd.getFilterList());
> 1401         if (bigFilt != null) out.docSet = out.docSet.intersection(bigFilt);
> 1402       }
> 1403       // todo: there could be a sortDocSet that could take a list of
> 1404       // the filters instead of anding them first...
> 1405       // perhaps there should be a multi-docset-iterator
> 1406       sortDocSet(qr, cmd);
> 1407     }
> {code}
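
To illustrate the behaviour described above, here is a minimal, self-contained sketch in plain Java. It is a toy model, not Solr code: the class NullFilterListSketch, its methods, and the use of java.util.BitSet in place of DocSet are all assumptions made for illustration. The guard shown in intersectGuarded is just one possible way to avoid the uncached work when there are no filter queries; it is not necessarily what the commit referenced in this message does.

```java
import java.util.BitSet;
import java.util.List;

// Toy model only: none of these names are Solr APIs.
class NullFilterListSketch {
    static final int MAX_DOC = 1_000; // size of the pretend index
    static int fullScans = 0;         // how often every document was visited

    // Hypothetical stand-in for getDocSet(filterList). With no filters it
    // rebuilds a set of all documents on every call, with no caching.
    static BitSet getDocSet(List<BitSet> filters) {
        if (filters == null || filters.isEmpty()) {
            fullScans++;
            BitSet all = new BitSet(MAX_DOC);
            all.set(0, MAX_DOC);
            return all;
        }
        // With filters present, intersect them (a real system could cache these).
        BitSet result = (BitSet) filters.get(0).clone();
        for (BitSet f : filters.subList(1, filters.size())) {
            result.and(f);
        }
        return result;
    }

    // Mirrors the shape of the quoted snippet: always asks for the filter set.
    static BitSet intersectUnguarded(BitSet queryDocs, List<BitSet> filters) {
        BitSet out = (BitSet) queryDocs.clone();
        BitSet bigFilt = getDocSet(filters);
        if (bigFilt != null) {
            out.and(bigFilt);
        }
        return out;
    }

    // Assumed guard: skip the filter set entirely when there are no filters.
    static BitSet intersectGuarded(BitSet queryDocs, List<BitSet> filters) {
        BitSet out = (BitSet) queryDocs.clone();
        if (filters != null && !filters.isEmpty()) {
            out.and(getDocSet(filters));
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet queryDocs = new BitSet(MAX_DOC);
        queryDocs.set(123); // the single matching document

        BitSet a = intersectUnguarded(queryDocs, null);
        int scansAfterUnguarded = fullScans;
        BitSet b = intersectGuarded(queryDocs, null);

        System.out.println("same result: " + a.equals(b));
        System.out.println("full scans (unguarded): " + scansAfterUnguarded);
        System.out.println("full scans added by guarded: " + (fullScans - scansAfterUnguarded));
    }
}
```

Both paths return the same single-document result. Only the unguarded path pays for a pass over the whole pretend index, which is the cost the issue attributes to calling getDocSet() with a null filter list.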



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
