lucene-dev mailing list archives

From "David Smiley (JIRA)"
Subject [jira] [Commented] (SOLR-12366) Avoid SlowAtomicReader.getLiveDocs -- it's slow
Date Wed, 16 May 2018 21:53:00 GMT


David Smiley commented on SOLR-12366:

* Adds a new {{SolrIndexSearcher.getLiveDocsBits()}} method that works like {{LeafReader.getLiveDocs}}
does.  I don't actually like the name of this method; IMO it ought to be simply {{getLiveDocs}},
but that conflicts with an existing method that I think ought to be named something like {{getLiveDocSet}}.
Since these are internal methods, I think we could just rename it, but I'm okay with renaming only in master.
* Affects SimpleFacets.getFacetTermEnumCounts (classic faceting), FacetFieldProcessorByEnumTermsStream
(JSON facets), UnInvertedField, GraphTermsQParser, JoinQParser, and SolrIndexSearcher.getFirstMatch.
* In GraphTermsQParser I further noticed that the non-SolrIndexSearcher fallback logic was broken,
as it didn't check for a null liveDocs.  Will we ever even get to this code?  Anyway, I
decided to replace these many lines with something simpler.
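
For context on the null check in the last bullet: in Lucene, {{LeafReader.getLiveDocs}} returns null when a segment has no deletions, so callers have to treat null as "every doc is live". A minimal sketch of that check follows; the {{Bits}} interface here is a hypothetical stand-in for {{org.apache.lucene.util.Bits}} so the snippet compiles on its own, and the class and method names are illustrative, not taken from the patch.

```java
/** Hypothetical stand-in for org.apache.lucene.util.Bits, for illustration only. */
interface Bits {
  boolean get(int index);
  int length();
}

final class LiveDocsCheck {
  private LiveDocsCheck() {}

  /** A null liveDocs means the segment has no deletions, so every doc is live. */
  static boolean isLive(Bits liveDocs, int docId) {
    return liveDocs == null || liveDocs.get(docId);
  }

  /** Counts live docs in [0, maxDoc), tolerating a null liveDocs. */
  static int countLive(Bits liveDocs, int maxDoc) {
    int count = 0;
    for (int doc = 0; doc < maxDoc; doc++) {
      if (isLive(liveDocs, doc)) {
        count++;
      }
    }
    return count;
  }
}
```

Code that calls {{liveDocs.get(doc)}} without the null guard would throw a NullPointerException on any index that has no deleted documents.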

IMO some callers of {{SolrIndexSearcher.getSlowAtomicReader}} should change to use {{MultiFields}}
to avoid the temptation to have a LeafReader that has many slow methods.  I made this change
in SimpleFacets.getFacetTermEnumCounts.  This could be a follow-up issue.

IMO {{SolrIndexSearcher.getFirstMatch}} should be removed in favor of {{lookupId}} so there's
less code to maintain.  Admittedly the latter is more verbose, but we could add a utility
method for callers who don't care about the segment ordinal and only want the global ID.
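
A rough sketch of what such a utility could look like. It assumes that {{lookupId}} returns a long with the segment ordinal in the high 32 bits and the segment-local doc ID in the low 32 bits, or -1 when nothing matches; that assumption should be checked against the method's javadoc. The class name, method names, and the {{docBases}} array (standing in for each leaf's docBase) are hypothetical.

```java
final class LookupIdUtil {
  private LookupIdUtil() {}

  /**
   * Converts a packed (segment ordinal, local doc ID) value into a global doc ID.
   * Returns -1 if the packed value is negative, i.e. no match was found.
   */
  static int toGlobalDocId(long packed, int[] docBases) {
    if (packed < 0) {
      return -1;
    }
    int segOrd = (int) (packed >>> 32); // high 32 bits: segment ordinal
    int localId = (int) packed;         // low 32 bits: doc ID within the segment
    return docBases[segOrd] + localId;
  }

  /** Packs a pair the same way; used only to build inputs for the example. */
  static long pack(int segOrd, int localId) {
    return ((long) segOrd << 32) | (localId & 0xFFFFFFFFL);
  }
}
```

With doc bases of 0, 100 and 250, a packed value for segment 1, local doc 7 maps to global doc 107.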

[] could you please review?  This touches stuff you have been involved


> Avoid SlowAtomicReader.getLiveDocs -- it's slow
> -----------------------------------------------
>                 Key: SOLR-12366
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: search
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: SOLR-12366.patch, SOLR-12366.patch
> SlowAtomicReader is of course slow, and its getLiveDocs (based on MultiBits) is slow
as it uses a binary search for each lookup.  There are various places in Solr that use SolrIndexSearcher.getSlowAtomicReader
and then get the liveDocs.  Most of these places ought to work with SolrIndexSearcher's getLiveDocs
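
To make the cost described above concrete, here is a simplified model of a composite live-docs view next to per-segment access. It is not the actual MultiBits code: {{LiveBits}}, the class name, and the method names are hypothetical, and it assumes every segment is non-empty so the {{starts}} array is strictly increasing.

```java
import java.util.Arrays;

final class CompositeLiveDocsSketch {
  /** Hypothetical stand-in for Lucene's Bits, for illustration only. */
  interface LiveBits {
    boolean get(int index);
  }

  private final LiveBits[] subs; // per-segment live docs; a null entry means no deletions
  private final int[] starts;    // starts[i] is the doc base of segment i, ascending

  CompositeLiveDocsSketch(LiveBits[] subs, int[] starts) {
    this.subs = subs;
    this.starts = starts;
  }

  /** Composite lookup: pays for a binary search over the segment starts on every call. */
  boolean getSlow(int globalDoc) {
    int idx = Arrays.binarySearch(starts, globalDoc);
    if (idx < 0) {
      idx = -idx - 2; // insertion point minus one is the containing segment
    }
    LiveBits sub = subs[idx];
    return sub == null || sub.get(globalDoc - starts[idx]);
  }

  /** Per-segment access: resolve each segment once, then index its bits directly. */
  int countLivePerSegment(int maxDoc) {
    int count = 0;
    for (int seg = 0; seg < subs.length; seg++) {
      int end = (seg + 1 < starts.length) ? starts[seg + 1] : maxDoc;
      LiveBits sub = subs[seg];
      for (int local = 0; local < end - starts[seg]; local++) {
        if (sub == null || sub.get(local)) {
          count++;
        }
      }
    }
    return count;
  }
}
```

Both paths give the same answers; the difference is that the composite path repeats the segment search for every document it is asked about.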

This message was sent by Atlassian JIRA
