lucene-issues mailing list archives

From "Jason Gerlowski (Jira)" <>
Subject [jira] [Commented] (LUCENE-9025) Add more efficient lookupTerm() overload to SortedSetDocValues
Date Mon, 04 Nov 2019 13:24:00 GMT


Jason Gerlowski commented on LUCENE-9025:

Ok, fair enough.  I'll close this out then and wait for work to progress on LUCENE-8836.

> Add more efficient lookupTerm() overload to SortedSetDocValues
> --------------------------------------------------------------
>                 Key: LUCENE-9025
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: master (9.0)
>            Reporter: Jason Gerlowski
>            Priority: Minor
>         Attachments: LUCENE-9025.patch
> {{SortedSetDocValues.lookupTerm(BytesRef)}} performs a binary search over the entire docValues range to find the ordinal of the requested BytesRef.
> For an individual invocation, this is optimal: without other context, the binary search needs to cover the entire space.
> But there are some common uses of {{lookupTerm}} where this shouldn't be necessary. For example: making multiple {{lookupTerm}} calls to fetch the ordinals for each value in a sorted list of terms. {{lookupTerm}} will binary-search the whole space on each invocation, even though the caller knows that there's no point searching anything before the ordinal that came back from the previous {{lookupTerm}} call.
> I propose we add a {{SortedSetDocValues.lookupTerm}} overload which takes a lower bound to start the binary search at: {{public long lookupTerm(BytesRef key, long lowerSearchBound) throws IOException}}. This saves each binary search a few iterations in usage scenarios like the one described above, which can conceivably add up.
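
For illustration, the idea in the description can be sketched outside Lucene. This is a minimal sketch only: it uses a plain sorted String array as a stand-in for the term dictionary, and the class name, the lookupAll helper and the comparisons counter are all hypothetical, not part of the attached patch or of any Lucene API. It assumes a "not found" result is reported as -(insertionPoint) - 1, so the caller can derive the next lower bound from either outcome.

```java
// Hypothetical stand-in for a sorted term dictionary; NOT Lucene code.
final class BoundedLookupSketch {
    private final String[] sortedTerms; // must be sorted ascending
    long comparisons = 0;               // counts term comparisons, for illustration

    BoundedLookupSketch(String[] sortedTerms) {
        this.sortedTerms = sortedTerms;
    }

    // Unbounded lookup: searches the whole range.
    long lookupTerm(String key) {
        return lookupTerm(key, 0);
    }

    // Bounded lookup: only searches ordinals >= lowerSearchBound.
    // Returns the ordinal if found, otherwise -(insertionPoint) - 1.
    long lookupTerm(String key, long lowerSearchBound) {
        long low = lowerSearchBound;
        long high = sortedTerms.length - 1;
        while (low <= high) {
            long mid = (low + high) >>> 1;
            comparisons++;
            int cmp = sortedTerms[(int) mid].compareTo(key);
            if (cmp < 0) {
                low = mid + 1;
            } else if (cmp > 0) {
                high = mid - 1;
            } else {
                return mid; // key found
            }
        }
        return -(low + 1); // key not found
    }

    // Looks up each key of an ascending key list, reusing the previous
    // result as the lower bound for the next search.
    static long[] lookupAll(BoundedLookupSketch dv, String[] sortedKeys) {
        long[] ords = new long[sortedKeys.length];
        long lower = 0;
        for (int i = 0; i < sortedKeys.length; i++) {
            long ord = dv.lookupTerm(sortedKeys[i], lower);
            ords[i] = ord;
            // Found: restart at the same ordinal (tolerates duplicate keys).
            // Not found: restart at the insertion point.
            lower = ord >= 0 ? ord : -ord - 1;
        }
        return ords;
    }
}
```

The bound is only valid when the keys are looked up in ascending order; with unsorted keys a later key could sort before the bound and be reported as missing.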

This message was sent by Atlassian Jira
