lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Commented] (SOLR-11865) Refactor QueryElevationComponent to prepare query subset matching
Date Fri, 04 May 2018 22:05:00 GMT


David Smiley commented on SOLR-11865:

The patch is still in progress, but I want to mention a few things.
 * New name {{useConfiguredOrderForElevations}}.  Documentation language: "When multiple
docs are elevated, should their relative order be the order in the configuration file or should they
be subject to whatever the sort criteria is? True by default."
 * Found a way to entirely skip using ElevationComparatorSource if the Elevation obj has
no elevations (maybe just has exclusions)
 * I was looking in detail at ElevationComparatorSource, which led me to some observations that
I'd like your input on:
 ** BytesRef[] termValues is the "value half of the map".  It's the BytesRef version of the
ID values aligned with ordSet (doc ID) slots.  But the reader of these (docVal()) has to
do additional work to look it up in elevation.priorities to get an int.  I think this could
be replaced with an int[] populated with the pertinent int priorities when doSetNextReader
is called (which is where ordSet & termValues are init'ed right now).  This int[] would
be named simply priorities.
 ** Elevation.elevatedIds could be a Map<String,Integer> that maps directly to the priority
from the uniqueKey val (thus removing the need for a separate "priorities" map), and then
in doSetNextReader we can iterate on the Map.Entry and needn't do another lookup.
 ** I wonder if the String IDs in Elevation, both elevated and excluded, ought to be BytesRefs
to clarify that they are raw indexed form IDs?  (consider when uniqueKey is a long)  The
current String form is suggestive that they are the surface form IDs, yet they aren't since
they've already been mapped with FieldType.readableToIndexed.  Or alternatively keep the
surface form IDs and translate them at a later time.  I think we might as well do them eagerly
as it saves work during search, even if it's easy work, and again it clarifies the type.
 *** FieldType.readableToIndexed(String) ought to be deprecated in lieu of readableToIndexed(CharSequence,
 ** I guess it's debatable where to actually apply the key String => indexed form (String
or BytesRef)... we're doing it in Elevation's constructor with a passed in UnaryOperator thingy
but it could just as easily be done very late in, say, ElevationComparatorSource.doSetNextReader,
or perhaps very early right after we read it from the XML. I suppose it's fine as-is.
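
To make the int[] priorities and Map<String,Integer> ideas above concrete, here is a rough sketch in plain Java. It is not the patch and does not use the Solr/Lucene classes: SegmentPriorities, segmentIdToDoc and the sorted docIds array (standing in for ordSet) are hypothetical names for illustration, and "higher priority value means listed earlier in the config" is an assumed convention.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch only: plain-Java stand-ins, not the Solr/Lucene classes.
class ElevationPrioritiesSketch {

    /** Hypothetical per-segment state, as it might be built in doSetNextReader. */
    static final class SegmentPriorities {
        private final int[] docIds;     // sorted segment-local doc IDs of elevated docs
        private final int[] priorities; // priorities[i] is the priority of docIds[i]

        /**
         * @param elevatedIds    uniqueKey value -> priority (the proposed single map)
         * @param segmentIdToDoc hypothetical stand-in for the segment's ID lookup
         */
        SegmentPriorities(Map<String, Integer> elevatedIds, Map<String, Integer> segmentIdToDoc) {
            // One pass over the entries: no second lookup is needed to find the priority.
            TreeMap<Integer, Integer> byDoc = new TreeMap<>();
            for (Map.Entry<String, Integer> e : elevatedIds.entrySet()) {
                Integer doc = segmentIdToDoc.get(e.getKey());
                if (doc != null) { // the elevated doc lives in this segment
                    byDoc.put(doc, e.getValue());
                }
            }
            docIds = new int[byDoc.size()];
            priorities = new int[byDoc.size()];
            int slot = 0;
            for (Map.Entry<Integer, Integer> e : byDoc.entrySet()) {
                docIds[slot] = e.getKey();
                priorities[slot] = e.getValue();
                slot++;
            }
        }

        /** Returns the priority of an elevated doc, or 0 when the doc is not elevated. */
        int docVal(int doc) {
            int slot = Arrays.binarySearch(docIds, doc);
            return slot >= 0 ? priorities[slot] : 0;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> elevatedIds = new LinkedHashMap<>();
        elevatedIds.put("doc-b", 2); // listed first in the config (assumed: higher value)
        elevatedIds.put("doc-a", 1);

        Map<String, Integer> segmentIdToDoc = new LinkedHashMap<>();
        segmentIdToDoc.put("doc-a", 7);
        segmentIdToDoc.put("doc-b", 3);
        segmentIdToDoc.put("doc-c", 5);

        SegmentPriorities segment = new SegmentPriorities(elevatedIds, segmentIdToDoc);
        System.out.println(segment.docVal(3));
        System.out.println(segment.docVal(7));
        System.out.println(segment.docVal(5));
    }
}
```

The point is that docVal() becomes a slot lookup plus an array read, with the ID-to-priority resolution done once per segment instead of once per collected doc.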

> Refactor QueryElevationComponent to prepare query subset matching
> -----------------------------------------------------------------
>                 Key: SOLR-11865
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SearchComponents - other
>    Affects Versions: master (8.0)
>            Reporter: Bruno Roustant
>            Priority: Minor
>              Labels: QueryComponent
>             Fix For: master (8.0)
>         Attachments: 0001-Refactor-QueryElevationComponent-to-introduce-Elevat.patch,
0002-Refactor-QueryElevationComponent-after-review.patch, 0003-Remove-exception-handlers-and-refactor-getBoostDocs.patch,
> The goal is to prepare a second improvement to support query terms subset matching or
query elevation rules.
> Before that, we need to refactor the QueryElevationComponent. We make it extendible.
We introduce the ElevationProvider interface which will be implemented later in a second patch
to support subset matching. The current full-query match policy becomes a default simple MapElevationProvider.
> - Add overridable methods to handle exceptions during the component initialization.
> - Add overridable methods to provide the default values for config properties.
> - No functional change beyond refactoring.
> - Adapt unit test.
