lucene-java-user mailing list archives

From Mikhail Khludnev <m...@apache.org>
Subject Re: block min-max values for Sort Field with Top-N query..
Date Tue, 02 Jul 2019 13:11:34 GMT
I'm not sure what the problem is, but make sure you are aware of the
childfield function:
https://lucene.apache.org/solr/guide/7_0/function-queries.html#childfield-field-function

On Tue, Jul 2, 2019 at 4:01 PM Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

> Our sort fields use DocValues.
>
> Let's say I collect the min-max ords of a sort field for each block of
> documents (128, 256, etc.) at index time via a Codec and store them as part
> of DocValues at the segment level.
>
> At query time, could we take advantage of these stats when a Top-N query
> with a sort field is requested?
>
> Typically, what I had in mind is a SortStats class with the following
> method
>
> int seek(int maxDocSeenTillNow, int minSortOrdSeenTillNow,
>          boolean sortDesc) {
>   // 1. Fetch the doc-ranges that have ords >= minSortOrdSeenTillNow
>   // 2. Return the least doc-range >= maxDocSeenTillNow (if sortDesc == true)
>   //    Return the least doc-range <= maxDocSeenTillNow (if sortDesc == false)
> }
>
> The Top-N Collector can keep track of the maxDocSeenTillNow and
> minSortOrdSeenTillNow variables at query time and then call
> SortStats.seek() to possibly skip blocks of documents that would
> otherwise be needlessly offered to and popped from the priority queue.
>
> I understand this simplistic logic depends on the sort-field data
> distribution and won't work for multi-sort-field queries, out-of-order
> scoring, etc.
>
> But in general, would this be a good idea to explore, or is it something
> that is best not attempted?
>
> Any help is much appreciated.
>
> --
> Ravi
>

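For illustration, the block-skipping idea behind SortStats.seek() in the quoted
message could be sketched roughly as below. This is plain Java, not Lucene API,
and rests on several assumptions: the names BlockSkipSketch, blockMax, seek and
topNDesc are hypothetical, ords live in an int[] rather than in DocValues, only
a single descending sort field is handled, and documents are collected in doc-id
order. Only the per-block maximum is needed for the descending case.

```java
import java.util.Arrays;
import java.util.PriorityQueue;

/** Hypothetical sketch: per-block max ords used to skip blocks in a top-N descending sort. */
final class BlockSkipSketch {

    /** Max ord of each block of blockSize consecutive docs (stands in for index-time stats). */
    static int[] blockMax(int[] ords, int blockSize) {
        int blocks = (ords.length + blockSize - 1) / blockSize;
        int[] max = new int[blocks];
        Arrays.fill(max, Integer.MIN_VALUE);
        for (int doc = 0; doc < ords.length; doc++) {
            int b = doc / blockSize;
            max[b] = Math.max(max[b], ords[doc]);
        }
        return max;
    }

    /** Returns the first doc >= doc whose block may hold an ord > bottomOrd, or numDocs if none. */
    static int seek(int[] blockMax, int blockSize, int numDocs, int doc, int bottomOrd) {
        while (doc < numDocs) {
            int b = doc / blockSize;
            if (blockMax[b] > bottomOrd) {
                return doc;               // this block may still contain a competitive doc
            }
            doc = (b + 1) * blockSize;    // nothing competitive here: jump to the next block
        }
        return numDocs;
    }

    /** Top-N ords in descending order; docsVisited[0] is incremented per doc actually examined. */
    static int[] topNDesc(int[] ords, int blockSize, int n, int[] docsVisited) {
        int[] max = blockMax(ords, blockSize);
        PriorityQueue<Integer> queue = new PriorityQueue<>(); // min-heap: head is the weakest hit
        int doc = 0;
        while (doc < ords.length) {
            if (queue.size() == n) {
                // Queue is full, so only ords strictly above the weakest hit are competitive.
                doc = seek(max, blockSize, ords.length, doc, queue.peek());
                if (doc >= ords.length) {
                    break;
                }
            }
            docsVisited[0]++;
            int ord = ords[doc];
            if (queue.size() < n) {
                queue.add(ord);
            } else if (ord > queue.peek()) {
                queue.poll();
                queue.add(ord);
            }
            doc++;
        }
        int[] result = new int[queue.size()];
        for (int i = result.length - 1; i >= 0; i--) {
            result[i] = queue.poll();     // heap yields ascending, so fill from the back
        }
        return result;
    }

    public static void main(String[] args) {
        int[] ords = {9, 8, 1, 2, 3, 1, 2, 0, 7, 1, 10, 2};
        int[] visited = new int[1];
        int[] top = topNDesc(ords, 4, 2, visited);
        System.out.println(Arrays.toString(top) + " visited " + visited[0] + " of " + ords.length);
    }
}
```

How much this saves depends entirely on how the sort values are distributed
across blocks: if every block contains at least one competitive value, nothing
is skipped and the per-block lookup is pure overhead.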

-- 
Sincerely yours
Mikhail Khludnev
