lucene-solr-user mailing list archives

From Joel Bernstein <joels...@gmail.com>
Subject Re: Configurable collectors for custom ranking
Date Thu, 12 Dec 2013 16:13:53 GMT
The sorting is going to happen in the lower level collectors. You need a
value source that returns the score of the document being collected.

Here is how you can make this happen:

1) Create an object in your PostFilter that simply holds the current score.
Place this object in the SearchRequest context map. Update object.score as
you pass the docs and scores to the lower collectors.

2) Create a value source that checks the SearchRequest context for the
object that's holding the current score. Use this object to return the
current score when called. For example, if you give the value source a
handle called "score", a compound function call will look like this:
sum(score(), field(x))
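
To illustrate the two steps above, here is a minimal, self-contained Java
sketch of the pattern. It deliberately does not use the Solr API:
CurrentScore, DocValue, CONTEXT_KEY, scoreFunction, sum, field and collectAll
are hypothetical stand-ins for the holder object, the value source, the
context-map key and the PostFilter's collect loop. A real implementation
would presumably extend Solr's own collector and value source classes.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class ScoreHolderSketch {

    /** Step 1: mutable holder for the score of the doc currently being collected. */
    static final class CurrentScore {
        float score;
    }

    /** Hypothetical stand-in for a value source: maps a doc id to a float. */
    interface DocValue {
        float floatVal(int doc);
    }

    /** Hypothetical key under which the holder is stored in the context map. */
    static final String CONTEXT_KEY = "currentScore";

    /** Step 2: "score()" looks up the holder in the context and reads it on demand. */
    static DocValue scoreFunction(Map<Object, Object> context) {
        CurrentScore holder = (CurrentScore) context.get(CONTEXT_KEY);
        return doc -> holder.score;
    }

    /** Stand-in for field(x): per-document values backed by an array. */
    static DocValue field(float[] values) {
        return doc -> values[doc];
    }

    /** Stand-in for sum(a, b). */
    static DocValue sum(DocValue a, DocValue b) {
        return doc -> a.floatVal(doc) + b.floatVal(doc);
    }

    /**
     * Plays the role of the post filter: for each doc, store its score in the
     * holder, then let the "lower collector" evaluate sum(score(), field(x)).
     */
    static float[] collectAll(float[] scores, float[] fieldX) {
        Map<Object, Object> context = new HashMap<>();
        CurrentScore holder = new CurrentScore();
        context.put(CONTEXT_KEY, holder);

        DocValue sortValue = sum(scoreFunction(context), field(fieldX));

        float[] sortKeys = new float[scores.length];
        for (int doc = 0; doc < scores.length; doc++) {
            holder.score = scores[doc];              // update before delegating
            sortKeys[doc] = sortValue.floatVal(doc); // value the lower level sorts on
        }
        return sortKeys;
    }

    public static void main(String[] args) {
        float[] keys = collectAll(new float[] {0.5f, 1.0f}, new float[] {2.0f, 3.0f});
        System.out.println(Arrays.toString(keys));
    }
}
```

The point of the holder is ordering: the score must be written before the doc
is handed down, because the function only sees whatever was stored last.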

Joel


On Thu, Dec 12, 2013 at 9:58 AM, Peter Keegan <peterlkeegan@gmail.com> wrote:

> Regarding my original goal, which is to perform a math function using the
> scaled score and a field value, and sort on the result, how does this fit
> in? Must I implement another custom PostFilter with a higher cost than the
> scale PostFilter?
>
> Thanks,
> Peter
>
>
> On Wed, Dec 11, 2013 at 4:01 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
>
> > Thanks very much for the guidance. I'd be happy to donate a working
> > solution.
> >
> > Peter
> >
> >
> > On Wed, Dec 11, 2013 at 3:53 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
> >
> >> SOLR-5020 has the commit info; it's mainly changes to SolrIndexSearcher, I
> >> believe. They might apply to 4.3.
> >> I think as long as you have the finish method, that's all you'll need. If you
> >> can get this working, it would be excellent if you could donate back the
> >> Scale PostFilter.
> >>
> >>
> >> On Wed, Dec 11, 2013 at 3:36 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
> >>
> >> > This is what I was looking for, but the DelegatingCollector 'finish' method
> >> > doesn't exist in 4.3.0 :(   Can this be patched in, and are there any other
> >> > PostFilter dependencies on 4.5?
> >> >
> >> > Thanks,
> >> > Peter
> >> >
> >> >
> >> > On Wed, Dec 11, 2013 at 3:16 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
> >> >
> >> > > Here is one approach to use in a PostFilter:
> >> > >
> >> > > 1) In the collect() method, call score() for each doc. Use the scores to
> >> > > create your scaleInfo.
> >> > > 2) Keep a bitset of the hits and a priorityQueue of your top X ScoreDocs.
> >> > > 3) Don't delegate any documents to lower collectors in the collect()
> >> > > method.
> >> > > 4) In the finish method, create a score mapping (use the hppc
> >> > > IntFloatOpenHashMap) with your top X docIds pointing to their scores, using
> >> > > the priorityQueue created in step 2. Then iterate the bitset (also created
> >> > > in step 2), sending each doc down to the lower collectors, retrieving and
> >> > > scaling the score from the score map. If the document is not in the
> >> > > score map, then send down 0.
> >> > >
> >> > > You'll have to set up a dummy scorer to feed to the lower collectors. The
> >> > > CollapsingQParserPlugin has an example of how to do this.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On Wed, Dec 11, 2013 at 2:05 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
> >> > >
> >> > > > Hi Joel,
> >> > > >
> >> > > > I thought about using a PostFilter, but the problem is that the 'scale'
> >> > > > function must be done after all matching docs have been scored but before
> >> > > > adding them to the PriorityQueue that sorts just the rows to be returned.
> >> > > > Doing the 'scale' function wrapped in a 'query' is proving to be too slow
> >> > > > when it visits every document in the index.
> >> > > >
> >> > > > In the Collector, I can see how to get the field values like this:
> >> > > >
> >> > > > indexSearcher.getSchema().getField("field(myfield").getType().getValueSource(SchemaField,
> >> > > > QParser).getValues()
> >> > > >
> >> > > > But 'getValueSource' needs a QParser, which isn't available.
> >> > > > And I can't create a QParser without a SolrQueryRequest, which isn't
> >> > > > available.
> >> > > >
> >> > > > Thanks,
> >> > > > Peter
> >> > > >
> >> > > >
> >> > > > On Wed, Dec 11, 2013 at 1:48 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
> >> > > >
> >> > > > > Peter,
> >> > > > >
> >> > > > > It sounds like you could achieve what you want to do in a PostFilter
> >> > > > > rather than extending the TopDocsCollector. Is there a reason why a
> >> > > > > PostFilter won't work for you?
> >> > > > >
> >> > > > > Joel
> >> > > > >
> >> > > > >
> >> > > > > On Tue, Dec 10, 2013 at 3:24 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
> >> > > > >
> >> > > > > > Quick question:
> >> > > > > > In the context of a custom collector, how does one get the values of a
> >> > > > > > field of type 'ExternalFileField'?
> >> > > > > >
> >> > > > > > Thanks,
> >> > > > > > Peter
> >> > > > > >
> >> > > > > >
> >> > > > > > On Tue, Dec 10, 2013 at 1:18 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
> >> > > > > >
> >> > > > > > > Hi Joel,
> >> > > > > > >
> >> > > > > > > This is related to another thread on function query matching (
> >> > > > > > > http://lucene.472066.n3.nabble.com/Function-query-matching-td4099807.html#a4105513
> >> > > > > > > ).
> >> > > > > > > The patch in SOLR-4465 will allow me to extend TopDocsCollector and perform
> >> > > > > > > the 'scale' function on only the documents matching the main dismax query.
> >> > > > > > > As you mention, it is a slightly intrusive design and requires that I
> >> > > > > > > manage my own PriorityQueue (and a local duplicate of HitQueue), but should
> >> > > > > > > work. I think a better design would hide the PQ from the plugin.
> >> > > > > > >
> >> > > > > > > Thanks,
> >> > > > > > > Peter
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Sun, Dec 8, 2013 at 5:32 PM, Joel Bernstein <joelsolr@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > >> Hi Peter,
> >> > > > > > >>
> >> > > > > > >> I've been meaning to revisit configurable ranking collectors, but I
> >> > > > > > >> haven't yet had a chance. It's on the shortlist of things I'd like to
> >> > > > > > >> tackle, though.
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> On Fri, Dec 6, 2013 at 4:17 PM, Peter Keegan <peterlkeegan@gmail.com> wrote:
> >> > > > > > >>
> >> > > > > > >> > I looked at SOLR-4465 and SOLR-5045, where it appears that there is a goal
> >> > > > > > >> > to be able to do custom sorting and ranking in a PostFilter. So far, it
> >> > > > > > >> > looks like only custom aggregation can be implemented in PostFilter (5045).
> >> > > > > > >> > Custom sorting/ranking can be done in a pluggable collector (4465), but
> >> > > > > > >> > this patch is no longer in dev.
> >> > > > > > >> >
> >> > > > > > >> > Is there any other dev. being done on adding custom sorting (after
> >> > > > > > >> > collection) via a plugin?
> >> > > > > > >> >
> >> > > > > > >> > Thanks,
> >> > > > > > >> > Peter
> >> > > > > > >> >
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >>
> >> > > > > > >> --
> >> > > > > > >> Joel Bernstein
> >> > > > > > >> Search Engineer at Heliosearch
> >> > > > > > >>
> >> > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > --
> >> > > > > Joel Bernstein
> >> > > > > Search Engineer at Heliosearch
> >> > > > >
> >> > > >
> >> > >
> >> > >
> >> > >
> >> > > --
> >> > > Joel Bernstein
> >> > > Search Engineer at Heliosearch
> >> > >
> >> >
> >>
> >>
> >>
> >> --
> >> Joel Bernstein
> >> Search Engineer at Heliosearch
> >>
> >
> >
>



-- 
Joel Bernstein
Search Engineer at Heliosearch
