lucene-java-user mailing list archives

From Yonik Seeley <ysee...@gmail.com>
Subject Re: Improving sort performance
Date Sat, 22 Oct 2005 18:27:54 GMT
FunctionQuery matches all documents, so you normally want to use it as part
of a BooleanQuery with another mandatory clause. That will cause only
documents matching the other clause to be scored (the BooleanScorer takes
care of that logic).

The score FunctionQuery produces is from the function alone (no relevancy
stuff like idf, tf, lengthNorm, or anything else).

If you want to sort by that score alone, then boost the other parts of the
query to 0.

So, (MyQuery, sorted by MyFunkySort), becomes
((+MyQuery^0 MyFunctionQuery), sorted by score)
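
To make the rewrite above concrete, here is a minimal plain-Java sketch of the scoring arithmetic. It is a model only, not Lucene code: the class `FunctionSortSketch`, the `search` method, and the `relevance` stand-in are hypothetical names invented for illustration, and it ignores normalization factors such as coord and queryNorm. It assumes the mandatory clause has already produced the list of matching document ids, so only those documents are ever scored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntToDoubleFunction;

/** Plain-Java model of (+MyQuery^0 MyFunctionQuery) sorted by score. Not Lucene API. */
public class FunctionSortSketch {

    /** One scored hit: a document id and its combined score. */
    public static final class Hit {
        public final int doc;
        public final double score;

        Hit(int doc, double score) {
            this.doc = doc;
            this.score = score;
        }
    }

    /**
     * Scores only the documents matched by the mandatory clause.
     *
     * @param matchingDocs doc ids matched by the mandatory clause (assumed given)
     * @param relevance    stand-in for the tf/idf style score of the mandatory clause
     * @param boost        boost applied to the mandatory clause; 0 removes its contribution
     * @param function     the function whose value should drive the ordering
     */
    public static List<Hit> search(int[] matchingDocs,
                                   IntToDoubleFunction relevance,
                                   double boost,
                                   IntToDoubleFunction function) {
        List<Hit> hits = new ArrayList<>();
        for (int doc : matchingDocs) {
            // Sum of the clause scores: boosted relevance plus the function value.
            double score = boost * relevance.applyAsDouble(doc)
                    + function.applyAsDouble(doc);
            hits.add(new Hit(doc, score));
        }
        // "Sorted by score": highest combined score first.
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }

    public static void main(String[] args) {
        int[] matches = {2, 5, 7};
        // With boost 0 the relevance values have no effect on the order.
        List<Hit> hits = search(matches, doc -> doc * 100.0, 0.0, doc -> 10.0 - doc);
        for (Hit h : hits) {
            System.out.println("doc=" + h.doc + " score=" + h.score);
        }
    }
}
```

With the boost set to 0, the combined score is the function value alone, so sorting by score is the same as sorting by the function. The cost is proportional to the number of matching documents rather than to the size of the index.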

-Yonik

On 10/22/05, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
>
> This is really interesting, I haven't revved our code to this version yet.
> Does the score returned by FunctionQuery supersede underlying relevance
> scoring or is it rolled in at some base class?
>
> -- j
>
> On 10/22/05, Yonik Seeley <yseeley@gmail.com> wrote:
> >
> > I'm not sure what type of score you are trying to do, but maybe
> > FunctionQuery would help.
> > http://issues.apache.org/jira/browse/LUCENE-446
> >
> > -Yonik
> > Now hiring -- http://forms.cnet.com/slink?231706
> >
> > On 10/22/05, Jeff Rodenburg <jeff.rodenburg@gmail.com> wrote:
> > >
> > > > I have a custom sort that completes calculations on-the-fly, similar to the
> > > > LIA distance sort. SortField type is Float. It works, but I need better
> > > > performance. I'm wondering if there's a better way to do this.
> > >
> > > > As a rule, the number of results returned in a given search will most often
> > > > be a fraction of the total documents in the search indexes. For example,
> > > > 1000 results would be a rather large result set for what I'm expecting. The
> > > > aggregate index document count is in the range of 20 million.
> > >
> > > > The standard process of looping through the TermDocs from readers for the
> > > > aggregate index seems wasteful in this scenario, given the relative number
> > > > of results to the overall size of the index. What are my options here?
> > >
> > > Thanks
> > > jeff
> > >
> > >
> >
> >
>
>
