lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shai Erera <ser...@gmail.com>
Subject Re: TSDC, TopFieldCollector & co
Date Wed, 30 Sep 2009 15:12:38 GMT
I agree. If you need sort-by-score, it's better to use the "fast" search
methods. IndexSearcher will create the appropriate TSDC instance for you,
based on the Query that was passed.

If you need to create multiple Collectors and pass a kind of Multi-Collector
to IndexSearcher, then you should create TSDC according to Mark's example
above.

Shai

On Wed, Sep 30, 2009 at 4:57 PM, Mark Miller <markrmiller@gmail.com> wrote:

> If you want relevance sorting (Sort.Score not Sort.Relevance right?),
> I'd think you want to use TopScoreDocCollector, not TopFieldCollector.
> The only reason to use relevance with TopFieldCollector is if you you
> are doing a nth sort with a field sort as well.
>
> You don't really need to worry about things like turning off the max
> score tracking here - its just going to be the first doc on the queue.
>
> You also do want to specify whether or not to collect docs in order if
> you care about performance:
>
>  public static TopScoreDocCollector create(int numHits, boolean
> docsScoredInOrder)
>
> ie:
>
> TopScoreDocCollector.create(nDocs, !weight.scoresDocsOutOfOrder());
>
> Which means you just want option 1.
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
> eks dev wrote:
> > Hi All,
> >
> > What is the best way to achieve the following and what are the
> differences, if I say "I do not normalize scores, so I do not need max score
> tracking, I do not care if hits are returned in doc id order, or any other
> order. I need only to get maxDocs *best scoring* documents":
> >
> > OPTION 1:
> > TopDocs top = ixSearcher.search(q, filter, maxDocs);
> >
> > OPTION 2:
> >    final TopScoreDocCollector tfc = TopScoreDocCollector.create(maxDocs,
> false);
> >     ixSearcher.search(q, filter, tfc);
> >     TopDocs top = tfc.topDocs();
> >
> >
> > OPTION 3:
> >     final TopFieldCollector tfc =
> TopFieldCollector.create(Sort.RELEVANCE, maxDocs,
> >         false  /* fillFields */,
> >         true   /* trackDocScores */,
> >         false   /* trackMaxScore */,
> >         false  /* docsInOrder */);
> >
> >     ixSearcher.search(q.weight(ixSearcher),filter, tfc);
> >     TopDocs top = tfc.topDocs();
> >
> >
> > what are the pros and cons?
> > If I read javadoc correctly,
> > - OPTION 1 tracks max score and delivers doc Ids in order (suboptimal
> performance for my case)
> > - OPTION 2 I do not know abut max score tracking, but doc Ids are not
> required to be in order
> > - OPTION 3 looks like exactly what I want, but one performance comment in
> javadoc about Sort.RELEVANCE made me think if that is the fastest way?
> >
> > What would be recommended here, any other options to achieve the fastest
> search with above defined conditions (no max score tracking and doc id order
> irrelevant)?  OPTIN2 looks nice, but as said, I am not sure about max score
> tracking?
> >
> > Thanks,
> > eks
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message