lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: TSDC, TopFieldCollector & co
Date Wed, 30 Sep 2009 14:57:17 GMT
If you want relevance sorting (Sort.Score not Sort.Relevance right?),
I'd think you want to use TopScoreDocCollector, not TopFieldCollector.
The only reason to use relevance with TopFieldCollector is if you
are doing an nth-level sort that includes a field sort as well.

You don't really need to worry about things like turning off the max
score tracking here - it's just going to be the score of the first doc
on the queue.
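
To illustrate why max score comes for free when you sort by score, here
is a minimal sketch in plain Java. It is not Lucene's code: TopNSketch,
Hit, collect, topDocs and maxScore are hypothetical names made up for
this example, and java.util.PriorityQueue stands in for Lucene's own
hit queue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Simplified model of a score-sorted top-N collector (not Lucene's implementation). */
public class TopNSketch {

    /** A (doc id, score) pair; hypothetical stand-in for a search hit. */
    public static final class Hit {
        public final int doc;
        public final float score;

        public Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    private final int numHits;

    // Min-heap: the weakest hit kept so far sits at the head.
    // On equal scores, the hit with the larger doc id counts as weaker.
    private final PriorityQueue<Hit> queue = new PriorityQueue<>(
        (a, b) -> a.score != b.score
            ? Float.compare(a.score, b.score)
            : Integer.compare(b.doc, a.doc));

    public TopNSketch(int numHits) {
        this.numHits = numHits;
    }

    /** Offer one hit; keep only the numHits strongest seen so far. */
    public void collect(int doc, float score) {
        queue.add(new Hit(doc, score));
        if (queue.size() > numHits) {
            queue.poll(); // drop the weakest hit
        }
    }

    /** Hits ordered best first. */
    public List<Hit> topDocs() {
        PriorityQueue<Hit> copy = new PriorityQueue<>(queue); // keeps the comparator
        List<Hit> hits = new ArrayList<>();
        while (!copy.isEmpty()) {
            hits.add(copy.poll()); // weakest first
        }
        Collections.reverse(hits); // best first
        return hits;
    }

    /** No separate bookkeeping needed: the max score is the first hit's score. */
    public float maxScore() {
        List<Hit> hits = topDocs();
        return hits.isEmpty() ? Float.NaN : hits.get(0).score;
    }
}
```

Collecting (doc 0, 1.0), (doc 1, 3.0), (doc 2, 2.0) with numHits = 2
leaves docs 1 and 2 in that order, and maxScore() is simply the score
of doc 1.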

You also do want to specify whether or not to collect docs in order if
you care about performance:

  public static TopScoreDocCollector create(int numHits, boolean docsScoredInOrder)

ie:

TopScoreDocCollector.create(nDocs, !weight.scoresDocsOutOfOrder());

Which means you just want option 1.
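
As a rough sketch of why the in-order flag matters for performance: when
docs arrive in increasing doc id order, a hit that only ties the weakest
queued score can be rejected on the score alone, because its doc id is
necessarily larger. Out of order, a tie also needs a doc id comparison.
The code below is plain Java written from memory of how the collectors
behave, not a copy of Lucene's source; the class and method names are
hypothetical.

```java
/** Simplified model of the "is this hit competitive?" check (not Lucene's code). */
public class CompetitiveCheckSketch {

    /**
     * Docs arrive in increasing doc id order, so a tie on score always
     * loses to the hit already on the queue: one comparison is enough.
     */
    public static boolean competesInOrder(float score, float bottomScore) {
        return score > bottomScore;
    }

    /**
     * Docs may arrive in any order, so a tie on score must be broken
     * by doc id: the smaller doc id wins.
     */
    public static boolean competesOutOfOrder(int doc, float score,
                                             int bottomDoc, float bottomScore) {
        if (score != bottomScore) {
            return score > bottomScore;
        }
        return doc < bottomDoc;
    }
}
```

Here bottomDoc and bottomScore stand for the weakest hit currently on the
queue. Passing !weight.scoresDocsOutOfOrder() lets the collector use the
cheaper check whenever the scorer guarantees doc id order.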

-- 
- Mark

http://www.lucidimagination.com



eks dev wrote:
> Hi All, 
>
> What is the best way to achieve the following and what are the differences, if I say
> "I do not normalize scores, so I do not need max score tracking, I do not care if hits are
> returned in doc id order, or any other order. I need only to get maxDocs *best scoring* documents":
>
> OPTION 1:
> TopDocs top = ixSearcher.search(q, filter, maxDocs);
>
> OPTION 2:
>    final TopScoreDocCollector tfc = TopScoreDocCollector.create(maxDocs, false);
>     ixSearcher.search(q, filter, tfc);
>     TopDocs top = tfc.topDocs();
>
>
> OPTION 3:    
>     final TopFieldCollector tfc = TopFieldCollector.create(Sort.RELEVANCE, maxDocs, 
>         false  /* fillFields */,
>         true   /* trackDocScores */,
>         false   /* trackMaxScore */,
>         false  /* docsInOrder */);
>
>     ixSearcher.search(q.weight(ixSearcher),filter, tfc);
>     TopDocs top = tfc.topDocs();
>
>
> what are the pros and cons?
> If I read javadoc correctly, 
> - OPTION 1 tracks max score and delivers doc Ids in order (suboptimal performance for
> my case)
> - OPTION 2: I do not know about max score tracking, but doc Ids are not required to be
> in order
> - OPTION 3 looks like exactly what I want, but one performance comment in the javadoc about
> Sort.RELEVANCE made me wonder whether that is the fastest way.
>
> What would be recommended here? Are there any other options to achieve the fastest search under
> the conditions defined above (no max score tracking, doc id order irrelevant)? OPTION 2 looks
> nice but, as I said, I am not sure about max score tracking.
>
> Thanks,
> eks
>
>
>       
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>   