lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Making TopDocCollector a bit more consumable
Date Mon, 24 Aug 2009 15:00:36 GMT
Thanks Shai. That all makes sense to me.

bq. Perhaps we should add to the javadocs something like "you can call
query.weight().scoresDocsOutOfOrder to instantiate the optimal TFC/TSDC"?

I guess this is all I would argue for as well - basically a bit more
informative javadoc for scoreOutOfOrder:

TopScoreDocCollector:

   * Creates a new {@link TopScoreDocCollector} given the number of hits to
   * collect and whether documents are scored in order by the input
   * {@link Scorer} to {@link #setScorer(Scorer)}.
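
To make the trade-off concrete, here is a minimal sketch of what the flag changes and what a one-argument convenience factory could look like. The classes below (TopHitsSketch, Hit, Collector) are hypothetical stand-ins, not Lucene's actual collectors, and the sketch assumes the out-of-order variant is the one that stays correct for any arrival order, which would make it the safe default.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for illustration only; these are not Lucene's classes.
final class TopHitsSketch {

    /** One collected hit: a document id and its score. */
    static final class Hit {
        final int doc;
        final float score;

        Hit(int doc, float score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Orders the weakest hit first: lowest score, then highest doc id on a tie.
    static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int byScore = Float.compare(a.score, b.score);
        return byScore != 0 ? byScore : Integer.compare(b.doc, a.doc);
    };

    /** Keeps the numHits best hits; on equal scores the lower doc id wins. */
    static final class Collector {
        private final int numHits;
        private final boolean docsScoredInOrder;
        private final PriorityQueue<Hit> queue;

        Collector(int numHits, boolean docsScoredInOrder) {
            if (numHits < 1) {
                throw new IllegalArgumentException("numHits must be >= 1");
            }
            this.numHits = numHits;
            this.docsScoredInOrder = docsScoredInOrder;
            this.queue = new PriorityQueue<>(numHits, WEAKEST_FIRST);
        }

        void collect(int doc, float score) {
            if (queue.size() < numHits) {
                queue.add(new Hit(doc, score));
                return;
            }
            Hit weakest = queue.peek();
            // In-order input: a later doc with an equal score can never win.
            boolean competitive = score > weakest.score;
            if (!docsScoredInOrder) {
                // Docs may arrive in any order, so a tie can still win on doc id.
                competitive |= score == weakest.score && doc < weakest.doc;
            }
            if (competitive) {
                queue.poll();
                queue.add(new Hit(doc, score));
            }
        }

        /** Returns the collected hits, best first. */
        List<Hit> topDocs() {
            List<Hit> hits = new ArrayList<>(queue);
            hits.sort(WEAKEST_FIRST.reversed());
            return hits;
        }
    }

    /** Expert form: the caller states whether doc ids arrive in increasing order. */
    static Collector create(int numHits, boolean docsScoredInOrder) {
        return new Collector(numHits, docsScoredInOrder);
    }

    /** Convenience form: assume nothing about order; a little more work per hit. */
    static Collector create(int numHits) {
        return create(numHits, false);
    }

    public static void main(String[] args) {
        Collector c = create(1);
        c.collect(5, 1.0f); // out-of-order input with a tie on score
        c.collect(2, 1.0f);
        System.out.println("top doc: " + c.topDocs().get(0).doc);
    }
}
```

In this sketch, passing true for docsScoredInOrder while feeding docs out of order would silently keep the wrong doc on a tie, which is why the convenience form defaults to false. With the real API, the javadoc hint quoted above would presumably amount to passing the negation of the Weight's scoresDocsOutOfOrder result as docsScoredInOrder.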

Shai Erera wrote:
> I think we've had a similar discussion on this issue (as part of the
> JIRA issue), and the reason for not defaulting to anything was
> back-compat.
>
> For example, we know that not tracking doc scores is better when you
> simply sort by a field. But we can't have a default that says "don't
> track doc scores", since if people use it, their code might break. On
> the other hand, defaulting in 2.9 to track doc scores is not good
> either, because we want to stop tracking scores when you sort ...
>
> So the outcome was that the "easy" search methods on Searcher pick the
> best defaults for you (and we've documented that in 3.0 those methods
> will stop tracking scores etc.) and if you choose to instantiate your
> own TopFieldCollector, then you probably know what you're doing, and
> therefore defaults are not that important there.
>
> I guess back-compat wise we can say that in 2.9 there is a "create"
> method which picks certain defaults and will change in 3.0. But I
> think the bigger question is if someone instantiates TFC, does he do
> it because he wants to override Lucene's Searcher defaults? I guess
> the answer is not a definite YES (because I can think of cases where I
> instantiate TFC for other purposes than overriding Lucene's defaults),
> but is it perhaps MOST LIKELY?
>
> The one parameter which I think may confuse people is w/
> docsScoredInOrder - that is only relevant if I use my own Scorer,
> which I think is a very advanced thing. And if I need to instantiate
> TFC or TSDC, I may not know what to pass there ... But here there is
> no good default either, because it really depends on the query that is
> run. Perhaps we should add to the javadocs something like "you can
> call query.weight().scoresDocsOutOfOrder to instantiate the optimal
> TFC/TSDC"?
>
> Shai
>
> On Mon, Aug 24, 2009 at 5:40 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>     I was just going to add actually:
>
>     Yes you can just use the other Searcher methods. Perhaps that's just
>     fine. I don't think this is a large issue.
>
>     But you could also use void search(Weight weight, Filter filter,
>     Collector collector).
>
>     I've created my own TopDocs collectors for a handful of reasons in
>     the past.
>
>     So I don't think this is a huge deal, but if you used the TopDoc
>     collectors in the past, you just had to pass sort/numDocs. Now that
>     they are deprecated, if you happened to be using them, you go over to
>     the new classes (after finding the new static factories) and are
>     likely not sure what options to pick. Why not allow the same params
>     and pick defaults that always work? People that want to eke out
>     speed can tweak the longer param list.
>
>     I agree - it's not a huge deal - I guess it is more advanced use -
>     but it was much easier to follow and use with the deprecated
>     versions. It's gotten quite a bit more confusing.
>
>     I'd still want to be able to play around with Collectors without being
>     an expert.
>
>     Just an idea though - I don't think it's 100% necessary. When I see
>     advanced options that are more for optimization though,
>     I like to have defaults so that I don't have to understand everything
>     perfectly before I use it.
>
>     - Mark
>
>     Yonik Seeley wrote:
>     > But creating the collector is expert use, right?
>     > The normal use would be from Searcher:
>     > TopDocs search(Query query, int n)
>     > TopDocs search(Query query, Filter filter, int n)
>     >
>     >
>     > -Yonik
>     > http://www.lucidimagination.com
>     >
>     >
>     >
>     > On Mon, Aug 24, 2009 at 10:15 AM, Mark Miller <markrmiller@gmail.com> wrote:
>     >
>     >> Hey all,
>     >>
>     >> Hits, which used to be the non-expert search API, has been deprecated,
>     >> so TopDocs is now essentially the non-expert search API. But when you
>     >> go to use it you are greeted with:
>     >>
>     >>  public static TopFieldCollector create(Sort sort, int numHits,
>     >>      boolean fillFields, boolean trackDocScores, boolean trackMaxScore,
>     >>      boolean docsScoredInOrder)
>     >>
>     >> and
>     >>
>     >>  public static TopScoreDocCollector create(int numHits, boolean
>     >> docsScoredInOrder) {
>     >>
>     >>    if (docsScoredInOrder) {
>     >>      return new InOrderTopScoreDocCollector(numHits);
>     >>    } else {
>     >>      return new OutOfOrderTopScoreDocCollector(numHits);
>     >>    }
>     >>
>     >>  }
>     >>
>     >> Whoa! Think of the poor noobies ;)
>     >>
>     >> I don't know if I want my docs scored in order. Seriously, I don't. It
>     >> sounds nice though. And fill fields? Please do, I guess :)
>     >>
>     >> What do you think about having versions that default to something
>     >> reasonable, so you just have to give numHits, or sort and numHits?
>     >>
>     >> This API now has a dual role IMO - expert and non expert.
>     >>
>     >> --
>     >> - Mark
>     >>
>     >> http://www.lucidimagination.com
>     >>
>     >>
>     >>
>     >>
>     >>
>     ---------------------------------------------------------------------
>     >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>     <mailto:java-dev-unsubscribe@lucene.apache.org>
>     >> For additional commands, e-mail:
>     java-dev-help@lucene.apache.org
>     <mailto:java-dev-help@lucene.apache.org>
>     >>
>     >>
>     >>
>     >
>     >
>     >
>     >
>
>
>     --
>     - Mark
>
>     http://www.lucidimagination.com
>
>
>
>
>
>


-- 
- Mark

http://www.lucidimagination.com





