lucene-dev mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Making TopDocCollector a bit more consumable
Date Mon, 24 Aug 2009 14:52:58 GMT
I think we've had a similar discussion on this issue (as part of the JIRA
issue), and the reason for not defaulting to anything was back-compat.

For example, we know that not tracking doc scores is better when you simply
sort by a field. But we can't have a default that says "don't track doc
scores", since people who rely on it might see their code break. On the other
hand, defaulting in 2.9 to track doc scores is not good either, because we
want to stop tracking scores when you sort ...

So the outcome was that the "easy" search methods on Searcher pick the best
defaults for you (and we've documented that in 3.0 those methods will stop
tracking scores etc.) and if you choose to instantiate your own
TopFieldCollector, then you probably know what you're doing, and therefore
defaults are not that important there.

I guess back-compat wise we can say that in 2.9 there is a "create" method
which picks certain defaults and will change in 3.0. But I think the bigger
question is whether someone who instantiates TFC (TopFieldCollector) does so
because they want to override Lucene's Searcher defaults. I guess the answer
is not a definite YES (because I can think of cases where I instantiate TFC
for other purposes than overriding Lucene's defaults), but is it perhaps MOST
LIKELY?

The one parameter which I think may confuse people is docsScoredInOrder.
It is only relevant if I use my own Scorer, which I think is a very
advanced thing. And if I need to instantiate TFC or TSDC, I may not know
what to pass there ... But here there is no good default either, because it
really depends on the query that is run. Perhaps we should add to the
javadocs something like "you can call
query.weight(searcher).scoresDocsOutOfOrder() to instantiate the optimal
TFC/TSDC"?
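
To make that javadoc suggestion concrete, here is a minimal, self-contained
sketch of the selection logic. It deliberately uses stand-in types rather than
Lucene itself: CollectorDefaultsSketch, its nested Weight interface,
TopDocsCollector and the createFor helper are all hypothetical names made up
for illustration. Only the method name scoresDocsOutOfOrder and the
create(numHits, docsScoredInOrder) shape come from this thread.

```java
// Stand-in types only: this models the idea, it is NOT the Lucene API.
final class CollectorDefaultsSketch {

    /** Stand-in for a query weight; only the one method discussed here. */
    interface Weight {
        boolean scoresDocsOutOfOrder();
    }

    /** Stand-in collector that just records which variant was requested. */
    static final class TopDocsCollector {
        final int numHits;
        final boolean assumesInOrder;

        TopDocsCollector(int numHits, boolean assumesInOrder) {
            this.numHits = numHits;
            this.assumesInOrder = assumesInOrder;
        }
    }

    /** Mirrors the shape of create(int numHits, boolean docsScoredInOrder). */
    static TopDocsCollector create(int numHits, boolean docsScoredInOrder) {
        return new TopDocsCollector(numHits, docsScoredInOrder);
    }

    /** Hypothetical helper: derive the flag from the weight instead of guessing. */
    static TopDocsCollector createFor(Weight weight, int numHits) {
        // In-order collection is only valid when the weight does not
        // score documents out of order, hence the negation.
        return create(numHits, !weight.scoresDocsOutOfOrder());
    }

    public static void main(String[] args) {
        Weight inOrder = () -> false;    // scores docs in doc-id order
        Weight outOfOrder = () -> true;  // may score docs out of order

        System.out.println(createFor(inOrder, 10).assumesInOrder);    // prints "true"
        System.out.println(createFor(outOfOrder, 10).assumesInOrder); // prints "false"
    }
}
```

With the real classes the same idea would presumably read as
TopScoreDocCollector.create(numHits, !weight.scoresDocsOutOfOrder()), but
that should be checked against the 2.9 javadocs before relying on it.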

Shai

On Mon, Aug 24, 2009 at 5:40 PM, Mark Miller <markrmiller@gmail.com> wrote:

> I was just going to add actually:
>
> Yes you can just use the other Searcher methods. Perhaps that's just
> fine. I don't think this is a large issue.
>
> But you could also use void search(Weight weight, Filter filter,
> Collector collector).
>
> I've created my own TopDocs collectors for a handful of reasons in the
> past.
>
> So I don't think this is a huge deal, but if you used the TopDoc
> collectors in the past, you just had to pass sort/numDocs. Now that they
> are deprecated, if you happened to be using them, you go over to the new
> classes (after finding the new static factories) and are likely not sure
> what options to pick. Why not allow the same params and pick defaults
> that always work? People that want to eke out speed can tweak the
> longer param list.
>
> I agree - it's not a huge deal - I guess it is more advanced use - but it
> was much easier to follow and use with the deprecated versions. It's
> gotten quite a bit more confusing.
>
> I'd still want to be able to play around with Collectors without being
> an expert.
>
> Just an idea though - I don't think it's 100% necessary. When I see
> advanced options that are more for optimization though, I like to have
> defaults so that I don't have to understand everything perfectly before
> I use it.
>
> - Mark
>
> Yonik Seeley wrote:
> > But creating the collector is expert use, right?
> > The normal use would be from Searcher:
> > TopDocs search(Query query, int n)
> > TopDocs search(Query query, Filter filter, int n)
> >
> >
> > -Yonik
> > http://www.lucidimagination.com
> >
> >
> >
> > On Mon, Aug 24, 2009 at 10:15 AM, Mark Miller<markrmiller@gmail.com>
> wrote:
> >
> >> Hey all,
> >>
> >> Hits, which used to be the non-expert search API, has been deprecated,
> >> so TopDocs is now essentially the non-expert search API. But when you
> >> go to use it you are greeted with:
> >>
> >>  public static TopFieldCollector create(Sort sort, int numHits,
> >>      boolean fillFields, boolean trackDocScores, boolean trackMaxScore,
> >>      boolean docsScoredInOrder)
> >>
> >> and
> >>
> >>  public static TopScoreDocCollector create(int numHits,
> >>      boolean docsScoredInOrder) {
> >>
> >>    if (docsScoredInOrder) {
> >>      return new InOrderTopScoreDocCollector(numHits);
> >>    } else {
> >>      return new OutOfOrderTopScoreDocCollector(numHits);
> >>    }
> >>
> >>  }
> >>
> >> Woah ! Think of the poor noobies ;)
> >>
> >> I don't know if I want my docs scored in order. Seriously, I don't. It
> >> sounds nice though. And fill fields? Please do I guess :)
> >>
> >> What do you think about having versions that default to something
> >> reasonable? And you just have to give numHits, or sort and numHits?
> >>
> >> This API now has a dual role IMO - expert and non expert.
> >>
> >> --
> >> - Mark
> >>
> >> http://www.lucidimagination.com
> >>
> >>
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-dev-help@lucene.apache.org
> >>
> >>
> >>
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
