I guess we can add something like this:
*
* @param docsScoredInOrder
*          specifies if documents will be scored in doc ID order by the
*          query. If you're not sure in advance, you can do the following:
* <pre>
* boolean docsScoredInOrder = !q.weight(searcher).scoresDocsOutOfOrder();
* TopScoreDocCollector tsdc = TopScoreDocCollector.create(numHits, docsScoredInOrder);
* </pre>
*
* @see Weight#scoresDocsOutOfOrder()
I'm not even sure if the code example is needed ...
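For what it's worth, here is a minimal, self-contained sketch of why the flag matters. It is a simplified model and not Lucene's code: TopNSketch, Hit, and the one-argument create(int) overload are hypothetical names made up for illustration. The assumption, as I understand the two collectors, is that the in-order variant may reject score ties outright (a later doc always has a larger doc ID), while the out-of-order variant must also compare doc IDs on ties.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Simplified stand-in for a top-N score collector. NOT Lucene's classes.
final class TopNSketch {

    /** A (doc ID, score) pair, loosely analogous to a ScoreDoc. */
    static final class Hit {
        final int doc;
        final float score;
        Hit(int doc, float score) { this.doc = doc; this.score = score; }
    }

    // Weakest hit first: lowest score; on equal scores the larger doc ID is weaker.
    private static final Comparator<Hit> WEAKEST_FIRST = (a, b) ->
        a.score != b.score ? Float.compare(a.score, b.score)
                           : Integer.compare(b.doc, a.doc);

    private final PriorityQueue<Hit> queue = new PriorityQueue<>(WEAKEST_FIRST);
    private final int numHits;
    private final boolean docsScoredInOrder;

    private TopNSketch(int numHits, boolean docsScoredInOrder) {
        this.numHits = numHits;
        this.docsScoredInOrder = docsScoredInOrder;
    }

    static TopNSketch create(int numHits, boolean docsScoredInOrder) {
        if (numHits < 1) throw new IllegalArgumentException("numHits must be >= 1");
        return new TopNSketch(numHits, docsScoredInOrder);
    }

    /** Hypothetical convenience overload: defaults to the always-correct handling. */
    static TopNSketch create(int numHits) {
        return create(numHits, false);
    }

    void collect(int doc, float score) {
        if (queue.size() < numHits) {
            queue.add(new Hit(doc, score));
            return;
        }
        Hit weakest = queue.peek();
        if (docsScoredInOrder) {
            // Cheaper check: assumes doc IDs only increase, so a tie always loses.
            if (score <= weakest.score) return;
        } else {
            // Safe check: a tie loses only if the new doc ID is larger.
            if (score < weakest.score
                    || (score == weakest.score && doc > weakest.doc)) return;
        }
        queue.poll();
        queue.add(new Hit(doc, score));
    }

    /** Doc IDs of the collected hits, strongest first. */
    List<Integer> topDocs() {
        List<Hit> hits = new ArrayList<>(queue);
        hits.sort(WEAKEST_FIRST.reversed());
        List<Integer> docs = new ArrayList<>();
        for (Hit h : hits) docs.add(h.doc);
        return docs;
    }

    public static void main(String[] args) {
        TopNSketch safe = TopNSketch.create(1);       // default: out-of-order safe
        TopNSketch fast = TopNSketch.create(1, true); // claims in-order scoring
        int[] outOfOrderDocs = {5, 2};                // same score, doc IDs decreasing
        for (int doc : outOfOrderDocs) {
            safe.collect(doc, 1.0f);
            fast.collect(doc, 1.0f);
        }
        System.out.println(safe.topDocs()); // [2]
        System.out.println(fast.topDocs()); // [5]
    }
}
```

In this sketch, passing true when the docs actually arrive out of order keeps doc 5 instead of doc 2 on a score tie, which is why a default of false is the one that "always works" and true is only an optimization.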
Do you want to add it to TSDC and TFC, or shall I open an issue for that?
Shai
On Mon, Aug 24, 2009 at 6:00 PM, Mark Miller <markrmiller@gmail.com> wrote:
> Thanks Shai. That all makes sense to me.
>
> bq. Perhaps we should add to the javadocs something like "you can call
> query.weight().scoresDocsOutOfOrder to instantiate the optimal TFC/TSDC"?
>
> I guess this is all I would argue for as well - basically a bit more
> informative javadoc for scoreOutOfOrder:
>
> TopScoreDocCollector:
>
> * Creates a new {@link TopScoreDocCollector} given the number of hits to
> * collect and whether documents are scored in order by the input
> * {@link Scorer} to {@link #setScorer(Scorer)}.
>
> Shai Erera wrote:
> > I think we've had a similar discussion on this issue (as part of the
> > JIRA issue), and the reason for not defaulting to anything was
> > back-compat.
> >
> > For example, we know that not tracking doc scores is better when you
> > simply sort by a field. But we can't have a default that says "don't
> > track doc scores", since code that relies on those scores might
> > break. On the other hand, defaulting in 2.9 to tracking doc scores is
> > not good either, because we want to stop tracking scores when you
> > sort ...
> >
> > So the outcome was that the "easy" search methods on Searcher pick the
> > best defaults for you (and we've documented that in 3.0 those methods
> > will stop tracking scores etc.) and if you choose to instantiate your
> > own TopFieldCollector, then you probably know what you're doing, and
> > therefore defaults are not that important there.
> >
> > I guess back-compat wise we can say that in 2.9 there is a "create"
> > method which picks certain defaults and will change in 3.0. But I
> > think the bigger question is if someone instantiates TFC, does he do
> > it because he wants to override Lucene's Searcher defaults? I guess
> > the answer is not a definite YES (because I can think of cases where I
> > instantiate TFC for other purposes than overriding Lucene's defaults),
> > but is it perhaps MOST LIKELY?
> >
> > The one parameter which I think may confuse people is w/
> > docsScoredInOrder - that is only relevant if I use my own Scorer,
> > which I think is a very advanced thing. And if I need to instantiate
> > TFC or TSDC, I may not know what to pass there ... But here there is
> > no good default either, because it really depends on the query that is
> > run. Perhaps we should add to the javadocs something like "you can
> > call query.weight().scoresDocsOutOfOrder to instantiate the optimal
> > TFC/TSDC"?
> >
> > Shai
> >
> > On Mon, Aug 24, 2009 at 5:40 PM, Mark Miller <markrmiller@gmail.com> wrote:
> >
> > I was just going to add actually:
> >
> > Yes you can just use the other Searcher methods. Perhaps that's just
> > fine. I don't think this is a large issue.
> >
> > But you could also use void search(Weight weight, Filter filter,
> > Collector collector).
> >
> > I've created my own TopDocs collectors for a handful of reasons in
> > the past.
> >
> > So I don't think this is a huge deal, but if you used the TopDoc
> > collectors in the past, you just had to pass sort/numDocs. Now that
> > they are deprecated, if you happened to be using them, you go over to
> > the new classes (after finding the new static factories) and are
> > likely not sure what options to pick. Why not allow the same params
> > and pick defaults that always work? People who want to eke out speed
> > can tweak the longer param list.
> >
> > I agree - it's not a huge deal - I guess it is more advanced use - but
> > it was much easier to follow and use with the deprecated versions.
> > It's gotten quite a bit more confusing.
> >
> > I'd still want to be able to play around with Collectors without being
> > an expert.
> >
> > Just an idea though - I don't think it's 100% necessary. When I see
> > advanced options that are more for optimization though,
> > I like to have defaults so that I don't have to understand everything
> > perfectly before I use it.
> >
> > - Mark
> >
> > Yonik Seeley wrote:
> > > But creating the collector is expert use, right?
> > > The normal use would be from Searcher:
> > > TopDocs search(Query query, int n)
> > > TopDocs search(Query query, Filter filter, int n)
> > >
> > >
> > > -Yonik
> > > http://www.lucidimagination.com
> > >
> > >
> > >
> > > On Mon, Aug 24, 2009 at 10:15 AM, Mark Miller <markrmiller@gmail.com> wrote:
> > >
> > >> Hey all,
> > >>
> > >> Hits, which used to be the non-expert search API, has been deprecated,
> > >> so TopDocs is now essentially the non-expert search API. But when you
> > >> go to use it you are greeted with:
> > >>
> > >> public static TopFieldCollector create(Sort sort, int numHits,
> > >>     boolean fillFields, boolean trackDocScores, boolean trackMaxScore,
> > >>     boolean docsScoredInOrder)
> > >>
> > >> and
> > >>
> > >> public static TopScoreDocCollector create(int numHits,
> > >>     boolean docsScoredInOrder) {
> > >>   if (docsScoredInOrder) {
> > >>     return new InOrderTopScoreDocCollector(numHits);
> > >>   } else {
> > >>     return new OutOfOrderTopScoreDocCollector(numHits);
> > >>   }
> > >> }
> > >>
> > >> Woah ! Think of the poor noobies ;)
> > >>
> > >> I don't know if I want my docs scored in order. Seriously, I don't.
> > >> It sounds nice though. And fill fields? Please do I guess :)
> > >>
> > >> What do you think about having versions that default to something
> > >> reasonable? Then you would only have to give sort and numHits, or
> > >> just numHits.
> > >>
> > >> This API now has a dual role IMO - expert and non expert.
> > >>
> > >> --
> > >> - Mark
> > >>
> > >> http://www.lucidimagination.com
> > >>
> > >>
> > >>
> > >>
> > >>
> > ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > <mailto:java-dev-unsubscribe@lucene.apache.org>
> > >> For additional commands, e-mail:
> > java-dev-help@lucene.apache.org
> > <mailto:java-dev-help@lucene.apache.org>
> > >>
> > >>
> > >>
> > >
> > >
> > >
> > >
> >
> >
> > --
> > - Mark
> >
> > http://www.lucidimagination.com
> >
> >
> >
> >
> >
> >
>
>
> --
> - Mark
>
> http://www.lucidimagination.com
>
>
>
>
>
>