lucene-dev mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Searcher javadoc problem
Date Sun, 04 Oct 2009 01:23:51 GMT
Gotcha - that clears things up for me. I know you're an advanced user, so
it threw me for a loop that you would be using Hits like a Collector. I've
just been seeing that a lot lately.

I just read too much into: So what is the appropriate documentation for
getting all "hits"?

Another option (of course) is to maintain your own Hits class. It sounds
like working up something with a Collector on your own would be better,
though - why compute the score if you don't need it? Hits caching was
rarely that useful either.
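
To make the non-scoring idea concrete, here is a rough, self-contained sketch. It deliberately does not use Lucene itself: DocCollector is a hypothetical stand-in that only mirrors the two callbacks that matter here. As far as I recall, the real 2.9 Collector is an abstract class whose setNextReader also receives the IndexReader, and which additionally has setScorer(Scorer) and acceptsDocsOutOfOrder(). All class and method names below (AllHitsSketch, BitSetCollector, totalHits, slice) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

final class AllHitsSketch {

    /** Hypothetical stand-in for the relevant Collector callbacks; NOT the Lucene class. */
    public interface DocCollector {
        /** Called when the search moves to a new segment; docBase is that segment's offset. */
        void setNextReader(int docBase);

        /** Called once per matching document; doc is relative to the current segment. */
        void collect(int doc);
    }

    /** Records every match in a bit set and never asks for a score. */
    public static final class BitSetCollector implements DocCollector {
        private final BitSet bits;
        private int docBase;

        public BitSetCollector(int maxDoc) {
            this.bits = new BitSet(maxDoc);
        }

        @Override
        public void setNextReader(int docBase) {
            this.docBase = docBase;
        }

        @Override
        public void collect(int doc) {
            // Rebase the segment-relative id to an index-wide id.
            bits.set(docBase + doc);
        }

        /** Total number of matching documents. */
        public int totalHits() {
            return bits.cardinality();
        }

        /** Up to size doc ids in index order, skipping the first offset hits. */
        public List<Integer> slice(int offset, int size) {
            List<Integer> page = new ArrayList<>();
            int seen = 0;
            for (int doc = bits.nextSetBit(0);
                 doc >= 0 && page.size() < size;
                 doc = bits.nextSetBit(doc + 1)) {
                if (seen++ >= offset) {
                    page.add(doc);
                }
            }
            return page;
        }
    }

    public static void main(String[] args) {
        // Simulate a search over two segments instead of running a real query.
        BitSetCollector collector = new BitSetCollector(20);
        collector.setNextReader(0);
        collector.collect(3);
        collector.collect(7);
        collector.setNextReader(10);
        collector.collect(1);

        System.out.println("total hits: " + collector.totalHits());
        System.out.println("first page: " + collector.slice(0, 2));
    }
}
```

Because a bit set is ordered by doc id, the hits come back in document order for free, and the total count is just the cardinality. Each slice call rescans from the start, which is cheap for an index of a few tens of thousands of docs; a larger index might want to remember the last position instead.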

DM Smith wrote:
> It makes sense if you understand the context. We make each verse of a
> Bible a document. There are about 36000 docs in a Bible. We want a
> user to find all the verses that match their search to give the count
> of total hits. We then show slices of the hits from first hit to last
> in document order, typically about 100 at a time. Scoring is unimportant.
>
> The user can also choose to prioritize and limit the results. This
> uses scoring and the top docs. This is not the user's preferred search.
>
> So I don't mind being nasty. But having looked at it, I think it would
> be better to have a non-scoring collector that runs as a co-process and,
> through an iterator interface, gets the next doc on demand, from first
> doc in index to last.
>
> -- DM 
>
>
> On Oct 3, 2009, at 6:12 PM, Mark Miller <markrmiller@gmail.com> wrote:
>
>> You used Hits to get all the hits? Nasty, man - that's why we deprecated
>> that class - even though the JavaDoc warns you that's a major speed trap,
>> everyone still did it ... use a Collector.
>>
>> You're right though - it shouldn't point to IndexSearcher.search(Query)
>> after that - it should point to IndexSearcher.search(Query, int)
>>
>> Gotta fix that.
>>
>> DM Smith wrote:
>>> I'm working on migrating my code to 2.9. And I'm trying to figure out
>>> what to do. Along the way I found a circular argument in the JavaDoc
>>> for Searcher. BTW, this is not a user question.
>>>
>>> My current code calls:
>>>                Hits hits = searcher.search(query);
>>>
>>> The JavaDoc for it says:
>>>  /** Returns the documents matching <code>query</code>.
>>>   * @throws BooleanQuery.TooManyClauses
>>>   * @deprecated Hits will be removed in Lucene 3.0. Use
>>>   * {@link #search(Query, Filter, int)} instead.
>>>   */
>>>  public final Hits search(Query query) throws IOException {
>>>    return search(query, (Filter)null);
>>>  }
>>>
>>> However, search(Query, Filter, int) is not quite appropriate as I need
>>> all hits. I guess I could pass null for filter and Integer.MAX_VALUE.
>>>
>>> So, I found search(Query, Collector), which seems most appropriate.
>>> (Not sure though, but I'll figure it out.) However, the JavaDoc for it
>>> says:
>>>  /** Lower-level search API.
>>>  *
>>>  * <p>{@link Collector#collect(int)} is called for every matching
>>>  * document.
>>>  *
>>>  * <p>Applications should only use this if they need <i>all</i> of the
>>>  * matching documents.  The high-level search API ({@link
>>>  * Searcher#search(Query)}) is usually more efficient, as it skips
>>>  * non-high-scoring hits.
>>>  * <p>Note: The <code>score</code> passed to this method is a raw
>>>  * score.
>>>  * In other words, the score will not necessarily be a float whose
>>>  * value is between 0 and 1.
>>>  * @throws BooleanQuery.TooManyClauses
>>>  */
>>> public void search(Query query, Collector results)
>>>   throws IOException {
>>>   search(createWeight(query), null, results);
>>> }
>>>
>>> But Searcher.search(Query) is deprecated.
>>>
>>> So what is the appropriate documentation for getting all "hits"? Seems
>>> to say, "Don't do that"
>>>
>>> -- DM
>>>
>>>
>>
>>
>> -- 
>> - Mark
>>
>> http://www.lucidimagination.com
>>
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
>>


-- 
- Mark

http://www.lucidimagination.com





