lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Grant Ingersoll <gsing...@apache.org>
Subject Re: A solution to HitCollector-based searches problems
Date Mon, 26 Feb 2007 00:16:27 GMT
However, see Wiki HowToContribute: http://wiki.apache.org/jakarta- 
lucene/HowToContribute if you wish to donate your code.

-Grant
On Feb 25, 2007, at 6:56 PM, oramas martín wrote:

> Hello,
>
> As you probably know, the HitCollector-based search API is not  
> meant to work
> remotely, because it will generate a RPC-callback for every non- 
> zero score.
>
> There is another problem with MultiSearcher-HitCollector-based  
> search which
> knows nothing about mix HitCollector based searches (not to say it has
> hardcode the way to mix TopDocs for the score and for the Sort  
> searches).
> Also the ParallelMultiSearcher inherits this problems and is unable to
> parallelize the HitCollector-based searcher.
>
> A final problem with the HitCollector-based search is related to  
> the lost of
> a limit in the results, as the Hits class implements thought the
> getMoreDocs() function, and lazy loading and caching of documents  
> it does.
>
>
> To solve those problems it is necessary a factory  
> (HitCollectorSource) able
> to generate collectors for single (SingleHitCollector) an multi
> (MultiHitCollector) searches, and a new search method in the
> Searchable interface for it. To avoid modifications to the lucene  
> core, the
> later requirement is NOT IMPLEMENTED in the library I have just  
> created.
> Instead, an ugly solution, a wrapper for those searchers
> (SearcherHCSourceWrapper) and a Filter wrapper  
> (FilterHitCollectorSource) to
> carry the factory-based searches, is provided.
>
> Each collector is based in a two steps sequence, one for collecting  
> hits or
> subsearcher results, and another for generating the final result.
>
> Also, just in case you don't want to add a wrapper to each searcher  
> of your
> project, there is an adapted version of IndexSearcher,  
> MultiSearcher and
> ParallelMultiSearcher (only for version 2.1) modified exactly the  
> same way
> the wrapper class SearcherHCSourceWrapper does. Just put them in your
> class-path (before the Lucene core jar) and you will be using the new
> collector interfaces without modifying your code.
>
> There are some unit testing (copied and adapted from the Lucene
> 2.1distribution).
>
> See http://sourceforge.net/projects/lucollector/ for the jar files  
> and the
> code.
>
> If you find it interesting to complement the Lucene project, tell  
> me how to
> put it in the contribution area.
>
> Regards,
> José L. Oramas

--------------------------
Grant Ingersoll
Center for Natural Language Processing
http://www.cnlp.org

Read the Lucene Java FAQ at http://wiki.apache.org/jakarta-lucene/ 
LuceneFAQ



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message