lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mohammad Norouzi" <mnr...@gmail.com>
Subject Re: A solution to HitCollector-based searches problems
Date Sun, 11 Mar 2007 05:50:49 GMT
Hi Oramas
if I use that jar file, it conflicts with lucene-core.jar file. for exampl,
IndexSearcher class that you defined is different from the original one. Do
I have to remove the lucene-core jar file?
if yes, how about the other original classes

On 3/8/07, oramas martín <jloramas@gmail.com> wrote:
>
> Hello,
>
> I have just added some search implementation samples based on this
> collector
> solution, to easy the use and understanding or it:
>
>     - KeywordSearch: Extract the terms (and frequency) found in a list of
> fields
>                      from the results of a query/filter search
>
>     - GoogleSearch: Return an ordered search result grouped a la Google,
> based
>                     on the terms found in a list of fields
>
>     - GetFieldNamesOp: Operation to mimic the getFieldNames method of
>                        IndexReader but using a searcher. With it, it is
> possible
>                        to explore the fields of remote indexes.
>
> See http://sourceforge.net/projects/lucollector/ for the source code (
> lu-collector-src-sampleop-0.8.zip).
>
> Regards,
> José L. Oramas
>
> On 2/26/07, oramas martín <jloramas@gmail.com> wrote:
> >
> >
> > Hello,
> >
> > As you probably know, the HitCollector-based search API is not meant to
> > work remotely, because it will generate a RPC-callback for every
> non-zero
> > score.
> >
> > There is another problem with MultiSearcher-HitCollector-based search
> > which knows nothing about mix HitCollector based searches (not to say it
> has
> > hardcode the way to mix TopDocs for the score and for the Sort
> searches).
> > Also the ParallelMultiSearcher inherits this problems and is unable to
> > parallelize the HitCollector-based searcher.
> >
> > A final problem with the HitCollector-based search is related to the
> lost
> > of a limit in the results, as the Hits class implements thought the
> > getMoreDocs() function, and lazy loading and caching of documents it
> does.
> >
> >
> > To solve those problems it is necessary a factory (HitCollectorSource)
> > able to generate collectors for single (SingleHitCollector) an multi
> > (MultiHitCollector) searches, and a new search method in the
> > Searchable interface for it. To avoid modifications to the lucene core,
> the
> > later requirement is NOT IMPLEMENTED in the library I have just created.
> > Instead, an ugly solution, a wrapper for those searchers
> > (SearcherHCSourceWrapper) and a Filter wrapper
> (FilterHitCollectorSource) to
> > carry the factory-based searches, is provided.
> >
> > Each collector is based in a two steps sequence, one for collecting hits
> > or subsearcher results, and another for generating the final result.
> >
> > Also, just in case you don't want to add a wrapper to each searcher of
> > your project, there is an adapted version of IndexSearcher,
> MultiSearcher
> > and ParallelMultiSearcher (only for version 2.1) modified exactly the
> same
> > way the wrapper class SearcherHCSourceWrapper does. Just put them in
> your
> > class-path (before the Lucene core jar) and you will be using the new
> > collector interfaces without modifying your code.
> >
> > There are some unit testing (copied and adapted from the Lucene
> 2.1distribution).
> >
> > See http://sourceforge.net/projects/lucollector/ for the jar files and
> the
> > code.
> >
> > If you find it interesting to complement the Lucene project, tell me how
> > to put it in the contribution area.
> >
> > Regards,
> > José L. Oramas
> >
>



-- 
Regards,
Mohammad
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message