lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "hu andy" <andyh...@gmail.com>
Subject Re: About the use of HitCollector
Date Wed, 09 Aug 2006 03:05:54 GMT
Hey, Ryan, Thanks for your reply.
The scenario is I use a custom Filter which get some information from a
database table which consists of hundreds of thousands of rows. I use the
IndexSearcher.search(query, filter, hitcollector). I found it was consumed
more time with filter than that without no filter.
Can you give me some advice?


2006/8/8, Ryan O'Hara <ohara@genome.chop.edu>:

> Hey Andy,
>
> If you have enough RAM, try using FieldCache:
>
> String[] fieldYouWant = FieldCache.DEFAULT.getStrings
> (searcher.getIndexReader(), "fieldYouWant");
> searcher.search(query, new HitCollector(){
>        public void collect(int doc, float score){
>                doWhatYouWant(fieldYouWant[doc]);
>        }
> }
>
> If you need all results, this is probably the fastest method.
> However, this is assuming you have some way of storing the
> fieldYouWant array.  For all indexes, especially those containing
> many documents, it is a good idea to store the fieldYouWant array,
> since creating this array creates serious overhead.
>
> Best,
> Ryan
>
> On Aug 7, 2006, at 9:48 AM, hu andy wrote:
>
> > Martin, Thank you for your reply.
> >
> > But the Lucene API said:
> > This is called in an inner search loop. For good search performance,
> > implementations of this method should not call
> > Searcher.doc(int)<file:///E:/java/IR%20Library/lucene-1.9.1/
> > lucene-1.9.1/docs/api/org/apache/lucene/search/Searcher.html#doc
> > (int)>or
> > IndexReader.document(int)<file:///E:/java/IR%20Library/lucene-1.9.1/
> > lucene-1.9.1/docs/api/org/apache/lucene/index/
> > IndexReader.html#document(int)>on
> > every document number encountered
> >
> > Because I have to check a field in the document to determine whether I
> > should return the document. The total number of documents is about two
> > hundred thousand. So I'm afraid the
> > performance
> >
> >
> > 2006/8/7, Martin Braun <mbraun@uni-hd.de>:
> >>
> >> hi andy,
> >> > How can I  use HitCollector to iterate over every returned
> >> document?
> >>
> >> You have to override the function collect for the HitCollector
> >> class and
> >> then store the retrieved Data in an array or map.
> >>
> >> Here is just a source-code scratch (is = IndexSearcher)
> >>
> >>                is.search(query, null, new HitCollector(){
> >>                                public void collect(int docID,
> >> float score)
> >>                                    {
> >>                                        Document doc = is.doc(docID);
> >>                                        titles[docID] = doc.get
> >> ("title");
> >>                                    }
> >>                        });
> >>
> >>
> >> hth,
> >> martin
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message