lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gerard Sychay" <Gerard.Syc...@cchmc.org>
Subject Re: Mixing database and lucene searches
Date Tue, 11 May 2004 15:26:06 GMT
>>> "Eric Jain" <Eric.Jain@isb-sib.ch> 05/11/04 04:47AM >>>
> > Hits hits = searcher.search(new TermQuery("text", "foo")
> > Set hitPKs = new Set();
> > for each doc in hits:
> >    hitPKs.put(doc.getField("pk"))
> 
> Retrieving even one custom field for every document of a possibly
large
> data set
> can end up being very slow, it seems. This complicates things a
lot...

Glen, I don't know your application specifics, but if you are paging
results, there is no need to retrieve all the primary keys at once.  I
had a similar problem.  I ended up doing the following:

- Store ONLY the primary keys in the index.  Ideally, you only  need
two fields per Lucene Document: the tokenized text to be searched, and
stored corresponding primary key.
- Upon searching, get all the Hits as normal, say you get 10000 hits.
- But the first page only displays first 10 hits, so retrieve first 10
primary keys from the Hits, use these to form a SQL query and retrieve
any info you need from the DB.  This way, you only handling 10 documents
at a time, or howevery many per page.
In actual use, this is very fast.

HTH

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message