lucene-java-user mailing list archives

From "Eric Jain" <Eric.J...@isb-sib.ch>
Subject Re: Mixing database and lucene searches
Date Tue, 11 May 2004 08:47:31 GMT
> If you *really* don't want to (or can't) put all the searchable fields
> into lucene, then you are going to need to do a "lucene-db" join.

Here are two good reasons:

1. Range queries
2. Sorting

Yes, Lucene can do both, but I find that in both cases the approach
Lucene uses is not suitable for large data sets, given limited hardware
resources.


> Hits hits = searcher.search(new TermQuery("text", "foo")
> Set hitPKs = new Set();
> for each doc in hits:
>    hitPKs.put(doc.getField("pk"))

Retrieving even one custom field for every document of a possibly large
data set can end up being very slow, it seems. This complicates things a
lot...

Unfortunately, I am not aware of any good solutions for combining Lucene
with a relational database, given the requirements listed above.
However, one promising approach may involve combining Lucene with the new
Berkeley DB JE:

1. Use Lucene to create a bitset of results (position = docid).
2. Use BDB to iterate through primary keys, sorted and restricted by one
   (or more?) of several criteria.
   3. For each primary key, look up the docid (this database must be
      rebuilt every time the index is modified).
   4. If the docid is set in the result bitset, report the result.
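
To make the steps above concrete, here is a rough sketch of the join loop
in plain Java. It is only an illustration under several assumptions: a
TreeMap stands in for the sorted BDB database, a HashMap stands in for the
primary key -> docid database, sort keys are assumed to be unique integers,
and the bitset is assumed to have been filled by a Lucene search already.
The class and method names are made up for this example; none of this is
actual Lucene or BDB JE API.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class BitsetJoinSketch {

    /**
     * Returns the primary keys whose sort key lies in [from, to] (inclusive),
     * in ascending sort-key order, keeping only documents set in the bitset.
     */
    static List<String> join(BitSet hits,
                             NavigableMap<Integer, String> pkBySortKey,
                             int from, int to,
                             Map<String, Integer> docIdByPk) {
        List<String> results = new ArrayList<>();
        // Step 2: iterate primary keys, sorted and restricted to a range.
        // (Hypothetical stand-in for a cursor over a sorted BDB database.)
        for (String pk : pkBySortKey.subMap(from, true, to, true).values()) {
            // Step 3: look up the docid for this primary key.
            Integer docId = docIdByPk.get(pk);
            // Step 4: report the key only if its docid is set in the bitset.
            if (docId != null && hits.get(docId)) {
                results.add(pk);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Step 1 (assumed done elsewhere): docids 0 and 2 matched the query.
        BitSet hits = new BitSet();
        hits.set(0);
        hits.set(2);

        // Made-up sample data: sort key (e.g. a year) -> primary key.
        NavigableMap<Integer, String> pkByYear = new TreeMap<>();
        pkByYear.put(1999, "pk-a");
        pkByYear.put(2001, "pk-b");
        pkByYear.put(2003, "pk-c");

        // Primary key -> docid; must be rebuilt whenever the index changes.
        Map<String, Integer> docIdByPk = new HashMap<>();
        docIdByPk.put("pk-a", 2);
        docIdByPk.put("pk-b", 1);
        docIdByPk.put("pk-c", 0);

        System.out.println(join(hits, pkByYear, 1990, 2010, docIdByPk));
    }
}
```

The point of this arrangement is that no stored field is ever read from the
Lucene index: the only per-result work is one key lookup and one bit test,
and results come out already sorted and range-restricted.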

If anyone has tried anything similar, I'd be interested to know!

