lucene-java-user mailing list archives

From "Glen Stampoultzis" <gst...@iinet.net.au>
Subject Re: Mixing database and lucene searches
Date Tue, 11 May 2004 13:28:06 GMT

"Eric Jain" <Eric.Jain@isb-sib.ch> wrote in message
news:007e01c43734$98fe3480$c300000a@caliente...
> > If you *really* don't want to (or can't) put all the searchable fields
> > into lucene, then you are going to need to do a "lucene-db" join.
>
> Here are two good reasons:
>
> 1. Range queries
> 2. Sorting
>
> Yes, Lucene can do both, but I find that in both cases the approach
> Lucene uses is not suitable for large data sets, given limited hardware
> resources.
>
>
> > Hits hits = searcher.search(new TermQuery(new Term("text", "foo")));
> > Set hitPKs = new HashSet();
> > // collect the stored primary key of every hit
> > for (int i = 0; i < hits.length(); i++)
> >    hitPKs.add(hits.doc(i).get("pk"));
>
> Retrieving even one custom field for every document of a possibly large
> data set can end up being very slow, it seems. This complicates things
> a lot...
>
> Unfortunately, I am not aware of any good solutions for combining Lucene
> with a relational database, given the requirements listed above.
> However, one promising approach may involve combining Lucene with the new
> Berkeley DB JE:
>
> 1. Use Lucene to create a bitset of results (position = docid).
> 2. Use BDB to iterate through primary keys, sorted and restricted by one
> (or more?) of several criteria.
>    3. For each primary key, look up docid (this database must be rebuilt
> every time the index is modified).
>    4. If docid set in result bitset, report result.
>
> If anyone has tried anything similar, I'd be interested to know!

Why Berkeley DB? This sounds like it would work regardless of the database.
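
To illustrate, here is a rough sketch of steps 1-4 with the key store
abstracted away. It is only a sketch under assumptions: the class name
BitSetJoin, the method join and the two maps are hypothetical, a TreeMap
stands in for whatever sorted store is used (BDB, or a SQL table read with
ORDER BY), and the BitSet is filled by hand where Lucene would normally set
one bit per matching docid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

class BitSetJoin {

    /**
     * Walks primary keys in sort-key order, restricted to [lo, hi], and
     * keeps only those whose docid is set in the Lucene result bitset.
     *
     * pksBySortKey: hypothetical sorted store (sort key -> primary keys)
     * docIdByPk:    hypothetical pk -> docid lookup, which has to be
     *               rebuilt whenever the index is modified
     */
    static List<String> join(BitSet hits,
                             NavigableMap<Integer, List<String>> pksBySortKey,
                             Map<String, Integer> docIdByPk,
                             int lo, int hi) {
        List<String> result = new ArrayList<String>();
        // Step 2: iterate primary keys, sorted and range-restricted.
        for (List<String> pks : pksBySortKey.subMap(lo, true, hi, true).values()) {
            for (String pk : pks) {
                // Step 3: look up the docid for this primary key.
                Integer docId = docIdByPk.get(pk);
                // Step 4: report the key only if Lucene matched that docid.
                if (docId != null && hits.get(docId)) {
                    result.add(pk);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Step 1 (stand-in): pretend Lucene matched docids 0, 2 and 3.
        BitSet hits = new BitSet();
        hits.set(0);
        hits.set(2);
        hits.set(3);

        // Sort key is a year; values are the primary keys with that year.
        NavigableMap<Integer, List<String>> byYear =
                new TreeMap<Integer, List<String>>();
        byYear.put(2001, Arrays.asList("A"));
        byYear.put(1999, Arrays.asList("B"));
        byYear.put(1998, Arrays.asList("C"));
        byYear.put(2004, Arrays.asList("D"));

        Map<String, Integer> docIdByPk = new HashMap<String, Integer>();
        docIdByPk.put("A", 0);
        docIdByPk.put("B", 1);
        docIdByPk.put("C", 2);
        docIdByPk.put("D", 3);

        // Range 1998..2001, ascending by year: prints [C, A]
        System.out.println(join(hits, byYear, docIdByPk, 1998, 2001));
    }
}
```

Nothing in join depends on where the sorted keys come from, so the same
loop should work over a JDBC ResultSet just as well as over a BDB cursor.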

Regards,

Glen






