lucene-java-user mailing list archives

From Andy Goodell <good...@gmail.com>
Subject Re: lucene integration with relational database
Date Tue, 18 Jan 2005 19:22:39 GMT
I do these kinds of queries all the time.  I found that the fastest
performance for my collections (millions of documents) came from
subclassing Filter, building the Filter from the set of primary keys
returned by the database, and then running the query through the
Searcher.search(query, filter) interface.  I was previously using the
in-memory merge, but its memory requirements were crashing the JVM
when we had a lot of simultaneous users.

- andy g
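
A minimal sketch of the filter idea described above, not necessarily the
code used here. In the Lucene releases of that period, a Filter subclass
implements bits(IndexReader) and returns a java.util.BitSet with one bit
per document. To stay self-contained, the sketch below leaves Lucene out
and only shows how that bit set could be built from database keys. The
class name, the docIdByKey map (primary key to document number) and the
sample values are all hypothetical; in a real Filter the lookup would go
through the IndexReader instead of a map.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class PrimaryKeyFilterSketch {

    // Sets one bit per document whose primary key was returned by the database.
    // docIdByKey is an assumed stand-in for looking the key up in the index.
    static BitSet bitsForKeys(Set<Long> allowedKeys,
                              Map<Long, Integer> docIdByKey,
                              int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Long key : allowedKeys) {
            Integer docId = docIdByKey.get(key);
            if (docId != null) {
                bits.set(docId); // keys missing from the index are ignored
            }
        }
        return bits;
    }

    // Keeps only the ranked hits whose bit is set, preserving rank order.
    static List<Integer> applyFilter(int[] rankedHits, BitSet allowed) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : rankedHits) {
            if (allowed.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Hypothetical index: employee id -> document number.
        Map<Long, Integer> docIdByKey = new HashMap<>();
        docIdByKey.put(101L, 0);
        docIdByKey.put(205L, 3);
        docIdByKey.put(317L, 5);

        // Hypothetical result of the database query (e.g. age < 30).
        Set<Long> allowedKeys = new HashSet<>();
        allowedKeys.add(205L);
        allowedKeys.add(317L);
        allowedKeys.add(999L); // present in the database, absent from the index

        BitSet bits = bitsForKeys(allowedKeys, docIdByKey, 8);

        // Hypothetical text-search hits, best score first.
        int[] rankedHits = {5, 0, 3};
        System.out.println(applyFilter(rankedHits, bits)); // prints [5, 3]
    }
}
```

The bit set costs one bit per document in the index, however many keys
the database returns, which is one plausible reason this approach needs
less memory per request than merging two key lists in memory.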


On Sat, 15 Jan 2005 23:03:00 +0530, sunil goyal <sunilgoyal@gmail.com> wrote:
> Hi all,
> 
> Thanks for the answers. I was looking for a best-practice guide for doing
> the same. If anyone already has practical experience with
> this kind of query, it would be great to hear their thoughts.
> 
> Thanks
> 
> Regards
> Sunil
> 
> 
> On Sat, 15 Jan 2005 09:00:35 -0800, jian chen <chenjian1227@gmail.com> wrote:
> > Hi,
> >
> > Still minor additions to the steps:
> >
> > 1) do lucene query and get the hits (keyed by the database primary
> > key, for example, employee id)
> >
> > 2) do database query and get the primary keys (i.e., employee id) for
> > the result rows, ordered by primary key
> >
> > 3) for each lucene query result, look into db query result and see if
> > the primary key is there (the db query result is already sorted by
> > primary key, so a binary search could be applied)
> >
> > if the primary key is there, store this result, else, discard it
> >
> > 4) when top k results are obtained, send back to the user.
> >
> > How does this sound?
> >
> > Cheers,
> >
> > Jian
> >
> > On Sat, 15 Jan 2005 08:36:16 -0800, jian chen <chenjian1227@gmail.com> wrote:
> > > Hi,
> > >
> > > To further the discussion. Would the following detailed steps work:
> > >
> > > 1) do lucene query and get the hits (keyed by the database primary
> > > key, for example, employee id)
> > >
> > > 2) do database query and get the primary keys (i.e., employee id) for
> > > the result rows, ordered by primary key
> > >
> > > 3) merge the two sets of primary keys (for example, in memory two-way
> > > merge) and take the top k records
> > >
> > > 4) display the top k result rows
> > >
> > > Cheers,
> > >
> > > Jian
> > >
> > > > On Sat, 15 Jan 2005 12:40:04 +0000, Peter Pimley <ppimley@semantico.com> wrote:
> > > > sunil goyal wrote:
> > > >
> > > > > >But can i do for instance a unified query where i want to take certain
> > > > > >parameters (non-textual e.g. age < 30 ) from relational databases and
> > > > > >keywords from the lucene index ?
> > > > >
> > > > >
> > > > >
> > > > When I have had to do this, I've done the lucene search first, and then
> > > > manually filtered out the hits that fail on other criteria.
> > > >
> > > > I'd suggest doing that first (as it's easiest) and then seeing whether
> > > > the performance is acceptable.
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > > > For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> > > >
> > > >
> > >
> >
> >
> >
> 
> 
>
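
For comparison, the per-hit check from the quoted steps above (step 3 of
jian chen's revised list) can be sketched as follows. This is only an
illustration under assumptions: the class and method names are
hypothetical, the primary keys are taken to be numeric, and the database
keys are assumed to arrive already sorted (ORDER BY on the primary key).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedKeyCheckSketch {

    // Walks the text-search hits in rank order and keeps a hit only if its
    // primary key appears in the sorted database result; stops after k hits.
    static List<Long> topK(long[] rankedHitKeys, long[] sortedDbKeys, int k) {
        List<Long> kept = new ArrayList<>();
        for (long key : rankedHitKeys) {
            if (kept.size() >= k) {
                break; // top k reached, remaining hits need not be checked
            }
            // Arrays.binarySearch requires sortedDbKeys to be in ascending order.
            if (Arrays.binarySearch(sortedDbKeys, key) >= 0) {
                kept.add(key);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        long[] rankedHitKeys = {42L, 7L, 19L, 3L}; // hypothetical hits, best first
        long[] sortedDbKeys = {3L, 7L, 8L, 42L};   // hypothetical db result, sorted
        System.out.println(topK(rankedHitKeys, sortedDbKeys, 2)); // prints [42, 7]
    }
}
```

Each lookup costs O(log n) in the size of the database result, and the
loop can stop as soon as k hits have been kept, but the whole sorted key
array still has to be held in memory for every request.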


