lucene-java-user mailing list archives

From "Stephane Nicoll" <stephane.nic...@gmail.com>
Subject Re: hybrid query (lucene + db)
Date Fri, 02 May 2008 15:33:33 GMT
I had a look at this but didn't find anything that corresponds to my problem.

Apparently there is a bug in Hibernate Search. If I run the same load
test on the same index with the same data but with direct access to the
Lucene API, I get much better performance (and no deadlock on
SegmentReader).

I will report the problem there.

Thanks,
Stéphane

On Thu, May 1, 2008 at 11:15 AM, mark harwood <markharw00d@yahoo.co.uk> wrote:
> The issue here is a general one of trying to perform an efficient join
>  between an external resource (RDBMS) and Lucene.
>  This experiment may be of interest:
>     http://issues.apache.org/jira/browse/LUCENE-434
>
>  KeyMap.java embodies the core service which translates from Lucene doc
>  ids to DB primary keys or vice versa.
>  There are a couple of implementations of KeyMap that are not optimal
>  (they pre-date Lucene's FieldCache) but they may give you food for
>  thought.
>
>  Cheers
>  Mark
>
>  ----- Original Message ----
>  From: Stephane Nicoll <stephane.nicoll@gmail.com>
>  To: java-user@lucene.apache.org
>  Sent: Thursday, 1 May, 2008 9:00:33 AM
>  Subject: hybrid query (lucene + db)
>
>  Hi there,
>
>  We're using Lucene with Hibernate Search and we're very happy so far
>  with the performance and the usability of Lucene. We do, however, have
>  a specific use case that prevents us from using only Lucene: spatial
>  queries. I already sent a mail to this list a while back about the
>  problem and we started investigating multiple solutions.
>
>  When the user selects a geographic area and some keywords we do the following:
>
>  * Perform a search on the Lucene index for the keywords with a
>  projection that returns only the primary key of each element, sorted
>  by primary key
>  * Perform a search on the database with the other criteria and a
>  projection that returns only the primary key of each element
>  * Iterate over both lists to find N matching IDs, optionally with
>  paging (matches X to X + N, where X is the first result of the page)
>  * Run a query on the database to return the actual objects (select a
>  from MyClass a where a.id IN (the list of matching IDs) ). We limit
>  the page to 1000 results
>
>  We have searched for a way to optimize the queries and to avoid
>  consuming too much memory, knowing that we must support paging.
>
>  With a single user a search by keywords takes 30 msec to complete, a
>  search by box takes 45 msec. With both (keywords + spatial area) it
>  takes 300 msec.
>
>  With 10 concurrent users, a search by keywords takes 150msec/user  but
>  for both it takes 3 sec/user !!!
>
>  I had the profiler running on this scenario and I've found that *all*
>  threads are waiting on org.apache.lucene.index.SegmentReader. I then
>  configured Hibernate Search to use a separate index reader per thread.
>  The deadlocks disappeared but it's still very slow (2.8sec).
>
>  Some questions:
>
>  * Does anyone know where the deadlocks on SegmentReader are coming from?
>  * Is the sorting on the primary keys a bad idea regarding performance
>  and memory usage?
>  * Does anyone have an idea how to perform this kind of hybrid query in
>  an efficient way?
>
>  I am using Lucene 2.3.1 and Hibernate Search 3.0.1. I already asked for
>  support on the Hibernate Search forum but did not get any answer so
>  far.
>
>  Thanks,
>  Stéphane
>
>  --
>  "Large Systems Suck: This rule is 100% transitive. If you build one,
>  you suck" -- S.Yegge
>
>  ---------------------------------------------------------------------
>  To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>  For additional commands, e-mail: java-user-help@lucene.apache.org
>
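
A minimal sketch of the merge step from the original message (iterating
over the two sorted primary-key lists to find N matching IDs, with
paging), assuming both sides return their keys as sorted long arrays. The
class and method names (SortedKeyIntersection, intersectPage) and the
sample keys are hypothetical and are not taken from the poster's code,
Lucene, or Hibernate Search:

```java
import java.util.ArrayList;
import java.util.List;

final class SortedKeyIntersection {

    /**
     * Walks two ascending arrays of primary keys in lockstep and returns
     * up to pageSize keys present in both, skipping the first `offset`
     * matches. Both inputs are assumed to be sorted and duplicate-free.
     */
    static List<Long> intersectPage(long[] luceneKeys, long[] dbKeys,
                                    int offset, int pageSize) {
        List<Long> page = new ArrayList<>();
        int i = 0;       // cursor into the keys returned by the index search
        int j = 0;       // cursor into the keys returned by the database
        int matched = 0; // number of common keys seen so far

        while (i < luceneKeys.length && j < dbKeys.length
                && page.size() < pageSize) {
            if (luceneKeys[i] < dbKeys[j]) {
                i++;                      // key only on the index side
            } else if (luceneKeys[i] > dbKeys[j]) {
                j++;                      // key only on the database side
            } else {
                if (matched >= offset) {  // past the skipped pages
                    page.add(luceneKeys[i]);
                }
                matched++;
                i++;
                j++;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        long[] fromLucene = {2, 3, 5, 8, 13, 21};
        long[] fromDb = {1, 3, 5, 7, 13, 21, 34};
        // Common keys are 3, 5, 13, 21; skip one match, take two.
        System.out.println(intersectPage(fromLucene, fromDb, 1, 2));
    }
}
```

Because both inputs are sorted, the walk is linear in the combined length
of the two lists and stops as soon as the page is full, so only the
current page of matches is held in memory. Running main prints [5, 13].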



-- 
"Large Systems Suck: This rule is 100% transitive. If you build one,
you suck" -- S.Yegge
