lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: searching in 2 indexes
Date Mon, 15 Dec 2008 16:46:45 GMT
Stop it right now <G>. You've gotta take off your DB hat and put
on your searching hat to get the most out of Lucene. So I'd
think about the following:

1> Why do you have two indexes? Why not just put all
     the data into a single index? The fields are disjoint anyway....
     Note that there is no requirement that all documents in an
     index have the same fields either, if that makes things easier....

2> Disk space is cheap. Very cheap. I know it goes against the
     grain to de-normalize your data, but think about doing just that.
     The idea is to be able to submit a single *search* where
     each document returned is complete rather than thinking in
     terms of joins etc....

3> Storing and indexing are two separate concepts. Your index (at
     least the searchable part) won't grow if you store (but don't index)
     lots of data. So if you need to pile a bunch of junk into your records
     but *not* search them, having a humongous index where most
     of it is simply stored isn't nearly as costly as you might think.

4> Some of this depends upon how much data you're talking about. If
     both indexes total 10M, there's no reason in the world to keep them
     separate. If they total 100G, that's another story. Some more
     details would be helpful.

5> I almost guarantee that if you've merely translated database tables
     into Lucene indexes on a one-for-one basis, you won't be very
     satisfied with the results........
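
To make points 1> and 2> a bit more concrete, here is a minimal sketch in
plain Java (no Lucene calls at all) of what de-normalizing at index time
might look like: the fields from both sources are combined into one record
per INDEXID, and only IDs present in both sources are kept, which matches
the "A *and* B" requirement in the original question. The class name, the
method name mergeByIndexId and the field names "subject" and "folder" are
made up for illustration; the merged record is what you would then turn
into a single Lucene document.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class Denormalize {
    /**
     * Combine the fields of two sources, each keyed by INDEXID, into one
     * record per ID. IDs missing from either source are skipped.
     */
    static Map<String, Map<String, String>> mergeByIndexId(
            Map<String, Map<String, String>> a,
            Map<String, Map<String, String>> b) {
        Map<String, Map<String, String>> merged = new TreeMap<>();
        for (Map.Entry<String, Map<String, String>> e : a.entrySet()) {
            Map<String, String> other = b.get(e.getKey());
            if (other == null) {
                continue; // ID only exists in source A: not a matching pair
            }
            Map<String, String> doc = new LinkedHashMap<>();
            doc.put("INDEXID", e.getKey());
            doc.putAll(e.getValue()); // fields from source A
            doc.putAll(other);        // fields from source B (assumed disjoint)
            merged.put(e.getKey(), doc);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Hypothetical sample data standing in for the two indexes.
        Map<String, Map<String, String>> a = Map.of(
                "42", Map.of("subject", "hello"),
                "43", Map.of("subject", "orphan"));
        Map<String, Map<String, String>> b = Map.of(
                "42", Map.of("folder", "inbox"));
        System.out.println(mergeByIndexId(a, b));
    }
}
```

With records built this way, one search against one index returns complete
documents, and the join happens once at index time instead of on every query.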

Best
Erick

On Mon, Dec 15, 2008 at 11:33 AM, Chris Bamford <chris.bamford@scalix.com> wrote:

> Hi
>
> I have a situation where I have two related indexes which are logically
> linked by a common field called INDEXID. All other fields differ between the
> two indexes. For any given INDEXID I would like to be able to retrieve the
> matching pair of documents, one from each index. (Logically this is an AND,
> i.e. only return anything if there is a document with INDEXID X in index
> A *and* in index B.)
>
> Is there a nifty way to do this with a single query or must I first search
> one, then the other?
> I thought perhaps MultiSearcher might do it, but now I'm not so sure ...
>
> Thanks...
>
> - Chris
>
