lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Karl Wettin <karl.wet...@gmail.com>
Subject Re: Lucene index on relational data
Date Fri, 11 Apr 2008 18:13:34 GMT
Hi Rajesh,

I think you are looking for ParallelReader.

<http://lucene.apache.org/java/2_0_0/api/org/apache/lucene/index/ParallelReader.html>

public class ParallelReader
extends IndexReader

An IndexReader which reads multiple, parallel indexes. Each index added 
must have the same number of documents, but typically each contains 
different fields. Each document contains the union of the fields of all 
documents with the same document number. When searching, matches for a 
query term are from the first index added that has the field.

This is useful, e.g., with collections that have large fields which 
change rarely and small fields that change more frequently. The smaller 
fields may be re-indexed in a new index and both indexes may be searched 
together.

Warning: It is up to you to make sure all indexes are created and 
modified the same way. For example, if you add documents to one index, 
you need to add the same documents in the same order to the other 
indexes. Failure to do so will result in undefined behavior.



     karl

Rajesh parab skrev:
> Hi,
> 
> We are using Lucene 2.0 to index data stored inside
> relational database. Like any relational database, our
> database has quite a few one-to-one and one-to-many
> relationships. For example, let’s say an Object A has
> one-to-many relationship with Object X and Object Y.
> As we need to de-normalize relational data as
> key-value pairs before storing it inside Lucene index,
> we have de-normalized these relationships (Object X
> and Object Y) while building an index on Object A.
> 
> We have large no of such object relationships and most
> of the times, the related objects are modified more
> frequently than the base objects. For example, in our
> above case, objects X and Y are updated in the system
> very frequently, whereas Object A is not updated that
> often. Still, we will need to update Object A entries
> inside the index, every time its related objects X
> and/or Y are modified.
> 
> To avoid the above situation, we were thinking of
> having 2 separate indexes – first index will only
> index data of base objects (Object A in above example)
> and second index will contain data about its
> relationship objects (Object X and Y above), which are
> updated more frequently. This way, the more frequent
> updates to Object X and Y will only impact second
> index that stores relationship information and reduce
> the cost to re-index object A. However, I don’t think,
> MultiSearcher will be helpful if we want to search for
> data which spans across both indexes (e.g. some fields
> of Object A in first index and some fields of Object X
> or Y in second index).
> 
> Do we have any option in Lucene to handle such
> scenario? Can we search across multiple indexes which
> have some relationships between them and search for
> fields that span across these indexes?
> 
> Regards,
> Rajesh
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message