From: Mathieu Lecarme
To: java-user@lucene.apache.org
Subject: Re: Lucene index on relational data
Date: Fri, 11 Apr 2008 23:40:38 +0200
Message-Id: <1C113016-168D-41E0-8045-5EE58CF686B8@garambrogne.net>
In-Reply-To: <563358.35242.qm@web52006.mail.re2.yahoo.com>

On 11 Apr 2008, at 19:29, Rajesh parab wrote:

> Thanks for these pointers Mathieu.
>
> We have earlier looked at Compass, but the main issue
> with database index is DB vendor support for BLOB
> locator. I understand that Oracle provides this
> support to get the partial data from BLOB, but I guess
> similar support is not available in SQL Server and
> DB2. Our application currently supports all these 3
> databases.

You misunderstood something. Compass can use a JDBC index, but that is
only an option; the classical file-based index is available too. The
other specific index stores are GigaSpaces and Terracotta, for
clustered environments.

> Secondly I am reading that search performance degrades
> drastically with database index.

You can build a Filter from a JDBC query and combine it with a Lucene
search. If your JDBC query uses too many joins, it will be slow, so
your Lucene search, which waits for its Filter, will be slow too.
Building a Filter from a Set of ids is not slow.

> Will it be possible to partition data like main index
> and relationship index using File System Lucene index
> and search across these indexes?

Yes. You can index unfolded (de-normalized) data, which takes a lot of
space, or run two queries against two indexes. The first query builds
a Filter for the second, just as in the JDBC example above. You can
even cache the filter, as Solr does with its faceted search.

M.

>
> Regards,
> Rajesh
>
> --- Mathieu Lecarme wrote:
>
>> Have a look at Compass 2.0M3
>> http://www.kimchy.org/searchable-cascading-mapping/
>>
>> Your multiple index will be nice for massive write.
>> In a classical read/write ratio, Compass will be much easier.
>>
>> M.
>>
>> Rajesh parab wrote:
>>> Hi,
>>>
>>> We are using Lucene 2.0 to index data stored inside a
>>> relational database. Like any relational database, our
>>> database has quite a few one-to-one and one-to-many
>>> relationships. For example, let’s say an Object A has a
>>> one-to-many relationship with Object X and Object Y.
>>> As we need to de-normalize relational data as
>>> key-value pairs before storing it inside the Lucene index,
>>> we have de-normalized these relationships (Object X
>>> and Object Y) while building an index on Object A.
>>>
>>> We have a large number of such object relationships and most
>>> of the time, the related objects are modified more
>>> frequently than the base objects. For example, in our
>>> above case, objects X and Y are updated in the system
>>> very frequently, whereas Object A is not updated that
>>> often. Still, we will need to update Object A entries
>>> inside the index every time its related objects X
>>> and/or Y are modified.
>>>
>>> To avoid the above situation, we were thinking of
>>> having 2 separate indexes: the first index will only
>>> index data of base objects (Object A in the above example)
>>> and the second index will contain data about its
>>> relationship objects (Object X and Y above), which are
>>> updated more frequently. This way, the more frequent
>>> updates to Object X and Y will only impact the second
>>> index that stores relationship information and reduce
>>> the cost to re-index Object A. However, I don’t think
>>> MultiSearcher will be helpful if we want to search for
>>> data which spans across both indexes (e.g. some fields
>>> of Object A in the first index and some fields of Object X
>>> or Y in the second index).
>>>
>>> Do we have any option in Lucene to handle such a
>>> scenario? Can we search across multiple indexes which
>>> have some relationships between them and search for
>>> fields that span across these indexes?
>>>
>>> Regards,
>>> Rajesh

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
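
The two-query idea from the reply (the first query builds a Filter for
the second) can be sketched without any Lucene dependency. This is only
an illustration under stated assumptions, not Lucene's or Compass's API:
the class and method names are hypothetical, the two "indexes" are plain
in-memory collections, and a java.util.BitSet stands in for the set of
allowed document numbers that a Lucene 2.x Filter would hand to the
searcher.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, library-free sketch of "first query builds a filter for the second". */
final class TwoIndexFilterSketch {

    /**
     * Step 1: query the relationship "index".
     * relationIndex maps a term to the ids of the base objects (Object A)
     * that own a related object (X or Y) containing that term.
     */
    static Set<String> parentIdsMatching(Map<String, Set<String>> relationIndex, String term) {
        Set<String> ids = relationIndex.get(term);
        return ids == null ? new HashSet<String>() : ids;
    }

    /**
     * Step 2: turn the id set into a bit set over main-index document numbers.
     * mainDocIds.get(doc) is the base-object id stored in document number doc.
     */
    static BitSet filterFromIds(List<String> mainDocIds, Set<String> allowedIds) {
        BitSet bits = new BitSet(mainDocIds.size());
        for (int doc = 0; doc < mainDocIds.size(); doc++) {
            if (allowedIds.contains(mainDocIds.get(doc))) {
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Step 3: keep only the main-index hits whose bit is set in the filter. */
    static List<Integer> applyFilter(List<Integer> mainHits, BitSet filter) {
        List<Integer> kept = new ArrayList<Integer>();
        for (int doc : mainHits) {
            if (filter.get(doc)) {
                kept.add(doc);
            }
        }
        return kept;
    }
}
```

Because the bit set depends only on the relationship-side query, it can
be stored and reused until the relationship index changes, which is the
caching idea mentioned in the reply. The same steps apply when the id
set comes from a JDBC query instead of a second index.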