Date: Tue, 4 Jul 2006 09:57:23 -0500
From: "Matthew Turland"
To: java-user@lucene.apache.org
Subject: Re: Lucene and database
Message-ID: <55ed38ea0607040757y418aa5beo7f7e7be7b23abe8a@mail.gmail.com>
In-Reply-To: <5e966dac0607040749o21e0d212m780e3a652e8aa069@mail.gmail.com>

If I'm understanding you correctly, you're using Lucene to store IDs and
to index the columns that would normally be full-text indices in MySQL,
then using those IDs to retrieve the information from the database. The
slow "IN (...)" query is more of a MySQL issue than a Lucene issue, but
it suggests a flaw in your approach: why not simply store all the data
you want to retrieve in Lucene? If the data in your database changes,
you would have to rebuild your Lucene index anyway. My $0.02.

On 7/4/06, Alexander Mashtakov wrote:
>
> Hi folks,
>
> I'm looking for a solution/best practices concerning Lucene and SQL
> database integration.
> The database (MySQL) is already developed and contains data. I've
> tried MySQL full-text search, but it's quite slow and doesn't have the
> possibility to integrate custom analyzers.
> Phrase search is performed only in boolean mode and doesn't return a
> relevance factor :(
>
> The idea is to manage full-text indexes (titles, keywords, summaries)
> and perform search using Lucene. The result set will include IDs that
> will be appended to the SQL query in order to apply additional filters
> based on foreign keys (category mappings, etc).
>
> But the database is going to be big enough, and the list of IDs
> returned by Lucene too. This may cause high memory usage and slow SQL
> query speed (for instance 1000 IDs in an "IN (id1, id2 ...)" SQL
> filter).
>
> Any ideas, suggestions?
>
> --
> Thank you,
> /Alexander
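
For anyone who does keep the two-step approach described in the quoted
message (Lucene returns IDs, MySQL applies the foreign-key filters), one
common way to soften the large "IN (id1, id2 ...)" problem is to send
the IDs in fixed-size batches with bound parameters instead of one huge
literal list. The following is only a minimal sketch in plain Java: the
class name IdBatcher, the table "documents", and the columns "id",
"title" and "category_id" are all hypothetical, and nothing here comes
from the original poster's schema or from the Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: batches Lucene hit IDs for use in SQL IN (...) filters.
final class IdBatcher {

    // Splits the IDs into consecutive chunks of at most batchSize elements,
    // preserving the original (e.g. relevance) order.
    static List<List<Long>> partition(List<Long> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<Long>> batches = new ArrayList<>();
        for (int i = 0; i < ids.size(); i += batchSize) {
            int end = Math.min(i + batchSize, ids.size());
            batches.add(new ArrayList<>(ids.subList(i, end)));
        }
        return batches;
    }

    // Builds a parameterized query with one '?' per ID in the batch.
    // Table and column names are assumptions made for illustration only.
    static String inClauseSql(int idCount) {
        if (idCount <= 0) {
            throw new IllegalArgumentException("idCount must be positive");
        }
        StringBuilder sql = new StringBuilder(
            "SELECT id, title FROM documents WHERE category_id = ? AND id IN (");
        for (int i = 0; i < idCount; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append('?');
        }
        return sql.append(')').toString();
    }
}
```

Each batch would then be bound to a java.sql.PreparedStatement created
from inClauseSql(batch.size()), with the category value as the first
parameter and the IDs after it. Whether this is actually faster than a
single long IN list depends on the MySQL version and schema, so it
would need measuring; it does not replace the suggestion above of
storing the retrievable fields in Lucene itself.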