From: Adam Fleming
Subject: RE: Rebuilding index on a regular basis
Date: Thu, 21 Dec 2006 17:58:33 -0800

Hi Erick,

Thanks for the suggestion of using 2 indexes. The number of documents is
small - about 2000, and it builds quickly - about 3s from a database.
I am currently trying to rebuild every 2 minutes, but could probably reduce
that to every 5 minutes. That could be as long as 10 minutes, but that's
about the limit.

Thanks,
Adam

----------------------------------------
> Date: Wed, 20 Dec 2006 10:58:43 -0500
> From: erickerickson@gmail.com
> To: java-user@lucene.apache.org
> Subject: Re: MultiFieldQueryParser doesn't properly filter out documents when the query string specifies to exclude certain terms
>
> My first question is how many documents would you be deleting on a pass for
> option 2? If it's 10 documents out of 10,000, I'd consider just deleting
> them and re-adding (see IndexModifier).
>
> Personally, if possible, I prefer your first option, building a completely
> new index and switching between them. This is especially useful if something
> catastrophic happens to the index as you build it and it winds up being
> unusable (power failures *do* happen). You can keep using your old index and
> be happy.
>
> Another question is how quickly the index builds and how soon do your users
> require that they get up-to-date data?
>
> And remember that no matter what, you must re-open your searcher to see the
> updates.
>
> I'd be really reluctant to remove all the items and re-build the index for
> several reasons...
> 1> You wouldn't get the new data being added until you closed/reopened your
> searcher.
> 2> The documents you deleted wouldn't be "gone" until you closed/reopened
> your searcher.
> 3> In the interim, your users wouldn't have access to much of anything....
>
> Best
> Erick
>
> On 12/20/06, Adam Fleming wrote:
> >
> >
> > Hello Gentlemen (+Ladies?),
> >
> > I'm integrating Lucene into a Spring web-app, and have found a plethora of
> > great web + print resources to make the integration quick and seamless. One
> > thing that I have been hard-pressed to find is a good solution for
> > rebuilding the index on a regular basis.
> >
> > I'm curious if you know of a best practice (or have found something
> > personally that works) for rebuilding a Lucene index w/o service
> > interruptions. The assumptions are a Spring IoC container w/ an
> > IndexFactory bean. I have the project configured to work with both
> > FSDirectory and RAMDirectory implementations. If you don't know Spring,
> > you are free to ignore the details - I'll adapt your comments to my code :)
> >
> > So far I tried rebuilding the index on a regular schedule, but foolishly
> > only added duplicate documents to an existing index.
> >
> > Things I have considered are
> > - Using two index directories, and rebuilding one while the other is
> >   in use + switching when the rebuilt index is ready. This would
> >   cause the app to alternate between two indexes.
> > - Using a single index, and iterating over the index entirely,
> >   deleting documents 1 by 1 and re-adding them with fresh data
> > - Using a single index, and deleting ALL the documents at once
> >   and then adding them all back as quickly as possible.
> >
> >
> > All of my proposed ideas seem to fly in the face of Lucene's simplicity, and
> > I will be so thankful to be pointed in the right direction.
> >
> >
> > Happy Holidays and a big Thank You to the active list users,
> >
> >
> > Adam Fleming
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
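
For reference, the two-index approach discussed in this thread (build a complete new index off to the side, keep serving the old one, switch only when the new one is ready) could be sketched roughly as below. This is a plain-Java stand-in, not Lucene API: the class name SwappableIndex, the methods current() and rebuild(), and the type parameter T are all hypothetical, with T standing for whatever object the application searches against (for example a searcher opened over a freshly built directory).

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical holder for the "live" index snapshot.
 * T is a placeholder for the searchable object; it is not a Lucene type.
 */
class SwappableIndex<T> {
    private final AtomicReference<T> live;

    SwappableIndex(T initial) {
        this.live = new AtomicReference<>(initial);
    }

    /** Searches always read whichever snapshot is currently live. */
    T current() {
        return live.get();
    }

    /**
     * Builds a complete new snapshot, then swaps it in atomically.
     * If the builder throws, the old snapshot stays live and false is returned,
     * so a failed rebuild never leaves users without an index.
     */
    boolean rebuild(Supplier<T> builder) {
        T fresh;
        try {
            fresh = builder.get();   // e.g. index ~2000 rows from the database
        } catch (RuntimeException e) {
            return false;            // keep serving the old snapshot
        }
        live.set(fresh);             // readers see the new snapshot from here on
        return true;
    }
}
```

A scheduled task (every 2 to 10 minutes, per the numbers above) would call rebuild() with a builder that writes into a fresh location rather than the existing index, which also avoids the duplicate-document problem. With real Lucene objects the retired searcher would still need to be closed once in-flight searches finish; this sketch does not handle that.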