Date: Wed, 20 Dec 2006 10:58:43 -0500
From: "Erick Erickson"
To: java-user@lucene.apache.org
Subject: Re: MultiFieldQueryParser doesn't properly filter out documents when the query string specifies to exclude certain terms

My first question is: how many documents would you be deleting on a pass for option 2? If it's 10 documents out of 10,000, I'd consider just deleting them and re-adding them (see IndexModifier).

Personally, if possible, I prefer your first option: building a completely new index and switching between them. This is especially useful if something catastrophic happens to the index as you build it and it winds up being unusable (power failures *do* happen). You can keep using your old index and be happy.

Another question is how quickly the index builds, and how soon your users need up-to-date data. And remember that no matter what, you must re-open your searcher to see the updates.

I'd be really reluctant to remove all the items and re-build the index, for several reasons...

1> You wouldn't get the new data being added until you closed/reopened your searcher.
2> The documents you deleted wouldn't be "gone" until you closed/reopened your searcher.
3> In the interim, your users wouldn't have access to much of anything....

Best
Erick

On 12/20/06, Adam Fleming wrote:
>
> Hello Gentlemen (+Ladies?),
>
> I'm integrating Lucene into a Spring web-app, and have found a plethora of
> great web + print resources to make the integration quick and seamless.
> One thing that I have been hard-pressed to find is a good solution for
> rebuilding the index on a regular basis.
>
> I'm curious if you know of a best practice (or have found something
> personally that works) for rebuilding a Lucene index w/o service
> interruptions. The assumptions are a Spring IoC container w/ an
> IndexFactory bean. I have the project configured to work with both
> FSDirectory and RAMDirectory implementations. If you don't know Spring,
> you are free to ignore the details - I'll adapt your comments to my code :)
>
> So far I tried rebuilding the index on a regular schedule, but foolishly
> only added duplicate documents to an existing index.
>
> Things I have considered are
> - Using two index directories, and rebuilding one while the other is
>   in use + switching when the rebuilt index is ready. This would
>   cause the app to alternate between two indexes.
> - Using a single index, and iterating over the index entirely,
>   deleting documents 1 by 1 and re-adding them with fresh data
> - Using a single index, and deleting ALL the documents at once
>   and then adding them all back as quickly as possible.
>
> All of my proposed ideas seem to fly in the face of Lucene's simplicity, and
> I will be so thankful to be pointed in the right direction.
>
> Happy Holidays and a big Thank You to the active list users,
>
> Adam Fleming
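
A minimal sketch of the two-directory approach from this thread (build the new index off to the side, then switch), assuming a hypothetical IndexSwitcher holder that is not part of Lucene. The type parameter S stands in for whatever searcher handle the application uses, and the builder is an assumed callback that builds the fresh index and opens a searcher on it.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not a Lucene class: holds the "live" searcher handle
// and swaps in a freshly built one only after the rebuild has succeeded.
final class IndexSwitcher<S> {
    private final AtomicReference<S> live;

    IndexSwitcher(S initial) {
        this.live = new AtomicReference<>(initial);
    }

    // Searches always go through whichever handle is currently live.
    S current() {
        return live.get();
    }

    // Runs the builder first. If it throws, the exception propagates and the
    // old handle stays live. On success, the new handle becomes live and the
    // previous one is returned so the caller can close it.
    S rebuildAndSwap(Supplier<S> builder) {
        S fresh = builder.get();
        return live.getAndSet(fresh);
    }
}
```

If the builder throws (say, a power failure leaves the new index unusable), the old handle keeps serving searches. Closing the handle returned by rebuildAndSwap, once in-flight searches have finished, is left to the caller.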