Subject: Re: modify existing non-indexed field
From: Doron Cohen
To: java-user@lucene.apache.org
Date: Fri, 7 Jul 2006 15:52:22 -0700
In-Reply-To: <5225348.post@talk.nabble.com>

> dan2000 wrote on 07/07/2006 15:03:35:
> but if you remove it and add it again, you'll need to re-index it again.
> don't you? When you do re-index, you'll have to close the reader, which
> would pause the search. Any better way of doing it?

IMHO yes and no.

There's no need to close the searcher (or the reader associated with the
searcher) in order to add that doc again. However, the reader that is used
for deleting a document must be closed before a writer can be opened to add
the updated document. Also, for this document to become visible to search,
a new searcher must be opened.

This does not necessarily mean "pausing the search". It is possible to open
a new searcher in the background and, when it is ready, switch to serving
new queries with the new searcher, then close (and free) the old searcher
once the queries it is still servicing have completed. Solr
(http://incubator.apache.org/solr) does something similar, and it also warms
newly created searchers.

If these non-indexed fields are updated a lot, perhaps a side data store
would be more adequate, i.e.
Lucene would store a unique docID (non-indexed field) for each document, and
at search time that ID would be used to look up the (updatable) values in
the side data store. This means additional code complexity and a search
performance penalty if many result documents are of interest for every
search.

The approach of updating the documents moves the overhead to indexing while
making search faster and the application code simpler. If this is the
selected approach, especially if the updates cannot be easily "batched"
(i.e. delete 1000 docs to be updated, add back those updated 1000 docs),
then the following may be of interest: a related proposal to "allow updating
documents with no need to repeatedly open/close reader/writer":
- http://issues.apache.org/jira/browse/LUCENE-565
- http://www.gossamer-threads.com/lists/lucene/java-dev/35317

Hope this helps,
Doron
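
For illustration, the background searcher switch described in the reply above (serve new queries from the new searcher, close the old one once its in-flight queries complete) could be sketched with simple reference counting. This is only a sketch under stated assumptions: `RefCounted` and `SearcherHolder` are hypothetical names that are not part of Lucene or Solr, and `java.io.Closeable` stands in for whatever searcher type the application uses.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical wrapper: counts users of one searcher, closes it at zero. */
final class RefCounted<S extends Closeable> {
    private final S searcher;
    // Starts at 1: the holder itself owns one reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    RefCounted(S searcher) { this.searcher = searcher; }

    S get() { return searcher; }

    /** Takes a reference unless the count already dropped to zero. */
    boolean tryIncRef() {
        while (true) {
            int n = refs.get();
            if (n == 0) return false;            // already closed
            if (refs.compareAndSet(n, n + 1)) return true;
        }
    }

    /** Drops a reference; the last one out closes the searcher. */
    void decRef() {
        if (refs.decrementAndGet() == 0) {
            try {
                searcher.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

/** Hypothetical holder that hands out the current searcher to queries. */
final class SearcherHolder<S extends Closeable> {
    private volatile RefCounted<S> current;

    SearcherHolder(S initial) { current = new RefCounted<>(initial); }

    /** Called by a query thread before searching. */
    RefCounted<S> acquire() {
        while (true) {
            RefCounted<S> c = current;
            if (c.tryIncRef()) return c;
            // Lost a race with swap(); loop and read the new current.
        }
    }

    /** Called by a query thread when its query has completed. */
    void release(RefCounted<S> c) { c.decRef(); }

    /** Called by the updater once the new searcher is open (and warmed). */
    synchronized void swap(S fresh) {
        RefCounted<S> old = current;
        current = new RefCounted<>(fresh);       // new queries see this one
        old.decRef();                            // drop the holder's reference
    }
}
```

A query thread would call `acquire()`, run its search on `get()`, and call `release()` in a `finally` block. The updating thread would open the new searcher after closing the writer, optionally warm it, and then call `swap()`. The old searcher is closed by whichever thread releases the last reference to it, so no query is paused or cut off.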