From: Giulio Cesare Solaroli
To: Lucene Developers List
Date: Fri, 16 Jul 2004 14:31:11 +0200
Subject: Re: Deleting a document with an IndexWriter open
In-Reply-To: <40F7C11B.7020803@detego-software.de>

Hi Christoph,

On Fri, 16 Jul 2004 13:50:51 +0200, Christoph Goller wrote:
> [snip on good reasons why an IndexWriter can not delete
> documents]
>
> If you want to do several updates at the same time, the most efficient
> way would be to:
>
> 1) Keep an IndexReader/Searcher open on your index in order to guarantee
> read access and a consistent index during the whole process.
>
> 2) Open a new IndexReader and delete all the documents that you want to
> update.

This is the main problem: in my current arrangement, it is quite
difficult to find out in advance which documents need to be updated.
It would be much easier to check, for each document as it is processed,
whether it is a new entry or one already present in the index, and thus
needs to be updated rather than inserted.

I can try to find a better way to build the list of updated documents,
but I was hoping to solve this problem by a different route.

[...]

> > Do you confirm my idea that keeping an IndexWriter open as much as
> > possible while indexing a batch of documents is a "good thing"?
>
> Yes. IndexWriter works with a RAMDirectory as cache. If you close
> it after each document and open a new one, you enforce unnecessary
> write operations to your hard disk.
>
> > Is there any option to ever see a deleteDocument method in the
> > IndexWriter class?
>
> Probably not. I guess you either have to update every document separately
> as described in your email (open and close a reader and writer for each
> document), or do it in the way I describe above (more efficient).

I am not competent enough to suggest a solution for this problem, but I
hope that developers with the required knowledge will take this option
into consideration for future versions of Lucene, as it would really
simplify some tasks for users.