From: Otis Gospodnetic
Date: Thu, 31 Jan 2002 12:44:51 -0800 (PST)
To: Lucene Developers List <lucene-dev@jakarta.apache.org>
Subject: Re: Delete is not multi-thread safe
In-Reply-To: <3C59A65B.7080201@earthlink.net>
Message-ID: <20020131204451.45559.qmail@web12705.mail.yahoo.com>

--- Dmitry Serebrennikov wrote:
> Doug Cutting wrote:
>
> > How awkward is it to open a reader, delete a document, close it,
> > open a writer, add a document, and then close the writer? If
> > that's really too much work, we could add a utility method to
> > encapsulate it. However, if you're updating more than a single
> > document, it's much more efficient to first do all the deletions,
> > then do all the additions.
>
> That's just it - while you are busy re-crawling a web site (which
> can take some substantial time), there will exist a situation when
> the user will not find any documents from that web site - neither
> old, nor new.
> Maybe the answer is to re-crawl in a different index directory and
> then move the files...

That would be one way. Another way would be to write your crawling
application so that it adds newly crawled pages to the index in
batches (e.g. every X URLs crawled, take a moment to add the pages to
the index).

The ability to update documents in a single step would be great,
especially for applications such as web crawlers.

Otis
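
A rough sketch of the batched delete-then-add pattern discussed above,
against the 1.x-era Lucene API. The updateBatch method, the Page holder
class, and the "url"/"contents" field names are illustrative assumptions,
not existing Lucene code; IndexReader.delete(Term), IndexWriter, and the
Field.Keyword/Field.Text factories are meant to match the API of the
time, but should be checked against the version in use.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;

import java.io.IOException;
import java.util.List;

// Re-index a batch of crawled pages: first delete every old copy
// through an IndexReader, then add every new copy through an
// IndexWriter. "Page" is a hypothetical holder for a crawled URL
// and its text, not a Lucene class.
public class BatchUpdater {

    public static void updateBatch(String indexDir, List pages) throws IOException {
        // Pass 1: delete the old versions of every page in the batch.
        IndexReader reader = IndexReader.open(indexDir);
        try {
            for (int i = 0; i < pages.size(); i++) {
                Page page = (Page) pages.get(i);
                reader.delete(new Term("url", page.url));
            }
        } finally {
            reader.close();
        }

        // Pass 2: add the new versions. create=false so the existing
        // index is appended to rather than rebuilt.
        IndexWriter writer = new IndexWriter(indexDir, new StandardAnalyzer(), false);
        try {
            for (int i = 0; i < pages.size(); i++) {
                Page page = (Page) pages.get(i);
                Document doc = new Document();
                doc.add(Field.Keyword("url", page.url));
                doc.add(Field.Text("contents", page.text));
                writer.addDocument(doc);
            }
        } finally {
            writer.close();
        }
    }

    // Hypothetical crawled-page holder.
    static class Page {
        String url;
        String text;
        Page(String url, String text) { this.url = url; this.text = text; }
    }
}

With this shape, a crawler can call updateBatch every X URLs, so the
window during which a page is missing from the index is limited to one
small batch instead of the whole re-crawl.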