lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Delete is not multi-thread safe
Date Thu, 31 Jan 2002 20:44:51 GMT

--- Dmitry Serebrennikov <dmitrys@earthlink.net> wrote:
> Doug Cutting wrote:
> 
> >
> >How awkward is it to open a reader, delete a document, close it, open a
> >writer, add a document, and then close the writer?  If that's really too
> >much work, we could add a utility method to encapsulate it.  However, if
> >you're updating more than a single document, it's much more efficient to
> >first do all the deletions, then do all the additions.
> >
> That's just it - while you are busy re-crawling a web site (which can
> take some substantial time), there will exist a situation when the user
> will not find any documents from that web site - neither old, nor new.
> Maybe the answer is to re-crawl in a different index directory and
> then move the files...

That would be one way.
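
A rough sketch of that directory-swap idea, assuming a modern JDK with
java.nio.file and no Lucene calls at all. The class name IndexSwap and the
three paths are hypothetical. It also assumes all directories sit on the
same file system and that nothing has the live index open during the swap.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: swaps a freshly built index directory into place.
final class IndexSwap {
    private IndexSwap() {}

    /**
     * Moves the live index to 'retired', then moves 'fresh' to 'live'.
     * Searches keep seeing the old index for the whole re-crawl; only the
     * short gap between the two renames has no index in place.
     */
    static void swap(Path live, Path fresh, Path retired) throws IOException {
        if (Files.exists(retired)) {
            // Refuse to clobber an earlier backup.
            throw new FileAlreadyExistsException(retired.toString());
        }
        // Step 1: move the old index out of the way, if there is one.
        if (Files.exists(live)) {
            Files.move(live, retired, StandardCopyOption.ATOMIC_MOVE);
        }
        // Step 2: move the newly crawled index into the live location.
        Files.move(fresh, live, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

The caller would build the new index in the 'fresh' directory while searches
continue against 'live', call swap(), and then reopen its searcher.
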
Another way would be to write your crawling application so that it adds
newly crawled pages to the index in batches (e.g. after every X URLs
crawled, pause to add those pages to the index).
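
The batching idea could be sketched roughly as below. This is plain Java
with no Lucene calls; BatchingIndexer and its flusher callback are
hypothetical names. The callback is where an application would do the
actual index update for one batch, for example all the deletions for those
pages first and then all the additions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: collects crawled pages and hands them on in batches.
final class BatchingIndexer<T> {
    private final int batchSize;
    private final Consumer<List<T>> flusher;
    private final List<T> pending = new ArrayList<>();

    BatchingIndexer(int batchSize, Consumer<List<T>> flusher) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.batchSize = batchSize;
        this.flusher = flusher;
    }

    /** Queues one crawled page; flushes once batchSize pages are waiting. */
    void add(T page) {
        pending.add(page);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Hands any queued pages to the flusher; call once more when the crawl ends. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Pass a copy so the callback cannot be affected by the clear() below.
        flusher.accept(new ArrayList<>(pending));
        pending.clear();
    }
}
```

With a batch size of X, only the pages in the batch currently being updated
are missing from the index at any moment, rather than the whole site.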

The ability to update documents in a single step would be great,
especially for applications such as web crawlers.

Otis

