lucene-java-user mailing list archives

From John Song <delphi...@yahoo.com>
Subject efficient ways of updating document
Date Fri, 05 Jan 2007 05:53:12 GMT
It seems to me that updating a document in Lucene is rather tedious and slow, especially
when updating a large number of documents.  Before opening an IndexWriter to add documents,
one has to open an IndexReader/IndexSearcher to search for the document with a particular id
and, upon finding its docnum, delete it.  One then closes the reader and opens the writer to
add the updated content, moves on to the next item, and repeats the process.  For a large
number of updates, the performance is bad.

One way to speed this up is to avoid repeating the whole process for every single update.
If one has a large number of updates, open the reader/searcher first.  But instead of
searching for and deleting one document and then closing the reader, search for and delete
all of them, and only then close the reader/searcher.  Then one can open a writer and do a
single batch add.
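
To make the difference concrete, here is a rough sketch in plain Java.  It is only a toy
model and not the Lucene API: ToyBatchIndex and its methods are made-up names, and the
"open" steps are just counters standing in for opening an IndexReader or IndexWriter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy stand-in for an index; NOT the Lucene API. It only counts how many
// times a "reader" or a "writer" session would have to be opened.
class ToyBatchIndex {
    private final Map<String, String> docs = new LinkedHashMap<>();
    int readerOpens = 0;
    int writerOpens = 0;

    // Per-document update: one reader session and one writer session per item.
    void updateOneByOne(Map<String, String> updates) {
        for (Map.Entry<String, String> e : updates.entrySet()) {
            readerOpens++;                       // open reader, find by id, delete, close
            docs.remove(e.getKey());
            writerOpens++;                       // open writer, add new version, close
            docs.put(e.getKey(), e.getValue());
        }
    }

    // Batched update: all deletes in one reader session, all adds in one writer session.
    void updateInBatch(Map<String, String> updates) {
        readerOpens++;                           // open reader once
        for (String id : updates.keySet()) {
            docs.remove(id);                     // delete every old version
        }
        writerOpens++;                           // open writer once
        docs.putAll(updates);                    // batch add
    }

    String get(String id) { return docs.get(id); }

    int size() { return docs.size(); }
}
```

In this model, N updates cost N reader opens and N writer opens when done one by one, but
only one of each when batched.  The end state of the index is the same either way.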

However, I wonder whether we could change the writer to do updates automatically.  An update
would consist of marking the old document invalid, without removing it, and adding the new
document.  We can clean things up later on.  With this method, there is no need to switch
between reader and writer during an update.
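
A minimal sketch of what I have in mind, again as a toy in-memory model in plain Java and
not actual Lucene code.  ToyAppendOnlyIndex, update, and cleanUp are hypothetical names, and
the id-to-slot map is my own assumption about how the writer would locate the old document.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy append-only index; NOT the Lucene API. An update marks the old
// version invalid and appends the new one; removal is deferred to cleanUp().
class ToyAppendOnlyIndex {
    private static final class Slot {
        final String id;
        final String content;
        boolean deleted;

        Slot(String id, String content) {
            this.id = id;
            this.content = content;
        }
    }

    private final List<Slot> slots = new ArrayList<>();
    private final Map<String, Integer> liveSlot = new HashMap<>(); // id -> position of live version

    // Mark the old version (if any) invalid, then append the new version.
    void update(String id, String content) {
        Integer old = liveSlot.get(id);
        if (old != null) {
            slots.get(old).deleted = true;
        }
        slots.add(new Slot(id, content));
        liveSlot.put(id, slots.size() - 1);
    }

    // Returns the live content for an id, or null if the id is unknown.
    String get(String id) {
        Integer pos = liveSlot.get(id);
        return pos == null ? null : slots.get(pos).content;
    }

    // Number of stored versions, including ones marked invalid.
    int physicalSize() { return slots.size(); }

    // Deferred cleanup: drop the invalid versions and rebuild the positions.
    void cleanUp() {
        List<Slot> live = new ArrayList<>();
        liveSlot.clear();
        for (Slot s : slots) {
            if (!s.deleted) {
                liveSlot.put(s.id, live.size());
                live.add(s);
            }
        }
        slots.clear();
        slots.addAll(live);
    }
}
```

Updating id "42" twice here leaves two stored versions with only the newer one visible
through get; cleanUp later drops the stale one.  The caller never has to switch between a
reader and a writer.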


john





