lucene-java-user mailing list archives

From "Sindri Traustason" <sind...@vyre.com>
Subject RE: Should I reindex with remove then add or can I add then remove?
Date Mon, 03 Apr 2006 09:59:15 GMT
Hi again

I guess I'll have to rephrase the question.

Is it OK to store a document number returned by TermDocs.doc(), call
IndexWriter.addDocument(), and then perform an IndexReader.delete()
using that stored document number?

Sindri
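
To make the concern concrete, here is a toy model in plain Java. It is
not Lucene code: the class DocNumberSketch and all of its methods are
hypothetical, and it assumes that a segment merge during the add can
drop previously deleted documents and shift every later document to a
lower number. Under that assumption, a number stored before the add can
end up pointing at a different document, while a lookup done after the
add still finds the right one.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy index (NOT Lucene): a document number is just a position in a list. */
final class DocNumberSketch {
    private final List<String> ids = new ArrayList<>();
    // Parallel flags: true means "deleted but not yet expunged".
    private final List<Boolean> deleted = new ArrayList<>();

    /** Appends a document and returns its document number. */
    public int add(String id) {
        ids.add(id);
        deleted.add(false);
        return ids.size() - 1;
    }

    /** Marks a slot as deleted; the slot stays in place until merge(). */
    public void delete(int docNum) {
        deleted.set(docNum, true);
    }

    /** Returns the number of the first live document with this id, or -1. */
    public int find(String id) {
        for (int i = 0; i < ids.size(); i++) {
            if (!deleted.get(i) && ids.get(i).equals(id)) {
                return i;
            }
        }
        return -1;
    }

    /** Expunges deleted slots, so later documents move to lower numbers. */
    public void merge() {
        for (int i = ids.size() - 1; i >= 0; i--) {
            if (deleted.get(i)) {
                ids.remove(i);
                deleted.remove(i);
            }
        }
    }

    public String idAt(int docNum) {
        return ids.get(docNum);
    }

    public static void main(String[] args) {
        DocNumberSketch index = new DocNumberSketch();
        index.add("a");               // doc 0
        index.add("b");               // doc 1
        index.add("c");               // doc 2
        index.delete(0);              // an earlier deletion, still occupying slot 0

        int stored = index.find("b"); // old version of "b" is doc 1
        index.add("b");               // new version of "b" is doc 3
        index.merge();                // assumed: a merge happens during the add

        // The stored number now refers to a different document.
        System.out.println("stored=" + stored + " now holds " + index.idAt(stored));
        // Looking the old document up again after the add gives its current number.
        System.out.println("re-resolved=" + index.find("b"));
    }
}
```

In this model the stored number 1 ends up holding "c", and the lookup
repeated after the add returns 0. Whether the real index behaves this
way between my two readers is exactly what I am asking.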

> -----Original Message-----
> From: Sindri Traustason [mailto:sindrit@vyre.com] 
> Sent: 29 March 2006 17:19
> To: java-user@lucene.apache.org
> Subject: Should I reindex with remove then add or can I add 
> then remove?
> 
> Hi!
> 
> I have a question about how I should go about reindexing an 
> existing record in an index.
> 
> Currently my method that reindexes items is like this:
> 
> 	public void updateInIndex( Item item ) throws IOException{
> 		Document doc = ItemDocumentFactory.createDocument(item);
> 		// Remove the document from search index
> 		Term term = new Term(ItemDocumentFactory.ITEM, item.getId());
> 		getIndexReader();
> 		indexReader.delete(term);
> 		// Add the new document to search index
> 		getIndexWriter();
> 		indexWriter.addDocument(doc);
> 	}
> 
> getIndexReader closes the indexWriter field and opens 
> indexReader, and getIndexWriter does the reverse.  The 
> problem with this is that, between the delete and the add, 
> the given item is missing from the index (this can last 
> seconds for large items).
> 
> The suggested solution is like this:
> 
> 	public void updateInIndex( Item item ) throws IOException{
> 		Document doc = ItemDocumentFactory.createDocument(item);
> 		Term term = new Term(ItemDocumentFactory.ITEM, item.getId());
> 		getIndexReader();
> 		// Find the old document
> 		TermDocs termDocs = indexReader.termDocs(term);
> 		int docNum = -1;
> 		if(termDocs.next()){
> 			docNum = termDocs.doc();
> 		}
> 		getIndexWriter();
> 		indexWriter.addDocument(doc);
> 		getIndexReader();
> 		// Remove the document from search index
> 		if(docNum!=-1){
> 			indexReader.delete(docNum);
> 		}
> 	}
> 
> But what is frightening me here is the sentence "Clients 
> should thus not rely on a given document having the same 
> number between sessions." in 
> http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html
> 
> My index is only accessed from this class and this class is 
> private to a singleton thread that queues up index tasks 
> (add, remove, update, optimize).  
> 
> So the question is: Since I can guarantee nothing else is 
> updating the index can the second index reader be considered 
> to be the same session and therefore the docNum for the old 
> document still valid?
> 
> I have tested this considerably and it always seems to 
> work as intended.
> 
> Cheers! And thanks in advance.
> 
> Sindri Traustason
> Senior Software Engineer
> VYRE
> http://www.vyre.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
