lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matt Read" <mr...@dircon.co.uk>
Subject FW: Segments not merging on delete
Date Wed, 24 Oct 2001 17:35:02 GMT
Hi, I'm having some problems with segments failing to merge when deleting
documents from the index. I've followed the recommendations in the FAQ for
avoiding a complete index rebuild when adding a new document, i.e. I'm
deleting the document from the index and the re-adding it. The index however
is growing even if I just replace the same document repeatedly. And although
the reader.numDocs() method returns the correct value the reader.docFreq()
for each term is increasing.

I'd appreciate any help. Thanks.

When I index a document, I have this code:

// check if it's already there and if it is, delete it
fsd = FSDirectory.getDirectory(indexPath, false);
reader = IndexReader.open(fsd);
Term t = new Term("original_path", f.getPath());
if (reader.delete(t) > 0) {
	out.println("This file was already in the index, replacing it.<br>");
}

// add document to index
writer = new IndexWriter(indexPath, new MyAnalyzer(), false);
Document doc = new Document();
doc.add(Field.Keyword("original_path", f.getPath()));
doc.add(Field.Text("filename", f.getName()));
doc.add(Field.Text("description", "a description"));
writer.addDocument(doc);

// clean up
fsd.close();
reader.close();
writer.close();

I then have a class to examine the documents in the index. When I first add
a document it appears correctly, however as I re-add (delete then add
documents as above), the reader.isDeleted() flag remains set for each of the
re-added documents which would be ok assuming segments have not been merged
yet but the re-added documents do not appear anywhere in the
reader.documents(i) collection. The code to examine is as follows:

fsd = FSDirectory.getDirectory(indexPath, false);
reader = IndexReader.open(fsd);

// show document details
for (int i = 0; i < reader.numDocs(); i ++) {
	if (!reader.isDeleted(i)) {
		Document d = reader.document(i);
		for (Enumeration e = d.fields(); e.hasMoreElements() ;) {
			Field f = (Field) e.nextElement();
			out.println(f.toString() + ": " + f.name() + "=" + f.stringValue() +
"<br>");
		}
	}
}

// clean up
fsd.close();
reader.close();
writer.close();


Mime
View raw message