lucene-dev mailing list archives

From Dmitry Serebrennikov <dmit...@earthlink.net>
Subject Re: IndexReader.getCurrentVersion() and IndexReader.lastModified()
Date Wed, 02 Jun 2004 17:14:17 GMT
Well, I know I didn't think of this case back when we were discussing 
this change. As a recap, the issue was mainly that on some 
architectures, the clock was not granular enough to detect updates 
reliably, so some test cases were failing some of the time. You are 
right, Bernhard, we didn't consider longer running systems where entire 
indexes might be deleted and recreated while the cache was still around.

I don't know, having version start out as a date and then get 
incremented as a version leaves a bad taste in my mouth somehow. At the 
time, we discussed other ideas that would use the date "most of the 
time" but would increment it explicitly if the clock was seen as not 
being granular enough. But the simple 0-based version number was seen as 
a much cleaner and superior solution when it was proposed.
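
A minimal sketch of the "date most of the time, explicit increment otherwise" idea, for illustration only. The class name ClockVersion and its method are hypothetical and are not part of Lucene; the clock reading is passed in as a parameter so the behaviour on a coarse clock is easy to see.

```java
/** Hypothetical version source: follows the clock, but never repeats a value. */
final class ClockVersion {
    private long last;

    ClockVersion(long initial) {
        this.last = initial;
    }

    /**
     * Returns the clock reading if it has moved past the last version,
     * otherwise the last version plus one (the clock was too coarse).
     */
    synchronized long next(long nowMillis) {
        last = Math.max(nowMillis, last + 1);
        return last;
    }
}
```

Two updates that land on the same clock tick still get distinct versions, which is the property the failing test cases needed. The cost is that the value is no longer a pure timestamp once an explicit increment has happened.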

Perhaps it would be cleaner to leave the version number 0-based and add 
an index creation date that would be explicitly available? This would 
mean that checking index validity would require checking the date and 
then the version. I would guess that only some applications or general 
purpose cache implementations would have to go to such an extent, while 
the majority can continue using just the getCurrentVersion() method by 
itself. How does this sound? Is there (should there be) an isCurrent() 
method on the IndexReader that could encapsulate this process?
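
A rough sketch of what such a two-step check could look like, under stated assumptions: IndexStamp, its fields, and its isCurrent() method are hypothetical names invented for this illustration, and nothing here is an existing Lucene API. A cache would remember the stamp taken when it opened the reader and compare it with a freshly read one.

```java
/** Hypothetical identity of an index: its creation date plus its 0-based change counter. */
final class IndexStamp {
    final long creationDate; // assumed: milliseconds, fixed when the index is created
    final long version;      // assumed: starts at 0, incremented on every change

    IndexStamp(long creationDate, long version) {
        this.creationDate = creationDate;
        this.version = version;
    }

    /** True only if the index on disk is the same index and has not changed since. */
    boolean isCurrent(IndexStamp onDisk) {
        // Check the date first: a different creation date means the index was
        // deleted and recreated, so the version numbers are not comparable.
        return creationDate == onDisk.creationDate && version == onDisk.version;
    }
}
```

An index that was deleted and rebuilt with the same number of changes would carry the same version but a different creation date, so the check above would report it as stale. Applications that never recreate an index could keep comparing the version alone.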

Dmitry.


Bernhard Messer wrote:

> Hi,
>
> I'm sending a patch which should help to fix a problem using the new 
> method IndexReader.getCurrentVersion(). As far as I understand the 
> current Lucene documentation, developers should use this new method to 
> verify if an index is out of date. The older method 
> IndexReader.lastModified() is deprecated and therefore a possible 
> candidate for deletion.
>
> The problem with getCurrentVersion() is that its base is 0 when 
> creating a new index. Therefore the version number will be identical 
> if you delete an index and recreate a new one using the same 
> document set, regardless of whether the document content has changed 
> or a different analyzer is used. The idea of the patch is to 
> initialize the version number with the current time in milliseconds 
> when creating a new SegmentInfos object, so it's "nearly" impossible 
> to get the same version number again.
>
> Without this patch, it's impossible for developers to store an 
> IndexReader in a cache and check its validity through getCurrentVersion().
>
> In the attachment is the patch and a JUnit TestCase which tests the 
> scenario with a sample implementation for an IndexReader cache.
>
> As far as I can see, there are no negative side effects when 
> implementing this patch. But let's see what the Lucene specialists 
> will say ;-)
>
> best regards
> Bernhard
>
>
>
>
>
>------------------------------------------------------------------------
>
>Index: src/java/org/apache/lucene/index/SegmentInfos.java
>===================================================================
>RCS file: /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/SegmentInfos.java,v
>retrieving revision 1.5
>diff -r1.5 SegmentInfos.java
>32c32,37
><   private long version = 0; //counts how often the index has been changed by adding or deleting docs
>---
>>  /**
>>   * counts how often the index has been changed by adding or deleting docs.
>>   * starting with the current time in milliseconds forces to create unique version numbers.
>>   */
>>  private long version = System.currentTimeMillis();
>
>------------------------------------------------------------------------
>
>
>package org.apache.lucene.index;
>
>/**
> * Copyright 2004 The Apache Software Foundation
> *
> * Licensed under the Apache License, Version 2.0 (the "License");
> * you may not use this file except in compliance with the License.
> * You may obtain a copy of the License at
> *
> *     http://www.apache.org/licenses/LICENSE-2.0
> *
> * Unless required by applicable law or agreed to in writing, software
> * distributed under the License is distributed on an "AS IS" BASIS,
> * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> * See the License for the specific language governing permissions and
> * limitations under the License.
> */
>
>import java.io.IOException;
>import java.util.Hashtable;
>
>import junit.framework.Test;
>import junit.framework.TestCase;
>import junit.framework.TestSuite;
>
>import org.apache.lucene.analysis.Analyzer;
>import org.apache.lucene.analysis.SimpleAnalyzer;
>import org.apache.lucene.document.Document;
>import org.apache.lucene.document.Field;
>import org.apache.lucene.index.IndexReader;
>import org.apache.lucene.index.IndexWriter;
>import org.apache.lucene.queryParser.QueryParser;
>import org.apache.lucene.search.Hits;
>import org.apache.lucene.search.IndexSearcher;
>import org.apache.lucene.search.Query;
>import org.apache.lucene.search.Searcher;
>import org.apache.lucene.store.Directory;
>import org.apache.lucene.store.FSDirectory;
>
>class CachedIndex { // an entry in the cache
>	IndexReader reader;
>	long version;
>
>	CachedIndex(String name) throws IOException {
>		version = IndexReader.getCurrentVersion(name);
>		reader = IndexReader.open(name); // open reader
>	}
>}
>
>public class TestIndexReaderVersion extends TestCase {
>	
>	public TestIndexReaderVersion (String name) {
>		super(name);
>	}
>
>	static final Hashtable indexCache = new Hashtable();
>		
>	public static Test suite () {
>		TestSuite suite = new TestSuite(TestIndexReaderVersion.class);
>		
>		for (int i = 1; i < 100; i++)
>			suite.addTest(new TestSuite(TestIndexReaderVersion.class));
>		
>		return suite;
>	}
>	
>	public void testVersion() {
>
>		Analyzer analyzer = new SimpleAnalyzer();
>		String name = "/tmp/lucy";
>
>		String[] docs = { "a", "a b" };
>		String[] titles = docs;
>		String q = "+a +b";
>		
>		testVersionControl(analyzer, name, docs, titles, q);
>
>		String[] docs2 = { "c", "c d" };
>		String[] titles2 = docs2;
>		q = "+c +d";
>		
>		testVersionControl(analyzer, name, docs2, titles2, q);
>
>	}
>
>	synchronized private IndexReader getReader(String name) {
>		CachedIndex index =
>			(CachedIndex) indexCache.get(name);
>		// look in cache
>
>		try {
>			if (index != null
>				// check up-to-date
>				&& index.version == IndexReader.getCurrentVersion(name)) {
>					//System.out.println("IndexReader cache hit (maxDocs=" + index.reader.maxDoc() + ")");
>				return index.reader; // cache hit
>				
>			} else {
>				// Index was open but is not up-to-date, close it before creating a new one
>				if (index != null) {
>					//System.out.println(
>					//	"IndexReader not up-to-date, creating new");
>					try {
>						index.reader.close();
>					} catch (IOException ignore) {
>						System.err.println(
>							"IndexReader was already closed by third party.");
>					}
>				} else {
>					//System.out.println(
>					//	"IndexReader does not exist, creating new");
>				}
>				index = new CachedIndex(name); // cache miss
>			}
>		} catch (IOException e) {
>			System.err.println(e);
>		}
>
>		indexCache.put(name, index); // add to cache
>		return index.reader;
>	}
>
>	private void testVersionControl(
>		Analyzer analyzer,
>		String indexName,
>		String[] docs,
>		String[] titles,
>		String queryString) {
>		try {
>
>			assertEquals(docs.length, titles.length);
>
>			Directory directory = FSDirectory.getDirectory(indexName, true);
>			IndexWriter indexer = new IndexWriter(directory, analyzer, true);
>			indexer.setUseCompoundFile(true);
>			
>			//for (int y = 0; y < 500; y++)
>			for (int z = 0; z < docs.length; z++) {
>				Document d = new Document();
>
>				Field field = new Field("body", docs[z], true, true, true);
>				d.add(field);
>
>				field = new Field("title", titles[z], true, true, true);
>				d.add(field);
>
>				indexer.addDocument(d);
>			}
>
>			indexer.optimize();
>			indexer.close();
>			
>			Hits hits = null;
>			QueryParser parser = new QueryParser("body", analyzer);
>			
>			/** try to get a reader from cache */
>			IndexReader reader = getReader(indexName);
>			
>			/** create a new searcher */
>			Searcher searcher = new IndexSearcher(reader);			
>			
>			Query query = parser.parse(queryString);
>			hits = searcher.search(query);
>			//System.out.println(" doc's found: " + hits.length());
>			
>			assertEquals (1, hits.length());
>
>			searcher.close();
>
>		} catch (Exception e) {
>			// fail the test instead of printing and silently passing
>			fail(
>				" caught a "
>					+ e.getClass()
>					+ "\n with message: "
>					+ e.getMessage());
>		}
>	}
>}
>
>  
>
>------------------------------------------------------------------------
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>




