lucene-dev mailing list archives

From Bernhard Messer <Bernhard.Mes...@intrafind.de>
Subject Re: IndexReader.getCurrentVersion() and IndexReader.lastModified()
Date Thu, 03 Jun 2004 10:15:21 GMT
Hi Dmitry,

From the point of view of keeping the interface clean, it would be much
better to have a separate method in IndexReader like "isCurrent()" or,
even nicer, "isValid()", which combines the system time of the index
creation (stored in SegmentInfos) and the current version number. I think
the implementation is not too difficult and can be done in a short period
of time. If wanted, I can try to provide a new patch implementing a new
method "isValid()" in IndexReader which does exactly that.
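
To make the idea concrete, here is a minimal sketch of such a combined check. It is plain Java and does not touch Lucene at all: IndexStamp, creationTime and isCurrent are hypothetical names chosen for illustration, and it assumes the creation time would be stored next to the 0-based version.

```java
// Hypothetical sketch only: IndexStamp is not a Lucene class.
// It pairs an assumed index creation time with the 0-based version counter.
final class IndexStamp {
    final long creationTime; // millis at index creation (assumed to be stored with the index)
    final long version;      // 0-based counter, incremented on each change

    IndexStamp(long creationTime, long version) {
        this.creationTime = creationTime;
        this.version = version;
    }

    /** True only if both the creation time and the version match the on-disk stamp. */
    boolean isCurrent(IndexStamp onDisk) {
        return creationTime == onDisk.creationTime && version == onDisk.version;
    }
}

final class IsCurrentDemo {
    public static void main(String[] args) {
        IndexStamp cached = new IndexStamp(1000L, 2L);

        // Same index, untouched since the reader was cached.
        System.out.println(cached.isCurrent(new IndexStamp(1000L, 2L))); // true

        // Same index, but a document was added or deleted.
        System.out.println(cached.isCurrent(new IndexStamp(1000L, 3L))); // false

        // Index deleted and recreated: the version alone collides,
        // but the creation time differs.
        System.out.println(cached.isCurrent(new IndexStamp(5000L, 2L))); // false
    }
}
```

The third case is the one a 0-based version number alone cannot detect.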

Bernhard

Dmitry Serebrennikov wrote:

> Well, I know I didn't think of this case back when we were discussing 
> this change. As a recap, the issue was mainly that on some 
> architectures, the clock was not granular enough to detect updates 
> reliably, so some test cases were failing some of the time. You are 
> right, Bernhard, we didn't consider longer running systems where 
> entire indexes might be deleted and recreated while the cache was 
> still around.
>
> I don't know, having version start out as a date and then get 
> incremented as a version leaves a bad taste in my mouth somehow. At 
> the time, we discussed other ideas that would use the date "most of 
> the time" but would increment it explicitly if the clock was seen as 
> not being granular enough. But the simple 0-based version number was 
> seen as a much cleaner and superior solution when it was proposed.
>
> Perhaps it would be cleaner to leave the version number 0-based and 
> add an index creation date that would be explicitly available? This 
> would mean that checking index validity would require checking the 
> date and then the version. I would guess that only some applications 
> or general purpose cache implementations would have to go to such an 
> extent, while the majority can continue using just the 
> getCurrentVersion() method by itself. How does this sound? Is there 
> (should there be) an isCurrent() method on the IndexReader that could 
> encapsulate this process?
>
> Dmitry.
>
>
> Bernhard Messer wrote:
>
>> Hi,
>>
>> I'm sending a patch which should help to fix a problem using the new 
>> method IndexReader.getCurrentVersion(). As far as I understand the 
>> current Lucene documentation, developers should use this new method 
>> to verify whether an index is out of date. The older method 
>> IndexReader.lastModified() is deprecated and therefore a possible 
>> candidate for deletion.
>>
>> The problem with getCurrentVersion is that its base is 0 when 
>> creating a new index. Therefore the version number will be identical 
>> if you delete an index and recreate a new one using the same 
>> document set, no matter whether the document content has changed 
>> or a different analyzer is used. The idea of the patch is to 
>> initialize the version number with the current time in millis as base 
>> when creating a new SegmentInfos object. So it's "nearly" impossible 
>> to get the same version number again.
>>
>> Without this patch, it's impossible for developers to store an 
>> IndexReader in a cache and reliably check its validity through 
>> getCurrentVersion.
>>
>> In the attachment is the patch and a JUnit TestCase which tests the 
>> scenario with a sample implementation for an IndexReader cache.
>>
>> As far as I can see, there are no negative side effects when 
>> implementing this patch. But let's see what the Lucene specialists 
>> will say ;-)
>>
>> best regards
>> Bernhard
>>
>>
>>
>>
>>
>> ------------------------------------------------------------------------
>>
>> Index: src/java/org/apache/lucene/index/SegmentInfos.java
>> ===================================================================
>> RCS file: /home/cvspublic/jakarta-lucene/src/java/org/apache/lucene/index/SegmentInfos.java,v
>> retrieving revision 1.5
>> diff -r1.5 SegmentInfos.java
>> 32c32,37
>> <   private long version = 0; //counts how often the index has been changed by adding or deleting docs
>> ---
>> >  /**
>> >   * counts how often the index has been changed by adding or deleting docs.
>> >   * starting with the current time in milliseconds forces to create unique version numbers.
>> >   */
>> >  private long version = System.currentTimeMillis();
>> >
>>
>> ------------------------------------------------------------------------
>>
>>
>> package org.apache.lucene.index;
>>
>> /**
>> * Copyright 2004 The Apache Software Foundation
>> *
>> * Licensed under the Apache License, Version 2.0 (the "License");
>> * you may not use this file except in compliance with the License.
>> * You may obtain a copy of the License at
>> *
>> *     http://www.apache.org/licenses/LICENSE-2.0
>> *
>> * Unless required by applicable law or agreed to in writing, software
>> * distributed under the License is distributed on an "AS IS" BASIS,
>> * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
>> * See the License for the specific language governing permissions and
>> * limitations under the License.
>> */
>>
>> import java.io.IOException;
>> import java.util.Hashtable;
>>
>> import junit.framework.Test;
>> import junit.framework.TestCase;
>> import junit.framework.TestSuite;
>>
>> import org.apache.lucene.analysis.Analyzer;
>> import org.apache.lucene.analysis.SimpleAnalyzer;
>> import org.apache.lucene.document.Document;
>> import org.apache.lucene.document.Field;
>> import org.apache.lucene.index.IndexReader;
>> import org.apache.lucene.index.IndexWriter;
>> import org.apache.lucene.queryParser.QueryParser;
>> import org.apache.lucene.search.Hits;
>> import org.apache.lucene.search.IndexSearcher;
>> import org.apache.lucene.search.Query;
>> import org.apache.lucene.search.Searcher;
>> import org.apache.lucene.store.Directory;
>> import org.apache.lucene.store.FSDirectory;
>>
>> class CachedIndex { // an entry in the cache
>>     IndexReader reader;
>>     long version;
>>
>>     CachedIndex(String name) throws IOException {
>>         version = IndexReader.getCurrentVersion(name);
>>         reader = IndexReader.open(name); // open reader
>>     }
>> }
>>
>> public class TestIndexReaderVersion extends TestCase {
>>     
>>     public TestIndexReaderVersion (String name) {
>>         super(name);
>>     }
>>
>>     static final Hashtable indexCache = new Hashtable();
>>        
>>     public static Test suite () {
>>         TestSuite suite = new TestSuite(TestIndexReaderVersion.class);
>>        
>>         for (int i = 1; i < 100; i++)
>>             suite.addTest(new TestSuite(TestIndexReaderVersion.class));
>>        
>>         return suite;
>>     }
>>     
>>     public void testVersion() {
>>
>>         Analyzer analyzer = new SimpleAnalyzer();
>>         String name = "/tmp/lucy";
>>
>>         String[] docs = { "a", "a b" };
>>         String[] titles = docs;
>>         String q = "+a +b";
>>        
>>         testVersionControl(analyzer, name, docs, titles, q);
>>
>>         String[] docs2 = { "c", "c d" };
>>         String[] titles2 = docs2;
>>         q = "+c +d";
>>        
>>         testVersionControl(analyzer, name, docs2, titles2, q);
>>
>>     }
>>
>>     synchronized private IndexReader getReader(String name) {
>>         CachedIndex index =
>>             (CachedIndex) indexCache.get(name);
>>         // look in cache
>>
>>         try {
>>             if (index != null
>>                 // check up-to-date
>>                 && index.version == IndexReader.getCurrentVersion(name)) {
>>                     //System.out.println("IndexReader cache hit (maxDocs=" + index.reader.maxDoc() + ")");
>>                 return index.reader; // cache hit
>>                
>>             } else {
>>                 // Index was open but is not up-to-date, close it before creating a new one
>>                 if (index != null) {
>>                     //System.out.println(
>>                     //    "IndexReader not up-to-date, creating new");
>>                     try {
>>                         index.reader.close();
>>                     } catch (IOException ignore) {
>>                         System.err.println(
>>                             "IndexReader was already closed by third party.");
>>                     }
>>                 } else {
>>                     //System.out.println(
>>                     //    "IndexReader does not exist, creating new");
>>                 }
>>                 index = new CachedIndex(name); // cache miss
>>             }
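>>         } catch (IOException e) {
>>             System.err.println(e);
>>             // no usable reader: Hashtable rejects null values, so do not fall through to put()
>>             return null;
>>         }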
>>
>>         indexCache.put(name, index); // add to cache
>>         return index.reader;
>>     }
>>
>>     private void testVersionControl(
>>         Analyzer analyzer,
>>         String indexName,
>>         String[] docs,
>>         String[] titles,
>>         String queryString) {
>>         try {
>>
>>             assertEquals(docs.length, titles.length);
>>
>>             Directory directory = FSDirectory.getDirectory(indexName, true);
>>             IndexWriter indexer = new IndexWriter(directory, analyzer, true);
>>             indexer.setUseCompoundFile(true);
>>            
>>             //for (int y = 0; y < 500; y++)
>>             for (int z = 0; z < docs.length; z++) {
>>                 Document d = new Document();
>>
>>                 Field field = new Field("body", docs[z], true, true, true);
>>                 d.add(field);
>>
>>                 field = new Field("title", titles[z], true, true, true);
>>                 d.add(field);
>>
>>                 indexer.addDocument(d);
>>             }
>>
>>             indexer.optimize();
>>             indexer.close();
>>            
>>             Hits hits = null;
>>             QueryParser parser = new QueryParser("body", analyzer);
>>            
>>             /** try to get a reader from cache */
>>             IndexReader reader = getReader(indexName);
>>            
>>             /** create a new searcher */
>>             Searcher searcher = new IndexSearcher(reader);           
>>            
>>             Query query = parser.parse(queryString);
>>             hits = searcher.search(query);
>>             //System.out.println(" doc's found: " + hits.length());
>>            
>>             assertEquals (1, hits.length());
>>
>>             searcher.close();
>>
>>         } catch (Exception e) {
>>             System.out.println(
>>                 " caught a "
>>                     + e.getClass()
>>                     + "\n with message: "
>>                     + e.getMessage());
>>         }
>>     }
>> }
>>
>>  
>>
>> ------------------------------------------------------------------------
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>>

