lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: 1.4.x TermInfosWriter.indexInterval not public static ?
Date Mon, 28 Feb 2005 22:56:36 GMT
Chris Hostetter wrote:
>  1) If making it mutable requires changes to other classes to propagate
>     it, then why is it now an instance variable instead of a static?
>     (Presumably making it an instance variable allows subclasses to
>     override the value, but if other classes have internal expectations
>     of the value, that doesn't seem safe)

It's an instance variable because it can vary from instance to instance.
This value is specified when an index segment is written, and
subsequently read from disk and used when reading that segment.  It's an
instance variable in both the writing and reading code.  The thing
that's lacking is a way to pass in alternate values to the writing code.

The reason that other classes are involved is that the reading and 
writing code are in non-public classes.  We don't want to expose the 
implementation too much by making these public, but would rather expose 
these as getter/setter methods on the relevant public API.

>  2) Should it be configurable through a get/set method, or through a
>     system property?
>     (which rehashes the instance/global question)

That's indeed the question.  My guess is that a system property would
probably be sufficient for most, but perhaps not for all.  Similarly
with a static setter/getter.  But a getter/setter on IndexWriter would
make everyone happy.

>  3) Is it important that a writer updating an existing index use the same
>     value as the writer that initially created the index?  if so should
>     there really be a "preferedIndexInterval" variable which is mutable,
>     and a "currentIndexInterval" which is set to the value of the index
>     currently being updated.  Such that preferedIndexInterval is used when
>     making an index from scratch and currentIndexInterval is used when
>     adding segments to a new index?

It's used whenever an index segment is created.  Index segments are 
created when documents are added and when index segments are merged to 
form larger index segments.  Merging happens frequently while indexing.
Optimization merges all segments.

The value can vary in each segment.
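
To make the per-segment behaviour concrete, here is a minimal sketch in
Java.  It is not Lucene code: SketchIndexWriter and SketchSegment are
hypothetical stand-ins, flush() is an invented name for "documents were
added", and the initial value of 128 is only illustrative.  It shows
each new segment, whether created by adding documents or by merging,
recording the interval the writer holds at that moment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical segment: remembers the interval it was written with. */
final class SketchSegment {
    final int termIndexInterval;
    final int docCount;

    SketchSegment(int termIndexInterval, int docCount) {
        this.termIndexInterval = termIndexInterval;
        this.docCount = docCount;
    }
}

/** Hypothetical stand-in for the public writer; not Lucene's IndexWriter. */
final class SketchIndexWriter {
    private int termIndexInterval = 128; // illustrative starting value
    private final List<SketchSegment> segments = new ArrayList<>();

    void setTermIndexInterval(int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.termIndexInterval = interval;
    }

    int getTermIndexInterval() {
        return termIndexInterval;
    }

    /** Adding documents creates a new segment using the current setting. */
    void flush(int docCount) {
        segments.add(new SketchSegment(termIndexInterval, docCount));
    }

    /** Merging all segments also creates a new segment with the current setting. */
    void optimize() {
        int total = 0;
        for (SketchSegment s : segments) {
            total += s.docCount;
        }
        segments.clear();
        segments.add(new SketchSegment(termIndexInterval, total));
    }

    List<SketchSegment> segments() {
        return segments;
    }
}
```

Flushing, changing the interval, then flushing again leaves two segments
with different intervals side by side; a later optimize() rewrites them
into one segment carrying whatever interval the writer holds then.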

The default value is probably good for all but folks with very large
indexes, who may wish to increase the value somewhat.  Also folks with
smaller indexes and very high query volumes may wish to decrease it.
It's a classic time/memory tradeoff.  Higher values use less memory and
make searches a bit slower; smaller values use more memory and make
searches a bit faster.
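
The tradeoff can be illustrated with a toy model, sketched below in
Java.  It assumes the in-memory index simply keeps every interval-th
term of a sorted term dictionary, and that a lookup binary-searches
those samples and then scans forward through at most one interval's
worth of terms.  SampledTermIndex and its methods are hypothetical
names, and the real reading code may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a sampled term index; not Lucene's actual implementation. */
final class SampledTermIndex {
    private final String[] sortedTerms;                      // stands in for the on-disk dictionary
    private final List<String> sampled = new ArrayList<>();  // every interval-th term, kept in memory
    private final int interval;
    private int lastScanned;                                  // dictionary entries examined by the last lookup

    SampledTermIndex(String[] sortedTerms, int interval) {
        if (interval < 1) {
            throw new IllegalArgumentException("interval must be >= 1");
        }
        this.sortedTerms = sortedTerms;
        this.interval = interval;
        for (int i = 0; i < sortedTerms.length; i += interval) {
            sampled.add(sortedTerms[i]);
        }
    }

    /** Number of terms held in memory: roughly termCount / interval. */
    int memoryEntries() {
        return sampled.size();
    }

    /** Dictionary entries examined during the most recent lookup. */
    int lastScanned() {
        return lastScanned;
    }

    /** Returns the position of the term in the dictionary, or -1 if absent. */
    int lookup(String term) {
        lastScanned = 0;
        // Binary search for the last sampled term that is <= the target.
        int lo = 0, hi = sampled.size() - 1, block = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (sampled.get(mid).compareTo(term) <= 0) {
                block = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (block < 0) {
            return -1; // target sorts before the first term
        }
        // Linear scan of at most `interval` dictionary entries.
        int start = block * interval;
        int end = Math.min(start + interval, sortedTerms.length);
        for (int i = start; i < end; i++) {
            lastScanned++;
            int cmp = sortedTerms[i].compareTo(term);
            if (cmp == 0) {
                return i;
            }
            if (cmp > 0) {
                break; // passed the place where the term would be
            }
        }
        return -1;
    }
}
```

In this model, a larger interval shrinks memoryEntries() but lengthens
the scan that each lookup performs, and a smaller interval does the
reverse.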

Unless there are objections I will add this as:
   IndexWriter.setTermIndexInterval()
   IndexWriter.getTermIndexInterval()
Both will be marked "Expert".

Further discussion should move to the lucene-dev list.

Doug

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

