lucene-dev mailing list archives

From Maxim Patramanskij <...@osua.de>
Subject Re: release & migration plan
Date Tue, 13 Jul 2004 12:14:56 GMT
Hello Doug.

Many Lucene classes still use Vector and Hashtable instead of
ArrayList and HashMap, for compatibility with Java 1.

The changes Aviran proposed and made to the FieldInfos class already
make Lucene incompatible with Java 1, but they can give us a reasonable
performance gain. Given that, shouldn't we go ahead with the whole
Hashtable -> HashMap and Vector -> ArrayList replacement throughout the
code to gain performance in other places as well?
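
As a rough illustration of the replacement in question, here is a minimal
sketch. The class and member names (FieldNameRegistrySketch, byName,
byNumber) are hypothetical and are not taken from Lucene's source. The
relevant difference is that Hashtable and Vector synchronize every call,
while HashMap and ArrayList (available since Java 1.2) do not, so the swap
is only safe where access is single-threaded or synchronized by the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical name-to-number registry, for illustration only.
// This is NOT Lucene's FieldInfos class.
class FieldNameRegistrySketch {
    // Before the swap these would have been a Hashtable and a Vector,
    // both of which take a lock on every method call.
    private final Map<String, Integer> byName = new HashMap<String, Integer>();
    private final List<String> byNumber = new ArrayList<String>();

    // Returns the number for a name, assigning the next free number
    // if the name has not been seen before.
    int add(String name) {
        Integer existing = byName.get(name);
        if (existing != null) {
            return existing.intValue();
        }
        int number = byNumber.size();
        byNumber.add(name);
        byName.put(name, Integer.valueOf(number));
        return number;
    }

    // Looks up a name by its assigned number.
    String nameOf(int number) {
        return byNumber.get(number);
    }

    int size() {
        return byNumber.size();
    }
}
```

Callers that share such an object between threads would have to add their
own synchronization after the swap.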

Max

DC> I think perhaps it is time to make some incompatible changes to Lucene's 
DC> API.  There are a number of places where it is showing its age.  I'd 
DC> like to try to make as many API changes at once as is possible, so that 
DC> folks only have to port application code once.

DC> I propose we do this as follows:

DC> 1. Make a 1.9 release which has all the new APIs and deprecates all the 
DC> outdated APIs.  Existing applications should compile and run fine, but 
DC> with lots of deprecation warnings.

DC> 2. Make a 2.0 release which removes all deprecated code.

DC> Thus 1.9 would be a migration release.  Before an application is moved 
DC> to 2.0, folks should first make sure that it compiles against 1.9 
DC> without deprecation warnings.  Once it does then it should move to 2.0 
DC> without incident.
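
The migration step described above can be sketched as follows. The class
name MigrationWriterSketch is hypothetical, and the default value of 10 is
only illustrative; the example borrows the idea of a public tuning field
such as mergeFactor, one of the candidates listed further down. In a
1.9-style release the old field stays but is marked deprecated, and the
new accessors sit beside it; a 2.0-style release would then delete the
deprecated member.

```java
// Hypothetical writer-like class showing a migration release:
// old API kept and deprecated, new API added alongside it.
class MigrationWriterSketch {
    /**
     * Old API: a public field. Existing code that assigns to it still
     * compiles, but the compiler reports a deprecation warning.
     *
     * @deprecated use getMergeFactor() and setMergeFactor(int) instead
     */
    @Deprecated
    public int mergeFactor = 10; // illustrative default

    // New API: accessors. They read and write the same storage, so old
    // and new callers observe the same value during the migration.
    public int getMergeFactor() {
        return mergeFactor;
    }

    public void setMergeFactor(int mergeFactor) {
        this.mergeFactor = mergeFactor;
    }
}
```

Application code that still writes `writer.mergeFactor = 20` would compile
against this class with a warning, while code already ported to
`writer.setMergeFactor(20)` compiles cleanly and is ready for the release
that removes the field.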

DC> Does this sound like a good plan?

DC> What changes would I like to see in the API?  Here are a few candidates:

DC> 1. Replace Field factory methods (Field.Text, Field.Keyword, etc.) with 
DC> a few methods that use type-safe enumerations, as described in:

DC> http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg08479.html

DC> 2. Similarly, replace BooleanQuery.add() with a type-safe enumeration, 
DC> also as described in:

DC> http://www.mail-archive.com/lucene-user@jakarta.apache.org/msg08479.html

DC> 3. Replace public IndexWriter fields (mergeFactor, minMergeDocs, etc.) 
DC> with get/set accessors.  Also, minMergeDocs should be renamed 
DC> maxBufferedDocs.

DC> 4. Rename PhrasePrefixQuery to be something like MultiPhraseQuery.  Also 
DC> make MultipleTermPositions a private nested class of this, as this is 
DC> the only place MultipleTermPositions is used.

DC> 5. Rename InputStream to IndexInput and OutputStream to IndexOutput. 
DC> Also make both of these interfaces and add BufferedIndexInput and 
DC> BufferedIndexOutput as the implementation used by FSDirectory, 
DC> RAMDirectory, etc.  This would permit unbuffered and native 
DC> implementations (e.g., that use mmap) that could potentially speed 
DC> things considerably.

DC> 6. Replace DateField with something that formats dates suitably for 
DC> RangeQuery.

DC> 7. Move language-specific analyzers into separate downloads?

DC> 8. Add support for span queries to query parser?
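
For items 1 and 2, a type-safe enumeration could look roughly like the
sketch below. It is not taken from the linked message; the names
OccurSketch, BooleanQuerySketch, MUST, SHOULD and MUST_NOT are assumptions
made for illustration, and clauses are plain strings so that the example
stays self-contained. It uses the class-based enumeration idiom (a final
class with a private constructor and a fixed set of constants), and it
assumes the older style being replaced is an add() call taking two boolean
flags.

```java
import java.util.ArrayList;
import java.util.List;

// Type-safe enumeration: only the three constants below can ever exist,
// because the constructor is private. All names here are hypothetical.
final class OccurSketch {
    static final OccurSketch MUST = new OccurSketch("MUST");
    static final OccurSketch SHOULD = new OccurSketch("SHOULD");
    static final OccurSketch MUST_NOT = new OccurSketch("MUST_NOT");

    private final String name;

    private OccurSketch(String name) {
        this.name = name;
    }

    public String toString() {
        return name;
    }
}

// Stand-in for a boolean query that collects its clauses as strings.
class BooleanQuerySketch {
    private final List<String> clauses = new ArrayList<String>();

    // Flag-based style: the combination (true, true) compiles but has no
    // sensible meaning, so it can only be rejected at run time.
    void add(String clause, boolean required, boolean prohibited) {
        if (required && prohibited) {
            throw new IllegalArgumentException("clause cannot be both required and prohibited");
        }
        OccurSketch occur = required ? OccurSketch.MUST
                : prohibited ? OccurSketch.MUST_NOT
                : OccurSketch.SHOULD;
        add(clause, occur);
    }

    // Enumeration-based style: every value that compiles is meaningful.
    void add(String clause, OccurSketch occur) {
        clauses.add(occur + ":" + clause);
    }

    List<String> clauses() {
        return clauses;
    }
}
```

The same idiom would apply to the Field factory methods in item 1: a
couple of small enumerations as parameters in place of one factory method
per combination of options.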

DC> Do you have other candidates?

DC> Doug

DC> ---------------------------------------------------------------------
DC> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
DC> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org



