lucenenet-commits mailing list archives

From pnas...@apache.org
Subject svn commit: r1382418 [2/3] - in /incubator/lucene.net/branches/3.0.3: ACKNOWLEDGEMENTS.txt CHANGES.txt DISCLAIMER.txt README.txt src/ABOUT.txt src/BUILD.txt src/CHANGES.txt src/HISTORY.txt
Date Sun, 09 Sep 2012 08:08:26 GMT
Added: incubator/lucene.net/branches/3.0.3/CHANGES.txt
URL: http://svn.apache.org/viewvc/incubator/lucene.net/branches/3.0.3/CHANGES.txt?rev=1382418&view=auto
==============================================================================
--- incubator/lucene.net/branches/3.0.3/CHANGES.txt (added)
+++ incubator/lucene.net/branches/3.0.3/CHANGES.txt Sun Sep  9 08:08:26 2012
@@ -0,0 +1,4009 @@
+=================== 3.0.3 trunk (not yet released) =====================
+
+Bug
+•[LUCENENET-54] - ArgumentOutOfRangeException caused by SF.Snowball.Ext.DanishStemmer
+•[LUCENENET-420] - String.StartsWith has culture in it.
+•[LUCENENET-423] - QueryParser differences between Java and .NET when parsing range queries involving dates
+•[LUCENENET-445] - Lucene.Net.Index.TestIndexWriter.TestFutureCommit() Fails
+•[LUCENENET-464] - The Lucene.Net.FastVectorHighlighter.dll of the latest release 2.9.4 breaks any ASP.NET application
+•[LUCENENET-472] - Operator == on Parameter does not check for null arguments
+•[LUCENENET-473] - Fix linefeeds in more than 600 files
+•[LUCENENET-474] - Missing License Headers in trunk after 3.0.3 merge
+•[LUCENENET-475] - DanishStemmer doesn't work.
+•[LUCENENET-476] - ScoreDocs in TopDocs is ambiguous when using Visual Basic .Net
+•[LUCENENET-477] - NullReferenceException in ThreadLocal when Lucene.Net compiled for .Net 2.0
+•[LUCENENET-478] - Parts of QueryParser are outdated or weren't previously ported correctly
+•[LUCENENET-479] - QueryParser.SetEnablePositionIncrements(false) doesn't work
+•[LUCENENET-483] - Spatial Search skipping records when one location is close to origin, another one is away and radius is wider
+•[LUCENENET-484] - Some possibly major tests intermittently fail 
+•[LUCENENET-485] - IndexOutOfRangeException in FrenchStemmer
+•[LUCENENET-490] - QueryParser is culture-sensitive
+•[LUCENENET-493] - Make lucene.net culture insensitive (like the java version)
+•[LUCENENET-494] - Port error in FieldCacheRangeFilter
+•[LUCENENET-495] - Use of DateTime.Now causes huge amount of System.Globalization.DaylightTime object allocations
+•[LUCENENET-500] - Lucene fails to run in medium trust ASP.NET Application
+
+Improvement
+•[LUCENENET-179] - SnowballFilter speed improvement
+•[LUCENENET-407] - Signing the assembly
+•[LUCENENET-408] - Mark assembly as CLS compliant; make AlreadyClosedException serializable
+•[LUCENENET-466] - optimisation for the GermanStemmer.vb
+•[LUCENENET-504] - FastVectorHighlighter - support for prefix query
+•[LUCENENET-506] - FastVectorHighlighter should use Query.ExtractTerms as fallback
+
+New Feature
+•[LUCENENET-463] - Would like to be able to use a SimpleSpanFragmenter for extracting whole sentences
+•[LUCENENET-481] - Port Contrib.MemoryIndex
+
+Task
+•[LUCENENET-446] - Make Lucene.Net CLS Compliant
+•[LUCENENET-471] - Remove Package.html and Overview.html artifacts
+•[LUCENENET-480] - Investigate what needs to happen to make both .NET 3.5 and 4.0 builds possible
+•[LUCENENET-487] - Remove Obsolete Members, Fields that are marked as obsolete and to be removed in 3.0
+•[LUCENENET-503] - Update binary names
+
+Sub-task
+•[LUCENENET-468] - Implement the Dispose pattern properly in classes with Close
+•[LUCENENET-470] - Change Getxxx() and Setxxx() methods to .NET Properties
+
+
+=================== 2.9.4 trunk =====================
+
+Bug fixes
+
+ * LUCENENET-355  [LUCENE-2387]: Don't hang onto Fieldables from the last doc indexed,
+   in IndexWriter, nor the Reader in Tokenizer after close is
+   called. (digy) [Ruben Laguna, Uwe Schindler, Mike McCandless]
+
+
+Change Log Copied from Lucene 
+======================= Release 2.9.2 2010-02-26 =======================
+
+Bug fixes
+
+ * LUCENE-2045: Fix silly FileNotFoundException hit if you enable
+   infoStream on IndexWriter and then add an empty document and commit
+   (Shai Erera via Mike McCandless)
+
+ * LUCENE-2088: addAttribute() should only accept interfaces that
+   extend Attribute. (Shai Erera, Uwe Schindler)
+
+ * LUCENE-2092: BooleanQuery was ignoring disableCoord in its hashCode
+   and equals methods, causing bad things to happen when caching
+   BooleanQueries.  (Chris Hostetter, Mike McCandless)
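
The LUCENE-2092 entry above describes a general pitfall: if a field that affects behavior is left out of equals() and hashCode(), a cache keyed on the object can return a result computed for a different configuration. The following is a minimal sketch of the corrected pattern, not Lucene's actual BooleanQuery code; the class FlaggedQuery and its fields are hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a query with a coord flag; this is NOT Lucene's BooleanQuery.
final class FlaggedQuery {
    private final String clause;
    private final boolean disableCoord;

    FlaggedQuery(String clause, boolean disableCoord) {
        this.clause = Objects.requireNonNull(clause);
        this.disableCoord = disableCoord;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FlaggedQuery)) return false;
        FlaggedQuery other = (FlaggedQuery) o;
        // The flag takes part in equality, so two queries that differ
        // only in disableCoord are treated as different cache keys.
        return clause.equals(other.clause) && disableCoord == other.disableCoord;
    }

    @Override
    public int hashCode() {
        // Must be computed from the same fields as equals().
        return Objects.hash(clause, disableCoord);
    }

    public static void main(String[] args) {
        Map<FlaggedQuery, String> cache = new HashMap<>();
        cache.put(new FlaggedQuery("text:foo", false), "result with coord");
        // Differs only in the flag, so it must miss the cached entry.
        System.out.println(cache.get(new FlaggedQuery("text:foo", true)));
    }
}
```

With the flag included in both methods, the second lookup misses the cache instead of returning the entry stored for the other setting.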
+
+ * LUCENE-2095: Fixes: when two threads call IndexWriter.commit() at
+   the same time, it's possible for commit to return control back to
+   one of the threads before all changes are actually committed.
+   (Sanne Grinovero via Mike McCandless)
+
+ * LUCENE-2166: Don't incorrectly keep warning about the same immense
+    term, when IndexWriter.infoStream is on.  (Mike McCandless)
+
+ * LUCENE-2158: At high indexing rates, NRT reader could temporarily
+   lose deletions.  (Mike McCandless)
+  
+ * LUCENE-2182: DEFAULT_ATTRIBUTE_FACTORY was failing to load
+   implementation class when interface was loaded by a different
+   class loader.  (Uwe Schindler, reported on java-user by Ahmed El-dawy)
+  
+ * LUCENE-2257: Increase max number of unique terms in one segment to
+   termIndexInterval (default 128) * ~2.1 billion = ~274 billion.
+   (Tom Burton-West via Mike McCandless)
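
The arithmetic in the LUCENE-2257 entry above can be checked directly. This sketch assumes that "~2.1 billion" refers to Integer.MAX_VALUE (2,147,483,647); the entry itself does not say so. The class and method names are illustrative only.

```java
// Worked arithmetic for the limit quoted in LUCENE-2257.
// Assumption: "~2.1 billion" means Integer.MAX_VALUE.
final class MaxTermsEstimate {
    static long maxUniqueTerms(int termIndexInterval) {
        // Widen to long before multiplying to avoid int overflow.
        return (long) termIndexInterval * Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        // Default termIndexInterval is 128 according to the entry above.
        System.out.println(maxUniqueTerms(128)); // prints 274877906816
    }
}
```

The result, 274,877,906,816, is the "~274 billion" quoted in the entry.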
+
+ * LUCENE-2260: Fixed AttributeSource to not hold a strong
+   reference to the Attribute/AttributeImpl classes which prevents
+   unloading of custom attributes loaded by other classloaders
+   (e.g. in Solr plugins).  (Uwe Schindler)
+ 
+ * LUCENE-1941: Fix Min/MaxPayloadFunction returning 0 when
+   only one payload is present.  (Erik Hatcher, Mike McCandless
+   via Uwe Schindler)
+
+ * LUCENE-2270: Queries consisting of all zero-boost clauses
+   (for example, text:foo^0) sorted incorrectly and produced
+   invalid docids. (yonik)
+
+ * LUCENE-2422: Don't reuse byte[] in IndexInput/Output -- it gains
+   little performance, and ties up possibly large amounts of memory
+   for apps that index large docs.  (Ross Woolf via Mike McCandless)
+
+API Changes
+
+ * LUCENE-2190: Added a new class CustomScoreProvider to function package
+   that can be subclassed to provide custom scoring to CustomScoreQuery.
+   The methods in CustomScoreQuery that did this before were deprecated
+   and replaced by a method getCustomScoreProvider(IndexReader) that
+   returns a custom score implementation using the above class. The change
+   is necessary with per-segment searching, as CustomScoreQuery is
+   a stateless class (like all other Queries) and does not know about
+   the currently searched segment. This API works similarly to Filter's
+   getDocIdSet(IndexReader).  (Paul chez Jamespot via Mike McCandless,
+   Uwe Schindler)
+
+ * LUCENE-2080: Deprecate Version.LUCENE_CURRENT, as using this constant
+   will cause backwards compatibility problems when upgrading Lucene. See
+   the Version javadocs for additional information.
+   (Robert Muir)
+
+Optimizations
+
+ * LUCENE-2086: When resolving deleted terms, do so in term sort order
+   for better performance (Bogdan Ghidireac via Mike McCandless)
+
+ * LUCENE-2258: Remove unneeded synchronization in FuzzyTermEnum.
+   (Uwe Schindler, Robert Muir)
+
+Test Cases
+
+ * LUCENE-2114: Change TestFilteredSearch to test on multi-segment
+   index as well. (Simon Willnauer via Mike McCandless)
+
+ * LUCENE-2211: Improves BaseTokenStreamTestCase to use a fake attribute
+   that checks if clearAttributes() was called correctly.
+   (Uwe Schindler, Robert Muir)
+
+ * LUCENE-2207, LUCENE-2219: Improve BaseTokenStreamTestCase to check if
+   end() is implemented correctly.  (Koji Sekiguchi, Robert Muir)
+
+Documentation
+
+ * LUCENE-2114: Improve javadocs of Filter to call out that the
+   provided reader is per-segment (Simon Willnauer via Mike
+   McCandless)
+
+======================= Release 2.9.1 2009-11-06 =======================
+
+Changes in backwards compatibility policy
+
+ * LUCENE-2002: Add required Version matchVersion argument when
+   constructing QueryParser or MultiFieldQueryParser, and default (as
+   of 2.9) enablePositionIncrements to true to match
+   StandardAnalyzer's 2.9 default (Uwe Schindler, Mike McCandless)
+
+Bug fixes
+
+ * LUCENE-1974: Fixed nasty bug in BooleanQuery (when it used
+   BooleanScorer for scoring), whereby some matching documents fail to
+   be collected.  (Fulin Tang via Mike McCandless)
+
+ * LUCENE-1124: Make sure FuzzyQuery always matches the precise term.
+   (stefatwork@gmail.com via Mike McCandless)
+
+ * LUCENE-1976: Fix IndexReader.isCurrent() to return the right thing
+   when the reader is a near real-time reader.  (Jake Mannix via Mike
+   McCandless)
+
+ * LUCENE-1986: Fix NPE when scoring PayloadNearQuery (Peter Keegan,
+   Mark Miller via Mike McCandless)
+
+ * LUCENE-1992: Fix thread hazard if a merge is committing just as an
+   exception occurs during sync (Uwe Schindler, Mike McCandless)
+
+ * LUCENE-1995: Note in javadocs that IndexWriter.setRAMBufferSizeMB
+   cannot exceed 2048 MB, and throw IllegalArgumentException if it
+   does.  (Aaron McKee, Yonik Seeley, Mike McCandless)
+
+ * LUCENE-2004: Fix Constants.LUCENE_MAIN_VERSION to not be inlined
+   by client code.  (Uwe Schindler)
+
+ * LUCENE-2016: Replace illegal U+FFFF character with the replacement
+   char (U+FFFD) during indexing, to prevent silent index corruption.
+   (Peter Keegan, Mike McCandless)
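
The substitution described in LUCENE-2016 above amounts to a character replacement. The sketch below shows the idea on a plain String; it is not the code path Lucene's indexer uses, and the class and method names are hypothetical.

```java
// Illustrative only: replaces the illegal U+FFFF with the Unicode
// replacement character U+FFFD, as LUCENE-2016 describes.
final class IllegalCharScrubber {
    static String scrub(String text) {
        // String.replace(char, char) replaces every occurrence.
        return text.replace('\uFFFF', '\uFFFD');
    }

    public static void main(String[] args) {
        String scrubbed = scrub("bad\uFFFFterm");
        System.out.println(scrubbed.indexOf('\uFFFF')); // prints -1
    }
}
```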
+
+API Changes
+
+ * Un-deprecate search(Weight weight, Filter filter, int n) from
+   Searchable interface (deprecated by accident).  (Uwe Schindler)
+
+ * Un-deprecate o.a.l.util.Version constants.  (Mike McCandless)
+
+ * LUCENE-1987: Un-deprecate some ctors of Token, as they will not
+   be removed in 3.0 and are still useful. Also add some missing
+   o.a.l.util.Version constants for enabling invalid acronym
+   settings in StandardAnalyzer to be compatible with the coming
+   Lucene 3.0.  (Uwe Schindler)
+
+ * LUCENE-1973: Un-deprecate IndexSearcher.setDefaultFieldSortScoring,
+   to allow controlling per-IndexSearcher whether scores are computed
+   when sorting by field.  (Uwe Schindler, Mike McCandless)
+   
+Documentation
+
+ * LUCENE-1955: Fix Hits deprecation notice to point users in right
+   direction. (Mike McCandless, Mark Miller)
+   
+ * Fix javadoc about score tracking done by search methods in Searcher 
+   and IndexSearcher.  (Mike McCandless)
+
+ * LUCENE-2008: Javadoc improvements for TokenStream/Tokenizer/Token
+   (Luke Nezda via Mike McCandless)
+
+======================= Release 2.9.0 2009-09-23 =======================
+
+Changes in backwards compatibility policy
+
+ * LUCENE-1575: Searchable.search(Weight, Filter, int, Sort) no
+    longer computes a document score for each hit by default.  If
+    document score tracking is still needed, you can call
+    IndexSearcher.setDefaultFieldSortScoring(true, true) to enable
+    both per-hit and maxScore tracking; however, this is deprecated
+    and will be removed in 3.0.
+
+    Alternatively, use Searchable.search(Weight, Filter, Collector)
+    and pass in a TopFieldCollector instance, using the following code
+    sample:
+ 
+    <code>
+      TopFieldCollector tfc = TopFieldCollector.create(sort, numHits, fillFields, 
+                                                       true /* trackDocScores */,
+                                                       true /* trackMaxScore */,
+                                                       false /* docsInOrder */);
+      searcher.search(query, tfc);
+      TopDocs results = tfc.topDocs();
+    </code>
+
+    Note that your Sort object cannot use SortField.AUTO when you
+    directly instantiate TopFieldCollector.
+
+    Also, the method search(Weight, Filter, Collector) was added to
+    the Searchable interface and the Searcher abstract class to
+    replace the deprecated HitCollector versions.  If you either
+    implement Searchable or extend Searcher, you should change your
+    code to implement this method.  If you already extend
+    IndexSearcher, no further changes are needed to use Collector.
+    
+    Finally, the values Float.NaN and Float.NEGATIVE_INFINITY are not
+    valid scores.  Lucene uses these values internally in certain
+    places, so if you have hits with such scores, it will cause
+    problems. (Shai Erera via Mike McCandless)
+
+ * LUCENE-1687: All methods and parsers from the interface ExtendedFieldCache
+    have been moved into FieldCache. ExtendedFieldCache is now deprecated and
+    contains only a few declarations for binary backwards compatibility. 
+    ExtendedFieldCache will be removed in version 3.0. Users of FieldCache and 
+    ExtendedFieldCache will be able to plug in Lucene 2.9 without recompilation.
+    The auto cache (FieldCache.getAuto) is now deprecated. Due to the merge of
+    ExtendedFieldCache and FieldCache, FieldCache can now additionally return
+    long[] and double[] arrays in addition to int[] and float[] and StringIndex.
+    
+    The interface changes are only notable for users implementing the interfaces,
+    which is unlikely to have been done, because there is no way to replace
+    Lucene's FieldCache implementation.  (Grant Ingersoll, Uwe Schindler)
+    
+ * LUCENE-1630, LUCENE-1771: Weight, previously an interface, is now an abstract 
+    class. Some of the method signatures have changed, but it should be fairly
+    easy to see what adjustments must be made to existing code to sync up
+    with the new API. You can find more detail in the API Changes section.
+    
+    Going forward Searchable will be kept for convenience only and may
+    be changed between minor releases without any deprecation
+    process. It is not recommended that you implement it, but rather extend
+    Searcher.  
+    (Shai Erera, Chris Hostetter, Martin Ruckli, Mark Miller via Mike McCandless)
+
+ * LUCENE-1422, LUCENE-1693: The new Attribute based TokenStream API (see below)
+    has some backwards breaks in rare cases. We did our best to make the 
+    transition as easy as possible and you are not likely to run into any problems. 
+    If your tokenizers still implement next(Token) or next(), the calls are 
+    automatically wrapped. The indexer and query parser use the new API 
+    (e.g. use incrementToken() calls). All core TokenStreams are implemented using 
+    the new API. You can mix old and new API style TokenFilters/TokenStream. 
+    Problems only occur when you have done the following:
+    You have overridden next(Token) or next() in one of the non-abstract core
+    TokenStreams/-Filters. These classes should normally be final, but some
+    of them are not. In this case, next(Token)/next() would never be called.
+    To fail early with a hard compile/runtime error, the next(Token)/next()
+    methods in these TokenStreams/-Filters were made final in this release.
+    (Michael Busch, Uwe Schindler)
+
+ * LUCENE-1763: MergePolicy now requires an IndexWriter instance to
+    be passed upon instantiation. As a result, IndexWriter was removed
+    as a method argument from all MergePolicy methods. (Shai Erera via
+    Mike McCandless)
+    
+ * LUCENE-1748: LUCENE-1001 introduced PayloadSpans, but this was a back
+    compat break and caused custom SpanQuery implementations to fail at runtime
+    in a variety of ways. This issue attempts to remedy things by causing
+    a compile time break on custom SpanQuery implementations and removing 
+    the PayloadSpans class, with its functionality now moved to Spans. To
+    help in alleviating future back compat pain, Spans has been changed from
+    an interface to an abstract class.
+    (Hugh Cayless, Mark Miller)
+    
+ * LUCENE-1808: Query.createWeight has been changed from protected to
+    public. This will be a back compat break if you have overridden this
+    method - but you are likely already affected by the LUCENE-1693 (make Weight 
+    abstract rather than an interface) back compat break if you have overridden 
+    Query.createWeight, so we have taken the opportunity to make this change.
+    (Tim Smith, Shai Erera via Mark Miller)
+
+ * LUCENE-1708 - IndexReader.document() no longer checks if the document is 
+    deleted. You can call IndexReader.isDeleted(n) prior to calling document(n).
+    (Shai Erera via Mike McCandless)
+
+ 
+Changes in runtime behavior
+
+ * LUCENE-1424: QueryParser now by default uses constant score auto
+    rewriting when it generates a WildcardQuery and PrefixQuery (it
+    already does so for TermRangeQuery, as well).  Call
+    setMultiTermRewriteMethod(MultiTermQuery.SCORING_BOOLEAN_QUERY_REWRITE)
+    to revert to slower BooleanQuery rewriting method.  (Mark Miller via Mike
+    McCandless)
+    
+ * LUCENE-1575: As of 2.9, the core collectors as well as
+    IndexSearcher's search methods that return top N results, no
+    longer filter documents with scores <= 0.0. If you rely on this
+    functionality you can use PositiveScoresOnlyCollector like this:
+
+    <code>
+      TopDocsCollector tdc = new TopScoreDocCollector(10);
+      Collector c = new PositiveScoresOnlyCollector(tdc);
+      searcher.search(query, c);
+      TopDocs hits = tdc.topDocs();
+      ...
+    </code>
+
+ * LUCENE-1604: IndexReader.norms(String field) is now allowed to
+    return null if the field has no norms, as long as you've
+    previously called IndexReader.setDisableFakeNorms(true).  This
+    setting now defaults to false (to preserve the fake norms back
+    compatible behavior) but in 3.0 will be hardwired to true.  (Shon
+    Vella via Mike McCandless).
+
+ * LUCENE-1624: If you open IndexWriter with create=true and
+    autoCommit=false on an existing index, IndexWriter no longer
+    writes an empty commit when it's created.  (Paul Taylor via Mike
+    McCandless)
+
+ * LUCENE-1593: When you call Sort() or Sort.setSort(String field,
+    boolean reverse), the resulting SortField array no longer ends
+    with SortField.FIELD_DOC (it was unnecessary as Lucene breaks ties
+    internally by docID). (Shai Erera via Michael McCandless)
+
+ * LUCENE-1542: When the first token(s) have 0 position increment,
+    IndexWriter used to incorrectly record the position as -1, if no
+    payload is present, or Integer.MAX_VALUE if a payload is present.
+    This causes positional queries to fail to match.  The bug is now
+    fixed, but if your app relies on the buggy behavior then you must
+    call IndexWriter.setAllowMinus1Position().  That API is deprecated
+    so you must fix your application, and rebuild your index, to not
+    rely on this behavior by the 3.0 release of Lucene. (Jonathan
+    Mamou, Mark Miller via Mike McCandless)
+
+
+ * LUCENE-1715: Finalizers have been removed from the 4 core classes
+    that still had them, since they will cause GC to take longer, thus
+    tying up memory for longer, and at best they mask buggy app code.
+    DirectoryReader (returned from IndexReader.open) & IndexWriter
+    previously released the write lock during finalize.
+    SimpleFSDirectory.FSIndexInput closed the descriptor in its
+    finalizer, and NativeFSLock released the lock.  It's possible
+    applications will be affected by this, but only if the application
+    is failing to close reader/writers.  (Brian Groose via Mike
+    McCandless)
+
+ * LUCENE-1717: Fixed IndexWriter to account for RAM usage of
+    buffered deletions.  (Mike McCandless)
+
+ * LUCENE-1727: Ensure that fields are stored & retrieved in the
+    exact order in which they were added to the document.  This was
+    true in all Lucene releases before 2.3, but was broken in 2.3 and
+    2.4, and is now fixed in 2.9.  (Mike McCandless)
+
+ * LUCENE-1678: The addition of Analyzer.reusableTokenStream
+    accidentally broke back compatibility of external analyzers that
+    subclassed core analyzers that implemented tokenStream but not
+    reusableTokenStream.  This is now fixed, such that if
+    reusableTokenStream is invoked on such a subclass, that method
+    will forcefully fallback to tokenStream.  (Mike McCandless)
+    
+ * LUCENE-1801: Token.clear() and Token.clearNoTermBuffer() now also clear
+    startOffset, endOffset and type. This is not likely to affect any
+    Tokenizer chains, as Tokenizers normally always set these three values.
+    This change was made to conform to the new AttributeImpl.clear() and
+    AttributeSource.clearAttributes(), so that clearing works identically for Token
+    (one AttributeImpl for all attributes) and for the 6 separate AttributeImpls.
+    (Uwe Schindler, Michael Busch)
+
+ * LUCENE-1483: When searching over multiple segments, a new Scorer is now created 
+    for each segment. Searching has been telescoped out a level and IndexSearcher now
+    operates much like MultiSearcher does. The Weight is created only once for the top 
+    level Searcher, but each Scorer is passed a per-segment IndexReader. This will 
+    result in doc ids in the Scorer being internal to the per-segment IndexReader. It 
+    has always been outside of the API to count on a given IndexReader to contain every 
+    doc id in the index - and if you have been ignoring MultiSearcher in your custom code 
+    and counting on this fact, you will find your code no longer works correctly. If a 
+    custom Scorer implementation uses any caches/filters that rely on being based on the 
+    top level IndexReader, it will need to be updated to correctly use contextless 
+    caches/filters, e.g. you can't count on the IndexReader to contain any given doc id or 
+    all of the doc ids. (Mark Miller, Mike McCandless)
+
+ * LUCENE-1846: DateTools now uses the US locale to format the numbers in its
+    date/time strings instead of the default locale. For most locales there will
+    be no change in the index format, as DateFormatSymbols is using ASCII digits.
+    The usage of the US locale is important to guarantee correct ordering of
+    generated terms.  (Uwe Schindler)
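
The reason LUCENE-1846 above pins the locale can be shown with plain java.util formatting. This is a minimal sketch assuming a yyyyMMdd style day term; it does not call DateTools, and the class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative only: formats a date as a fixed-width term using Locale.US,
// so the digits are always ASCII regardless of the JVM's default locale.
final class FixedLocaleDateTerm {
    static String dayTerm(int year, int month, int day) {
        // Zero padding keeps lexicographic order equal to chronological order.
        return String.format(Locale.US, "%04d%02d%02d", year, month, day);
    }

    public static void main(String[] args) {
        String earlier = dayTerm(2009, 9, 23);
        String later = dayTerm(2009, 11, 6);
        System.out.println(earlier);                      // prints 20090923
        System.out.println(earlier.compareTo(later) < 0); // prints true
    }
}
```

Because every term uses the same ASCII digits and the same width, comparing terms as strings gives the same order as comparing the dates.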
+
+ * LUCENE-1860: MultiTermQuery now defaults to
+    CONSTANT_SCORE_AUTO_REWRITE_DEFAULT rewrite method (previously it
+    was SCORING_BOOLEAN_QUERY_REWRITE).  This means that PrefixQuery
+    and WildcardQuery will now produce constant score for all matching
+    docs, equal to the boost of the query.  (Mike McCandless)
+
+API Changes
+
+ * LUCENE-1419: Add expert API to set custom indexing chain. This API is 
+   package-protected for now, so we don't have to officially support it.
+   Yet, it will give us the possibility to try out different consumers
+   in the chain. (Michael Busch)
+
+ * LUCENE-1427: DocIdSet.iterator() is now allowed to throw
+   IOException.  (Paul Elschot, Mike McCandless)
+
+ * LUCENE-1422, LUCENE-1693: New TokenStream API that uses a new class called 
+   AttributeSource instead of the Token class, which is now a utility class that
+   holds common Token attributes. All attributes that the Token class had have 
+   been moved into separate classes: TermAttribute, OffsetAttribute, 
+   PositionIncrementAttribute, PayloadAttribute, TypeAttribute and FlagsAttribute. 
+   The new API is much more flexible; it allows combining the Attributes 
+   arbitrarily and defining custom Attributes. The new API has the same 
+   performance as the old next(Token) approach. For conformance with this new 
+   API Tee-/SinkTokenizer was deprecated and replaced by a new TeeSinkTokenFilter. 
+   (Michael Busch, Uwe Schindler; additional contributions and bug fixes by 
+   Daniel Shane, Doron Cohen)
+
+ * LUCENE-1467: Add nextDoc() and next(int) methods to OpenBitSetIterator.
+   These methods can be used to avoid additional calls to doc(). 
+   (Michael Busch)
+
+ * LUCENE-1468: Deprecate Directory.list(), which sometimes (in
+   FSDirectory) filters out files that don't look like index files, in
+   favor of new Directory.listAll(), which does no filtering.  Also,
+   listAll() will never return null; instead, it throws an IOException
+   (or subclass).  Specifically, FSDirectory.listAll() will throw the
+   newly added NoSuchDirectoryException if the directory does not
+   exist.  (Marcel Reutegger, Mike McCandless)
+
+ * LUCENE-1546: Add IndexReader.flush(Map commitUserData), allowing
+   you to record an opaque commitUserData (maps String -> String) into
+   the commit written by IndexReader.  This matches IndexWriter's
+   commit methods.  (Jason Rutherglen via Mike McCandless)
+
+ * LUCENE-652: Added org.apache.lucene.document.CompressionTools, to
+   enable compressing & decompressing binary content, external to
+   Lucene's indexing.  Deprecated Field.Store.COMPRESS.
+
+ * LUCENE-1561: Renamed Field.omitTf to Field.omitTermFreqAndPositions
+    (Otis Gospodnetic via Mike McCandless)
+  
+ * LUCENE-1500: Added new InvalidTokenOffsetsException to Highlighter methods
+    to denote issues when offsets in TokenStream tokens exceed the length of the
+    provided text.  (Mark Harwood)
+    
+ * LUCENE-1575, LUCENE-1483: HitCollector is now deprecated in favor of 
+    a new Collector abstract class. For easy migration, people can use
+    HitCollectorWrapper which translates (wraps) HitCollector into
+    Collector. Note that this class is also deprecated and will be
+    removed when HitCollector is removed.  Also TimeLimitedCollector
+    is deprecated in favor of the new TimeLimitingCollector which
+    extends Collector.  (Shai Erera, Mark Miller, Mike McCandless)
+
+ * LUCENE-1592: The method TermEnum.skipTo() was deprecated, because
+    it is used nowhere in core/contrib and there is only a very inefficient
+    default implementation available. If you want to position a TermEnum
+    to another Term, create a new one using IndexReader.terms(Term).
+    (Uwe Schindler)
+
+ * LUCENE-1621: MultiTermQuery.getTerm() has been deprecated as it does
+    not make sense for all subclasses of MultiTermQuery. Check individual
+    subclasses to see if they support getTerm().  (Mark Miller)
+
+ * LUCENE-1636: Make TokenFilter.input final so it's set only
+    once. (Wouter Heijke, Uwe Schindler via Mike McCandless).
+
+ * LUCENE-1658, LUCENE-1451: Renamed FSDirectory to SimpleFSDirectory
+    (but left an FSDirectory base class).  Added an FSDirectory.open
+    static method to pick a good default FSDirectory implementation
+    given the OS. FSDirectories should now be instantiated using
+    FSDirectory.open or with public constructors rather than
+    FSDirectory.getDirectory(), which has been deprecated.
+    (Michael McCandless, Uwe Schindler, yonik)
+
+ * LUCENE-1665: Deprecate SortField.AUTO, to be removed in 3.0.
+    Instead, when sorting by field, the application should explicitly
+    state the type of the field.  (Mike McCandless)
+
+ * LUCENE-1660: StopFilter, StandardAnalyzer, StopAnalyzer now
+    require up front specification of enablePositionIncrement (Mike
+    McCandless)
+
+ * LUCENE-1614: DocIdSetIterator's next() and skipTo() were deprecated in favor
+    of the new nextDoc() and advance(). The new methods return the doc Id they 
+    landed on, saving an extra call to doc() in most cases.
+    For easy migration of the code, you can change the calls to next() to 
+    nextDoc() != DocIdSetIterator.NO_MORE_DOCS and similarly for skipTo(). 
+    However it is advised that you take advantage of the returned doc ID and not 
+    call doc() following those two.
+    Also, doc() was deprecated in favor of docID(). docID() should return -1 or 
+    NO_MORE_DOCS if nextDoc/advance have not been called yet, or NO_MORE_DOCS if the 
+    iterator is exhausted. Otherwise it should return the current doc ID.
+    (Shai Erera via Mike McCandless)
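
The migration described in LUCENE-1614 above can be sketched with a small stand-in iterator. ArrayDocIterator is a hypothetical class written for this example and does not extend Lucene's DocIdSetIterator; it only mimics the nextDoc(), advance() and docID() contract that the entry describes, with NO_MORE_DOCS assumed to be Integer.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical iterator over a sorted array of doc IDs; not a Lucene class.
final class ArrayDocIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE; // assumed sentinel value

    private final int[] docs; // must be sorted ascending
    private int pos = -1;
    private int current = -1; // -1 until nextDoc()/advance() is first called

    ArrayDocIterator(int... docs) {
        this.docs = docs.clone();
    }

    int docID() {
        return current;
    }

    // Moves to the next doc and returns it, or NO_MORE_DOCS when exhausted.
    int nextDoc() {
        if (pos < docs.length) pos++;
        current = pos < docs.length ? docs[pos] : NO_MORE_DOCS;
        return current;
    }

    // Moves to the first doc >= target and returns it, or NO_MORE_DOCS.
    int advance(int target) {
        int doc;
        do {
            doc = nextDoc();
        } while (doc < target);
        return doc;
    }

    // New-style loop: use the returned doc ID instead of a separate doc() call.
    static List<Integer> collect(ArrayDocIterator it) {
        List<Integer> out = new ArrayList<>();
        int doc;
        while ((doc = it.nextDoc()) != NO_MORE_DOCS) {
            out.add(doc);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(new ArrayDocIterator(1, 4, 9))); // prints [1, 4, 9]
    }
}
```

The loop in collect() is the shape the entry recommends: the value returned by nextDoc() is both the loop condition and the current document, so no follow-up call is needed.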
+
+ * LUCENE-1672: All ctors/opens and other methods using String/File to
+    specify the directory in IndexReader, IndexWriter, and IndexSearcher
+    were deprecated. You should instantiate the Directory manually before
+    and pass it to these classes (LUCENE-1451, LUCENE-1658).
+    (Uwe Schindler)
+
+ * LUCENE-1407: Move RemoteSearchable, RemoteCachingWrapperFilter out
+    of Lucene's core into new contrib/remote package.  Searchable no
+    longer extends java.rmi.Remote (Simon Willnauer via Mike
+    McCandless)
+
+ * LUCENE-1677: The global property
+    org.apache.lucene.SegmentReader.class, and
+    ReadOnlySegmentReader.class are now deprecated, to be removed in
+    3.0.  src/gcj/* has been removed. (Earwin Burrfoot via Mike
+    McCandless)
+
+ * LUCENE-1673: Deprecated NumberTools in favour of the new
+    NumericRangeQuery and its new indexing format for numeric or
+    date values.  (Uwe Schindler)
+    
+ * LUCENE-1630, LUCENE-1771: Weight is now an abstract class, and adds
+    a scorer(IndexReader, boolean /* scoreDocsInOrder */, boolean /*
+    topScorer */) method instead of scorer(IndexReader). IndexSearcher uses 
+    this method to obtain a scorer matching the capabilities of the Collector 
+    with respect to docID ordering. Some Scorers (like BooleanScorer) are much more
+    efficient if out-of-order document scoring is allowed by a Collector.
+    Collector must now implement acceptsDocsOutOfOrder. If you write a 
+    Collector which does not care about doc ID ordering, it is recommended 
+    that you return true.  Weight has a scoresDocsOutOfOrder method, which by 
+    default returns false.  If you create a Weight which will score documents 
+    out of order if requested, you should override that method to return true. 
+    BooleanQuery's setAllowDocsOutOfOrder and getAllowDocsOutOfOrder have been 
+    deprecated as they are not needed anymore. BooleanQuery will now score docs 
+    out of order when used with a Collector that can accept docs out of order.
+    Finally, Weight#explain now takes a sub-reader and sub-docID, rather than
+    a top level reader and docID.
+    (Shai Erera, Chris Hostetter, Martin Ruckli, Mark Miller via Mike McCandless)
+ 	
+ * LUCENE-1466, LUCENE-1906: Added CharFilter and MappingCharFilter, which allow
+    chaining & mapping of characters before tokenizers run. CharStream (subclass of
+    Reader) is the base class for custom java.io.Readers that support offset
+    correction. Tokenizers got an additional method correctOffset() that is passed
+    down to the underlying CharStream if input is a subclass of CharStream/-Filter.
+    (Koji Sekiguchi via Mike McCandless, Uwe Schindler)
+
+ * LUCENE-1703: Add IndexWriter.waitForMerges.  (Tim Smith via Mike
+    McCandless)
+
+ * LUCENE-1625: CheckIndex's programmatic API now returns separate
+    classes detailing the status of each component in the index, and
+    includes more detailed status than previously.  (Tim Smith via
+    Mike McCandless)
+
+ * LUCENE-1713: Deprecated RangeQuery and RangeFilter and renamed to
+    TermRangeQuery and TermRangeFilter. TermRangeQuery is in constant
+    score auto rewrite mode by default. The new classes also have new
+    ctors taking field and term ranges as Strings (see also
+    LUCENE-1424).  (Uwe Schindler)
+
+ * LUCENE-1609: The termInfosIndexDivisor must now be specified
+    up-front when opening the IndexReader.  Attempts to call
+    IndexReader.setTermInfosIndexDivisor will hit an
+    UnsupportedOperationException.  This was done to enable removal of
+    all synchronization in TermInfosReader, which previously could
+    cause threads to pile up in certain cases. (Dan Rosher via Mike
+    McCandless)
+    
+ * LUCENE-1688: Deprecate the static final String stop word array in
+    StopAnalyzer and replace it with an immutable implementation of 
+    CharArraySet.  (Simon Willnauer via Mark Miller)
+
+ * LUCENE-1742: SegmentInfos, SegmentInfo and SegmentReader have been
+    made public as expert, experimental APIs.  These APIs may suddenly
+    change from release to release (Jason Rutherglen via Mike
+    McCandless).
+    
+ * LUCENE-1754: QueryWeight.scorer() can return null if no documents
+    are going to be matched by the query. Similarly,
+    Filter.getDocIdSet() can return null if no documents are going to
+    be accepted by the Filter. Note that these methods may return
+    null but are not required to: they can instead return a Scorer
+    that matches no documents or a DocIdSet that rejects all
+    documents.  This is already the
+    behavior of some QueryWeight/Filter implementations, and is
+    documented here just for emphasis. (Shai Erera via Mike
+    McCandless)
+
+ * LUCENE-1705: Added IndexWriter.deleteAllDocuments.  (Tim Smith via
+    Mike McCandless)
+
+ * LUCENE-1460: Changed TokenStreams/TokenFilters in contrib to
+    use the new TokenStream API. (Robert Muir, Michael Busch)
+
+ * LUCENE-1748: LUCENE-1001 introduced PayloadSpans, but this was a back
+    compat break and caused custom SpanQuery implementations to fail at runtime
+    in a variety of ways. This issue attempts to remedy things by causing
+    a compile time break on custom SpanQuery implementations and removing 
+    the PayloadSpans class, with its functionality now moved to Spans. To
+    help in alleviating future back compat pain, Spans has been changed from
+    an interface to an abstract class.
+    (Hugh Cayless, Mark Miller)
+    
+ * LUCENE-1808: Query.createWeight has been changed from protected to
+    public. (Tim Smith, Shai Erera via Mark Miller)
+
+ * LUCENE-1826: Add constructors that take AttributeSource and
+    AttributeFactory to all Tokenizer implementations.
+    (Michael Busch)
+    
+ * LUCENE-1847: The Similarity#idf methods for both a Term and a Term
+    Collection have been deprecated. New versions that return an
+    IDFExplanation have been added.  (Yasoja Seneviratne, Mike McCandless,
+    Mark Miller)
+    
+ * LUCENE-1877: Made NativeFSLockFactory the default for
+    the new FSDirectory API (open(), FSDirectory subclass ctors).
+    All FSDirectory system properties were deprecated and all lock
+    implementations use no lock prefix if the locks are stored inside
+    the index directory. Because the deprecated String/File ctors of
+    IndexWriter and IndexReader (LUCENE-1672) and FSDirectory.getDirectory()
+    still use the old SimpleFSLockFactory, while the new API uses
+    NativeFSLockFactory, we strongly recommend not mixing the deprecated
+    and new APIs. (Uwe Schindler, Mike McCandless)
+
+ * LUCENE-1911: Added a new method isCacheable() to DocIdSet. This method
+    should return true if the underlying implementation does not use disk
+    I/O and is fast enough to be directly cached by CachingWrapperFilter.
+    OpenBitSet, SortedVIntList, and DocIdBitSet are such candidates.
+    The default implementation of the abstract DocIdSet class returns false.
+    In this case, CachingWrapperFilter copies the DocIdSetIterator into an
+    OpenBitSet for caching.  (Uwe Schindler, Thomas Becker)
+
+Bug fixes
+
+ * LUCENE-1415: MultiPhraseQuery has incorrect hashCode() and equals()
+   implementation - Leads to Solr Cache misses. 
+   (Todd Feak, Mark Miller via yonik)
+
+ * LUCENE-1327: Fix TermSpans#skipTo() to behave as specified in javadocs
+   of Terms#skipTo(). (Michael Busch)
+
+ * LUCENE-1573: Do not ignore InterruptedException (caused by
+   Thread.interrupt()) nor enter deadlock/spin loop. Now, an interrupt
+   will cause a RuntimeException to be thrown.  In 3.0 we will change
+   public APIs to throw InterruptedException.  (Jeremy Volkman via
+   Mike McCandless)
+
+ * LUCENE-1590: Fixed stored-only Field instances so that they do not
+   change the values of omitNorms and omitTermFreqAndPositions in
+   FieldInfo; when you retrieve such fields they will now have
+   omitNorms=true and omitTermFreqAndPositions=false (though these
+   values are unused).
+   (Uwe Schindler via Mike McCandless)
+
+ * LUCENE-1587: RangeQuery#equals() could consider a RangeQuery
+   without a collator equal to one with a collator.
+   (Mark Platvoet via Mark Miller) 
+
+ * LUCENE-1600: Don't call String.intern unnecessarily in some cases
+   when loading documents from the index.  (P Eger via Mike
+   McCandless)
+
+ * LUCENE-1611: Fix case where OutOfMemoryException in IndexWriter
+   could cause "infinite merging" to happen.  (Christiaan Fluit via
+   Mike McCandless)
+
+ * LUCENE-1623: Properly handle back-compatibility of 2.3.x indexes that
+   contain field names with non-ascii characters.  (Mike Streeton via
+   Mike McCandless)
+
+ * LUCENE-1593: MultiSearcher and ParallelMultiSearcher did not break ties (in 
+   sort) by doc Id in a consistent manner (i.e., if Sort.FIELD_DOC was used vs. 
+   when it wasn't). (Shai Erera via Michael McCandless)
+
+ * LUCENE-1647: Fix case where IndexReader.undeleteAll would cause
+    the segment's deletion count to be incorrect. (Mike McCandless)
+
+ * LUCENE-1542: When the first token(s) have 0 position increment,
+    IndexWriter used to incorrectly record the position as -1, if no
+    payload is present, or Integer.MAX_VALUE if a payload is present.
+    This causes positional queries to fail to match.  The bug is now
+    fixed, but if your app relies on the buggy behavior then you must
+    call IndexWriter.setAllowMinus1Position().  That API is deprecated
+    so you must fix your application, and rebuild your index, to not
+    rely on this behavior by the 3.0 release of Lucene. (Jonathan
+    Mamou, Mark Miller via Mike McCandless)
+
+ * LUCENE-1658: Fixed MMapDirectory to correctly throw IOExceptions
+    on EOF, removed numeric overflow possibilities and added support
+    for a hack to unmap the buffers on closing IndexInput.
+    (Uwe Schindler)
+    
+ * LUCENE-1681: Fix infinite loop caused by a call to DocValues methods 
+    getMinValue, getMaxValue, getAverageValue. (Simon Willnauer via Mark Miller)
+
+ * LUCENE-1599: Add clone support for SpanQuerys. SpanRegexQuery counts
+    on this functionality and does not work correctly without it.
+    (Billow Gao, Mark Miller)
+
+ * LUCENE-1718: Fix termInfosIndexDivisor to carry over to reopened
+    readers (Mike McCandless)
+    
+ * LUCENE-1583: SpanOrQuery skipTo() doesn't always move forwards as Spans
+    documentation indicates it should.  (Moti Nisenson via Mark Miller)
+
+ * LUCENE-1566: Sun JVM Bug
+    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6478546 causes
+    invalid OutOfMemoryError when reading too many bytes at once from
+    a file on 32bit JVMs that have a large maximum heap size.  This
+    fix adds set/getReadChunkSize to FSDirectory so that large reads
+    are broken into chunks, to work around this JVM bug.  On 32bit
+    JVMs the default chunk size is 100 MB; on 64bit JVMs, which don't
+    show the bug, the default is Integer.MAX_VALUE. (Simon Willnauer
+    via Mike McCandless)
+    
+ * LUCENE-1448: Added TokenStream.end() to perform end-of-stream
+    operations (i.e. to return the end offset of the tokenization).
+    This is important when multiple fields with the same name are added
+    to a document, to ensure offsets recorded in term vectors for all 
+    of the instances are correct.  
+    (Mike McCandless, Mark Miller, Michael Busch)
+
+ * LUCENE-1805: CloseableThreadLocal did not allow a null Object in get(), 
+    although it does allow it in set(Object). Fix get() to not assert the object
+    is not null. (Shai Erera via Mike McCandless)
+    
+ * LUCENE-1801: Changed all Tokenizers or TokenStreams (in core/contrib)
+    that are the source of Tokens to always call
+    AttributeSource.clearAttributes() first. (Uwe Schindler)
+    
+ * LUCENE-1819: MatchAllDocsQuery.toString(field) should produce output
+    that is parsable by the QueryParser.  (John Wang, Mark Miller)
+
+ * LUCENE-1836: Fix localization bug in the new query parser and add 
+    new LocalizedTestCase as base class for localization junit tests.
+    (Robert Muir, Uwe Schindler via Michael Busch)
+
+ * LUCENE-1847: PhraseQuery/TermQuery/SpanQuery use IndexReader specific stats 
+    in their Weight#explain methods - these stats should be corpus wide.
+    (Yasoja Seneviratne, Mike McCandless, Mark Miller)
+
+ * LUCENE-1885: Fix the bug that NativeFSLock.isLocked() did not work
+    if the lock was obtained by another NativeFSLock(Factory) instance.
+    Because of this IndexReader.isLocked() and IndexWriter.isLocked() did
+    not work correctly.  (Uwe Schindler)
+
+ * LUCENE-1899: Fix O(N^2) CPU cost when setting docIDs in order in an
+    OpenBitSet, due to an inefficiency in how the underlying storage is
+    reallocated.  (Nadav Har'El via Mike McCandless)
+
+ * LUCENE-1918: Fixed cases where a ParallelReader would
+   generate exceptions on being passed to
+   IndexWriter.addIndexes(IndexReader[]).  First case was when the
+   ParallelReader was empty.  Second case was when the ParallelReader
+   used to contain documents with TermVectors, but all such documents
+   have been deleted. (Christian Kohlschütter via Mike McCandless)
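+
+ The chunked-read workaround described under LUCENE-1566 can be
+ illustrated with a minimal sketch. This is not Lucene's FSDirectory
+ code: the class ChunkedReads, the method readFully and its chunkSize
+ parameter are hypothetical names chosen for illustration, and a plain
+ InputStream stands in for the index file. The idea is only that no
+ single read call asks for more than chunkSize bytes.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper for illustration; not part of Lucene. */
final class ChunkedReads {
    private ChunkedReads() {}

    /**
     * Fills dest[offset .. offset + length) from the stream, never
     * requesting more than chunkSize bytes in one read call.
     */
    static void readFully(InputStream in, byte[] dest, int offset,
                          int length, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int total = 0;
        while (total < length) {
            // Cap each request at chunkSize bytes.
            int toRead = Math.min(chunkSize, length - total);
            int n = in.read(dest, offset + total, toRead);
            if (n == -1) {
                throw new EOFException("read past EOF");
            }
            total += n;
        }
    }
}
```

+ A smaller chunkSize trades a few extra read calls for a bounded
+ request size per call.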
+
+New features
+
+ * LUCENE-1411: Added expert API to open an IndexWriter on a prior
+    commit, obtained from IndexReader.listCommits.  This makes it
+    possible to rollback changes to an index even after you've closed
+    the IndexWriter that made the changes, assuming you are using an
+    IndexDeletionPolicy that keeps past commits around.  This is useful
+    when building transactional support on top of Lucene.  (Mike
+    McCandless)
+
+ * LUCENE-1382: Add an optional arbitrary Map (String -> String)
+    "commitUserData" to IndexWriter.commit(), which is stored in the
+    segments file and is then retrievable via
+    IndexReader.getCommitUserData instance and static methods.
+    (Shalin Shekhar Mangar via Mike McCandless)
+
+ * LUCENE-1420: Similarity now has a computeNorm method that allows
+    custom Similarity classes to override how norm is computed.  It's
+    provided a FieldInvertState instance that contains details from
+    inverting the field.  The default impl is boost *
+    lengthNorm(numTerms), to be backwards compatible.  Also added
+    {set/get}DiscountOverlaps to DefaultSimilarity, to control whether
+    overlapping tokens (tokens with 0 position increment) should be
+    counted in lengthNorm.  (Andrzej Bialecki via Mike McCandless)
+
+ * LUCENE-1424: Moved constant score query rewrite capability into
+    MultiTermQuery, allowing TermRangeQuery, PrefixQuery and WildcardQuery
+    to switch between constant-score rewriting or BooleanQuery
+    expansion rewriting via a new setRewriteMethod method.
+    Deprecated ConstantScoreRangeQuery (Mark Miller via Mike
+    McCandless)
+
+ * LUCENE-1461: Added FieldCacheRangeFilter, a RangeFilter for
+    single-term fields that uses FieldCache to compute the filter.  If
+    your documents all have a single term for a given field, and you
+    need to create many RangeFilters with varying lower/upper bounds,
+    then this is likely a much faster way to create the filters than
+    RangeFilter.  FieldCacheRangeFilter allows ranges on all data types
+    that FieldCache supports (term ranges, byte, short, int, long, float,
+    double).  However, it comes at the expense of added RAM consumption and
+    slower first-time usage due to populating the FieldCache.  It also does
+    not support collation.  (Tim Sturge, Matt Ericson via Mike McCandless
+    and Uwe Schindler)
+
+ * LUCENE-1296: add protected method CachingWrapperFilter.docIdSetToCache 
+    to allow subclasses to choose which DocIdSet implementation to use
+    (Paul Elschot via Mike McCandless)
+    
+ * LUCENE-1390: Added ASCIIFoldingFilter, a Filter that converts 
+    alphabetic, numeric, and symbolic Unicode characters which are not in 
+    the first 127 ASCII characters (the "Basic Latin" Unicode block) into 
+    their ASCII equivalents, if one exists. ISOLatin1AccentFilter, which
+    handles a subset of this filter, has been deprecated.
+    (Andi Vajda, Steven Rowe via Mark Miller)
+
+ * LUCENE-1478: Added new SortField constructor allowing you to
+    specify a custom FieldCache parser to generate numeric values from
+    terms for a field.  (Uwe Schindler via Mike McCandless)
+
+ * LUCENE-1528: Add support for Ideographic Space to the queryparser.
+    (Luis Alves via Michael Busch)
+
+ * LUCENE-1487: Added FieldCacheTermsFilter, to filter by multiple
+    terms on single-valued fields.  The filter loads the FieldCache
+    for the field the first time it's called, and subsequent usage of
+    that field, even with different Terms in the filter, is fast.
+    (Tim Sturge, Shalin Shekhar Mangar via Mike McCandless).
+
+ * LUCENE-1314: Add clone(), clone(boolean readOnly) and
+    reopen(boolean readOnly) to IndexReader.  Cloning an IndexReader
+    gives you a new reader which you can make changes to (deletions,
+    norms) without affecting the original reader.  Now, with clone or
+    reopen you can change the readOnly setting of the original reader.  (Jason
+    Rutherglen, Mike McCandless)
+
+ * LUCENE-1506: Added FilteredDocIdSet, an abstract class which you
+    subclass to implement the "match" method to accept or reject each
+    docID.  Unlike ChainedFilter (under contrib/misc),
+    FilteredDocIdSet never requires you to materialize the full
+    bitset.  Instead, match() is called on demand per docID.  (John
+    Wang via Mike McCandless)
+
+ * LUCENE-1398: Add ReverseStringFilter to contrib/analyzers, a filter
+    to reverse the characters in each token.  (Koji Sekiguchi via yonik)
+
+ * LUCENE-1551: Add expert IndexReader.reopen(IndexCommit) to allow
+    efficiently opening a new reader on a specific commit, sharing
+    resources with the original reader.  (Torin Danil via Mike
+    McCandless)
+
+ * LUCENE-1434: Added org.apache.lucene.util.IndexableBinaryStringTools,
+    to encode byte[] as String values that are valid terms, and
+    maintain sort order of the original byte[] when the bytes are
+    interpreted as unsigned.  (Steven Rowe via Mike McCandless)
+
+ * LUCENE-1543: Allow MatchAllDocsQuery to optionally use norms from
+    a specific field to set the score for a document.  (Karl Wettin
+    via Mike McCandless)
+
+ * LUCENE-1586: Add IndexReader.getUniqueTermCount().  (Mike
+    McCandless via Derek)
+
+ * LUCENE-1516: Added "near real-time search" to IndexWriter, via a
+    new expert getReader() method.  This method returns a reader that
+    searches the full index, including any uncommitted changes in the
+    current IndexWriter session.  This should result in a faster
+    turnaround than the normal approach of committing the changes and
+    then reopening a reader.  (Jason Rutherglen via Mike McCandless)
+
+ * LUCENE-1603: Added new MultiTermQueryWrapperFilter, to wrap any
+    MultiTermQuery as a Filter.  Also made some improvements to
+    MultiTermQuery: return DocIdSet.EMPTY_DOCIDSET if there are no
+    terms in the enum; track the total number of terms it visited
+    during rewrite (getTotalNumberOfTerms).  FilteredTermEnum is also
+    more friendly to subclassing.  (Uwe Schindler via Mike McCandless)
+
+ * LUCENE-1605: Added BitVector.subset().  (Jeremy Volkman via Mike
+    McCandless)
+    
+ * LUCENE-1618: Added FileSwitchDirectory that enables files with
+    specified extensions to be stored in a primary directory and the
+    rest of the files to be stored in the secondary directory.  For
+    example, this can be useful for the large doc-store (stored
+    fields, term vectors) files in FSDirectory and the rest of the
+    index files in a RAMDirectory. (Jason Rutherglen via Mike
+    McCandless)
+
+ * LUCENE-1494: Added FieldMaskingSpanQuery which can be used to
+    cross-correlate Spans from different fields.
+    (Paul Cowan and Chris Hostetter)
+
+ * LUCENE-1634: Add calibrateSizeByDeletes to LogMergePolicy, to take
+    deletions into account when considering merges.  (Yasuhiro Matsuda
+    via Mike McCandless)
+
+ * LUCENE-1550: Added new n-gram based String distance measure for spell checking.
+    See the Javadocs for NGramDistance.java for a reference paper on why
+    this is helpful (Tom Morton via Grant Ingersoll)
+
+ * LUCENE-1470, LUCENE-1582, LUCENE-1602, LUCENE-1673, LUCENE-1701, LUCENE-1712:
+    Added NumericRangeQuery and NumericRangeFilter, a fast alternative to
+    RangeQuery/RangeFilter for numeric searches. They depend on a specific
+    structure of terms in the index that can be created by indexing
+    using the new NumericField or NumericTokenStream classes. NumericField
+    can only be used for indexing and optionally stores the values as
+    string representation in the doc store. Documents returned from
+    IndexReader/IndexSearcher will return only the String value using
+    the standard Fieldable interface. NumericFields can be sorted on
+    and loaded into the FieldCache.  (Uwe Schindler, Yonik Seeley,
+    Mike McCandless)
+
+ * LUCENE-1405: Added support for Ant resource collections in contrib/ant
+    <index> task.  (Przemyslaw Sztoch via Erik Hatcher)
+
+ * LUCENE-1699: Allow setting a TokenStream on Field/Fieldable for indexing
+    in conjunction with any other ways to specify stored field values,
+    currently binary or string values.  (yonik)
+    
+ * LUCENE-1701: Made the standard FieldCache.Parsers public and added
+    parsers for fields generated using NumericField/NumericTokenStream.
+    All standard parsers now also implement Serializable and enforce
+    their singleton status.  (Uwe Schindler, Mike McCandless)
+    
+ * LUCENE-1741: User configurable maximum chunk size in MMapDirectory.
+    On 32 bit platforms, the address space can be very fragmented, so
+    one big ByteBuffer for the whole file may not fit into address space.
+    (Eks Dev via Uwe Schindler)
+
+ * LUCENE-1644: Enable 4 rewrite modes for queries deriving from
+    MultiTermQuery (WildcardQuery, PrefixQuery, TermRangeQuery,
+    NumericRangeQuery): CONSTANT_SCORE_FILTER_REWRITE first creates a
+    filter and then assigns constant score (boost) to docs;
+    CONSTANT_SCORE_BOOLEAN_QUERY_REWRITE create a BooleanQuery but
+    uses a constant score (boost); SCORING_BOOLEAN_QUERY_REWRITE also
+    creates a BooleanQuery but keeps the BooleanQuery's scores;
+    CONSTANT_SCORE_AUTO_REWRITE tries to pick the most performant
+    constant-score rewrite method.  (Mike McCandless)
+    
+ * LUCENE-1448: Added TokenStream.end(), to perform end-of-stream
+    operations.  This is currently used to fix offset problems when 
+    multiple fields with the same name are added to a document.
+    (Mike McCandless, Mark Miller, Michael Busch)
+ 
+ * LUCENE-1776: Add an option to not collect payloads for an ordered
+    SpanNearQuery. Payloads were not lazily loaded in this case as
+    the javadocs implied. If you have payloads and want to use an ordered
+    SpanNearQuery that does not need to use the payloads, you can
+    disable loading them with a new constructor switch.  (Mark Miller)
+
+ * LUCENE-1341: Added PayloadNearQuery to enable SpanNearQuery functionality
+    with payloads (Peter Keegan, Grant Ingersoll, Mark Miller)
+
+ * LUCENE-1790: Added PayloadTermQuery to enable scoring of payloads
+    based on the maximum payload seen for a document.
+    Slight refactoring of Similarity and other payload queries (Grant Ingersoll, Mark Miller)
+
+ * LUCENE-1749: Addition of FieldCacheSanityChecker utility, and
+    hooks to use it in all existing Lucene Tests.  This class can
+    be used by any application to inspect the FieldCache and provide
+    diagnostic information about the possibility of inconsistent
+    FieldCache usage.  Namely: FieldCache entries for the same field
+    with different datatypes or parsers; and FieldCache entries for
+    the same field in both a reader, and one of its (descendant) sub
+    readers. 
+    (Chris Hostetter, Mark Miller)
+
+ * LUCENE-1789: Added utility class
+    oal.search.function.MultiValueSource to ease the transition to
+    segment based searching for any apps that directly call
+    oal.search.function.* APIs.  This class wraps any other
+    ValueSource, but takes care when composite (multi-segment) readers are
+    passed to not double RAM usage in the FieldCache.  (Chris
+    Hostetter, Mark Miller, Mike McCandless)
+   
+Optimizations
+
+ * LUCENE-1427: Fixed QueryWrapperFilter to not waste time computing
+    scores of the query, since they are just discarded.  Also, made it
+    more efficient (single pass) by not creating & populating an
+    intermediate OpenBitSet (Paul Elschot, Mike McCandless)
+
+ * LUCENE-1443: Performance improvement for OpenBitSetDISI.inPlaceAnd()
+    (Paul Elschot via yonik)
+
+ * LUCENE-1484: Remove synchronization of IndexReader.document() by
+    using CloseableThreadLocal internally.  (Jason Rutherglen via Mike
+    McCandless).
+    
+ * LUCENE-1124: Short circuit FuzzyQuery.rewrite when input token length 
+    is small compared to minSimilarity. (Timo Nentwig, Mark Miller)
+
+ * LUCENE-1316: MatchAllDocsQuery now avoids the synchronized
+    IndexReader.isDeleted() call per document, by directly accessing
+    the underlying deleteDocs BitVector.  This improves performance
+    with non-readOnly readers, especially in a multi-threaded
+    environment.  (Todd Feak, Yonik Seeley, Jason Rutherglen via Mike
+    McCandless)
+
+ * LUCENE-1483: When searching over multiple segments we now visit
+    each sub-reader one at a time.  This speeds up warming, since
+    FieldCache entries (if required) can be shared across reopens for
+    those segments that did not change, and also speeds up searches
+    that sort by relevance or by field values.  (Mark Miller, Mike
+    McCandless)
+    
+ * LUCENE-1575: The new Collector class decouples collect() from
+    score computation.  Collector.setScorer is called to establish the
+    current Scorer in-use per segment.  Collectors that require the
+    score should then call Scorer.score() per hit inside
+    collect(). (Shai Erera via Mike McCandless)
+
+ * LUCENE-1596: MultiTermDocs speedup when set with
+    MultiTermDocs.seek(MultiTermEnum) (yonik)
+    
+ * LUCENE-1653: Avoid creating a Calendar in every call to 
+    DateTools#dateToString, DateTools#timeToString and
+    DateTools#round.  (Shai Erera via Mark Miller)
+    
+ * LUCENE-1688: Deprecate static final String stop word array and 
+    replace it with an immutable implementation of CharArraySet.
+    Removes conversions between Set and array.
+    (Simon Willnauer via Mark Miller)
+
+ * LUCENE-1754: BooleanQuery.queryWeight.scorer() will return null if
+    it won't match any documents (e.g. if there are no required and
+    optional scorers, or not enough optional scorers to satisfy
+    minShouldMatch).  (Shai Erera via Mike McCandless)
+
+ * LUCENE-1607: To speed up string interning for commonly used
+    strings, the StringHelper.intern() interface was added with a
+    default implementation that uses a lockless cache.
+    (Earwin Burrfoot, yonik)
+
+ * LUCENE-1800: QueryParser should use reusable TokenStreams. (yonik)
+    
+
+Documentation
+
+ * LUCENE-1908: Scoring documentation improvements in Similarity javadocs.
+   (Mark Miller, Shai Erera, Ted Dunning, Jiri Kuhn, Marvin Humphrey, Doron Cohen)
+    
+ * LUCENE-1872: NumericField javadoc improvements
+    (Michael McCandless, Uwe Schindler)
+ 
+ * LUCENE-1875: Make TokenStream.end javadoc less confusing.
+    (Uwe Schindler)
+
+ * LUCENE-1862: Rectified duplicate package level javadocs for
+    o.a.l.queryParser and o.a.l.analysis.cn.
+    (Chris Hostetter)
+
+ * LUCENE-1886: Improved hyperlinking in key Analysis javadocs
+    (Bernd Fondermann via Chris Hostetter)
+
+ * LUCENE-1884: massive javadoc and comment cleanup, primarily dealing with
+    typos.
+    (Robert Muir via Chris Hostetter)
+    
+ * LUCENE-1898: Switch changes to use bullets rather than numbers and 
+    update changes-to-html script to handle the new format. 
+    (Steven Rowe, Mark Miller)
+    
+ * LUCENE-1900: Improve Searchable Javadoc.
+    (Nadav Har'El, Doron Cohen, Marvin Humphrey, Mark Miller)
+    
+ * LUCENE-1896: Improve Similarity#queryNorm javadocs.
+    (Jiri Kuhn, Mark Miller)
+
+Build
+
+ * LUCENE-1440: Add new targets to build.xml that allow downloading
+    and executing the junit testcases from an older release for
+    backwards-compatibility testing. (Michael Busch)
+
+ * LUCENE-1446: Add compatibility tag to common-build.xml and run 
+    backwards-compatibility tests in the nightly build. (Michael Busch)
+
+ * LUCENE-1529: Properly test "drop-in" replacement of jar with 
+    backwards-compatibility tests. (Mike McCandless, Michael Busch)
+
+ * LUCENE-1851: Change 'javacc' and 'clean-javacc' targets to build
+    and clean contrib/surround files. (Luis Alves via Michael Busch)
+
+ * LUCENE-1854: tar task should use longfile="gnu" to avoid false file
+    name length warnings.  (Mark Miller)
+
+Test Cases
+
+ * LUCENE-1791: Enhancements to the QueryUtils and CheckHits utility 
+    classes to wrap IndexReaders and Searchers in MultiReaders or 
+    MultiSearcher when possible to help exercise more edge cases.
+    (Chris Hostetter, Mark Miller)
+
+ * LUCENE-1852: Fix localization test failures. 
+    (Robert Muir via Michael Busch)
+    
+ * LUCENE-1843: Refactored all tests that use assertAnalyzesTo() & others
+    in core and contrib to use a new BaseTokenStreamTestCase
+    base class. Also rewrote some tests to use these general analysis assert
+    functions instead of their own (e.g. TestMappingCharFilter).
+    The new base class also tests tokenization with the TokenStream.next()
+    backwards layer enabled (using Token/TokenWrapper as attribute
+    implementation) and disabled (default for Lucene 3.0)
+    (Uwe Schindler, Robert Muir)
+    
+ * LUCENE-1836: Added a new LocalizedTestCase as base class for localization
+    junit tests.  (Robert Muir, Uwe Schindler via Michael Busch)
+
+======================= Release 2.4.1 2009-03-09 =======================
+
+API Changes
+
+1. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
+   resources.  (Christian Kohlschütter via Mike McCandless)
+
+Bug fixes
+
+1. LUCENE-1452: Fixed silent data-loss case whereby binary fields are
+   truncated to 0 bytes during merging if the segments being merged
+   are non-congruent (same field name maps to different field
+   numbers).  This bug was introduced with LUCENE-1219.  (Andrzej
+   Bialecki via Mike McCandless).
+
+2. LUCENE-1429: Don't throw incorrect IllegalStateException from
+   IndexWriter.close() if you've hit an OOM when autoCommit is true.
+   (Mike McCandless)
+
+3. LUCENE-1474: If IndexReader.flush() is called twice when there were
+   pending deletions, it could lead to later false AssertionError
+   during IndexReader.open.  (Mike McCandless)
+
+4. LUCENE-1430: Fix false AlreadyClosedException from IndexReader.open
+   (masking an actual IOException) that takes String or File path.
+   (Mike McCandless)
+
+5. LUCENE-1442: Multiple-valued NOT_ANALYZED fields can double-count
+   token offsets.  (Mike McCandless)
+
+6. LUCENE-1453: Ensure IndexReader.reopen()/clone() does not result in
+   incorrectly closing the shared FSDirectory. This bug would only
+   happen if you use IndexReader.open() with a File or String argument.
+   The returned readers are wrapped by a FilterIndexReader that
+   correctly handles closing of directory after reopen()/clone(). 
+   (Mark Miller, Uwe Schindler, Mike McCandless)
+
+7. LUCENE-1457: Fix possible overflow bugs during binary
+   searches. (Mark Miller via Mike McCandless)
+
+8. LUCENE-1459: Fix CachingWrapperFilter to not throw exception if
+   both bits() and getDocIdSet() methods are called. (Matt Jones via
+   Mike McCandless)
+
+9. LUCENE-1519: Fix int overflow bug during segment merging.  (Deepak
+   via Mike McCandless)
+
+10. LUCENE-1521: Fix int overflow bug when flushing segment.
+    (Shon Vella via Mike McCandless).
+
+11. LUCENE-1544: Fix deadlock in IndexWriter.addIndexes(IndexReader[]).
+    (Mike McCandless via Doug Sale)
+
+12. LUCENE-1547: Fix rare thread safety issue if two threads call
+    IndexWriter commit() at the same time.  (Mike McCandless)
+
+13. LUCENE-1465: NearSpansOrdered returns payloads from first possible match 
+    rather than the correct, shortest match; Payloads could be returned even
+    if the max slop was exceeded; The wrong payload could be returned in 
+    certain situations. (Jonathan Mamou, Greg Shackles, Mark Miller)
+
+14. LUCENE-1186: Add Analyzer.close() to free internal ThreadLocal
+    resources.  (Christian Kohlschütter via Mike McCandless)
+
+15. LUCENE-1552: Fix IndexWriter.addIndexes(IndexReader[]) to properly
+    rollback IndexWriter's internal state on hitting an
+    exception. (Scott Garland via Mike McCandless)
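+
+ The entry for LUCENE-1457 does not say which overflows were fixed. A
+ common form of the problem in binary searches is computing the midpoint
+ as (low + high) / 2, which wraps to a negative value once low + high
+ exceeds Integer.MAX_VALUE. The sketch below shows one well-known remedy
+ (an unsigned shift). The class SafeBinarySearch and its methods are
+ hypothetical and are not taken from Lucene's source.

```java
/** Hypothetical illustration; not Lucene's code. */
final class SafeBinarySearch {
    private SafeBinarySearch() {}

    /** Midpoint of low..high without int overflow; assumes 0 <= low <= high. */
    static int midpoint(int low, int high) {
        // The sum may wrap to a negative int, but the unsigned shift
        // reinterprets it as a 32-bit unsigned value before halving.
        return (low + high) >>> 1;
    }

    /**
     * Binary search over a sorted array. Returns the index of key, or
     * -(insertionPoint + 1) if key is absent.
     */
    static int search(int[] sorted, int key) {
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = midpoint(low, high);
            if (sorted[mid] < key) {
                low = mid + 1;
            } else if (sorted[mid] > key) {
                high = mid - 1;
            } else {
                return mid;
            }
        }
        return -(low + 1);
    }
}
```

+ An equivalent alternative is low + (high - low) / 2, which also stays
+ within int range under the same assumption.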
+
+======================= Release 2.4.0 2008-10-06 =======================
+
+Changes in backwards compatibility policy
+
+1. LUCENE-1340: In a minor change to Lucene's backward compatibility
+   policy, we are now allowing the Fieldable interface to have
+   changes, within reason, and made on a case-by-case basis.  If an
+   application implements its own Fieldable, please be aware of
+   this.  Otherwise, no need to be concerned.  This is in effect for
+   all 2.X releases, starting with 2.4.  Also note, that in all
+   likelihood, Fieldable will be changed in 3.0.
+
+
+Changes in runtime behavior
+
+ 1. LUCENE-1151: Fix StandardAnalyzer to not mis-identify host names
+    (e.g. lucene.apache.org) as an ACRONYM.  To get back to the pre-2.4
+    backwards compatible, but buggy, behavior, you can either call
+    StandardAnalyzer.setDefaultReplaceInvalidAcronym(false) (static
+    method), or, set system property
+    org.apache.lucene.analysis.standard.StandardAnalyzer.replaceInvalidAcronym
+    to "false" on JVM startup.  All StandardAnalyzer instances created
+    after that will then show the pre-2.4 behavior.  Alternatively,
+    you can call setReplaceInvalidAcronym(false) to change the
+    behavior per instance of StandardAnalyzer.  This backwards
+    compatibility will be removed in 3.0 (hardwiring the value to
+    true).  (Mike McCandless)
+
+ 2. LUCENE-1044: IndexWriter with autoCommit=true now commits (such
+    that a reader can see the changes) far less often than it used to.
+    Previously, every flush was also a commit.  You can always force a
+    commit by calling IndexWriter.commit().  Furthermore, in 3.0,
+    autoCommit will be hardwired to false (IndexWriter constructors
+    that take an autoCommit argument have been deprecated) (Mike
+    McCandless)
+
+ 3. LUCENE-1335: IndexWriter.addIndexes(Directory[]) and
+    addIndexesNoOptimize no longer allow the same Directory instance
+    to be passed in more than once.  Internally, IndexWriter uses
+    Directory and segment name to uniquely identify segments, so
+    adding the same Directory more than once was causing duplicates
+    which led to problems (Mike McCandless)
+
+ 4. LUCENE-1396: Improve PhraseQuery.toString() so that gaps in the
+    positions are indicated with a ? and multiple terms at the same
+    position are joined with a |.  (Andrzej Bialecki via Mike
+    McCandless)
+
+API Changes
+
+ 1. LUCENE-1084: Changed all IndexWriter constructors to take an
+    explicit parameter for maximum field size.  Deprecated all the
+    pre-existing constructors; these will be removed in release 3.0.
+    NOTE: these new constructors set autoCommit to false.  (Steven
+    Rowe via Mike McCandless)
+
+ 2. LUCENE-584: Changed Filter API to return a DocIdSet instead of a
+    java.util.BitSet. This allows using more efficient data structures
+    for Filters and makes them more flexible. This deprecates
+    Filter.bits(), so all filters that implement this outside
+    the Lucene code base will need to be adapted. See also the javadocs
+    of the Filter class. (Paul Elschot, Michael Busch)
+
+ 3. LUCENE-1044: Added IndexWriter.commit() which flushes any buffered
+    adds/deletes and then commits a new segments file so readers will
+    see the changes.  Deprecate IndexWriter.flush() in favor of
+    IndexWriter.commit().  (Mike McCandless)
+
+ 4. LUCENE-325: Added IndexWriter.expungeDeletes methods, which
+    consult the MergePolicy to find merges necessary to merge away all
+    deletes from the index.  This should be a somewhat lower cost
+    operation than optimize.  (John Wang via Mike McCandless)
+
+ 5. LUCENE-1233: Return empty array instead of null when no fields
+    match the specified name in these methods in Document:
+    getFieldables, getFields, getValues, getBinaryValues.  (Stefan
+    Trcek via Mike McCandless)
+
+ 6. LUCENE-1234: Make BoostingSpanScorer protected.  (Andi Vajda via Grant Ingersoll)
+
+ 7. LUCENE-510: The index now stores strings as true UTF-8 bytes
+    (previously it was Java's modified UTF-8).  If any text, either
+    stored fields or a token, has illegal UTF-16 surrogate characters,
+    these characters are now silently replaced with the Unicode
+    replacement character U+FFFD.  This is a change to the index file
+    format.  (Marvin Humphrey via Mike McCandless)
+
+ 8. LUCENE-852: Let the SpellChecker caller specify IndexWriter mergeFactor
+    and RAM buffer size.  (Otis Gospodnetic)
+	
+ 9. LUCENE-1290: Deprecate org.apache.lucene.search.Hits, Hit and HitIterator
+    and remove all references to these classes from the core. Also update demos
+    and tutorials. (Michael Busch)
+
+10. LUCENE-1288: Add getVersion() and getGeneration() to IndexCommit.
+    getVersion() returns the same value that IndexReader.getVersion()
+    returns when the reader is opened on the same commit.  (Jason
+    Rutherglen via Mike McCandless)
+
+11. LUCENE-1311: Added IndexReader.listCommits(Directory) static
+    method to list all commits in a Directory, plus IndexReader.open
+    methods that accept an IndexCommit and open the index as of that
+    commit.  These methods are only useful if you implement a custom
+    DeletionPolicy that keeps more than the last commit around.
+    (Jason Rutherglen via Mike McCandless)
+
+12. LUCENE-1325: Added IndexCommit.isOptimized().  (Shalin Shekhar
+    Mangar via Mike McCandless)
+
+13. LUCENE-1324: Added TokenFilter.reset(). (Shai Erera via Mike
+    McCandless)
+
+14. LUCENE-1340: Added Fieldable.omitTf() method to skip indexing term
+    frequency, positions and payloads.  This saves index space, and
+    indexing/searching time.  (Eks Dev via Mike McCandless)
+
+15. LUCENE-1219: Add basic reuse API to Fieldable for binary fields:
+    getBinaryValue/Offset/Length(); currently only lazy fields reuse
+    the provided byte[] result to getBinaryValue.  (Eks Dev via Mike
+    McCandless)
+
+16. LUCENE-1334: Add new constructor for Term: Term(String fieldName)
+    which defaults term text to "".  (DM Smith via Mike McCandless)
+
+17. LUCENE-1333: Added Token.reinit(*) APIs to re-initialize (reuse) a
+    Token.  Also added term() method to return a String, with a
+    performance penalty clearly documented.  Also implemented
+    hashCode() and equals() in Token, and fixed all core and contrib
+    analyzers to use the re-use APIs.  (DM Smith via Mike McCandless)
+
+18. LUCENE-1329: Add optional readOnly boolean when opening an
+    IndexReader.  A readOnly reader is not allowed to make changes
+    (deletions, norms) to the index; in exchange, the isDeleted
+    method, often a bottleneck when searching with many threads, is
+    not synchronized.  The default for readOnly is still false, but in
+    3.0 the default will become true.  (Jason Rutherglen via Mike
+    McCandless)
+
+19. LUCENE-1367: Add IndexCommit.isDeleted().  (Shalin Shekhar Mangar
+    via Mike McCandless)
+
+20. LUCENE-1061: Factored out all "new XXXQuery(...)" in
+    QueryParser.java into protected methods newXXXQuery(...) so that
+    subclasses can create their own subclasses of each Query type.
+    (John Wang via Mike McCandless)
+
+21. LUCENE-753: Added new Directory implementation
+    org.apache.lucene.store.NIOFSDirectory, which uses java.nio's
+    FileChannel to do file reads.  On most non-Windows platforms, with
+    many threads sharing a single searcher, this may yield sizable
+    improvement to query throughput when compared to FSDirectory,
+    which only allows a single thread to read from an open file at a
+    time.  (Jason Rutherglen via Mike McCandless)
+
+22. LUCENE-1371: Added convenience method TopDocs Searcher.search(Query query, int n).
+    (Mike McCandless)
+    
+23. LUCENE-1356: Allow easy extensions of TopDocCollector by turning
+    constructor and fields from package to protected. (Shai Erera
+    via Doron Cohen) 
+
+24. LUCENE-1375: Added convenience method IndexCommit.getTimestamp,
+    which is equivalent to
+    getDirectory().fileModified(getSegmentsFileName()).  (Mike McCandless)
+
+25. LUCENE-1366: Rename Field.Index options to be more accurate:
+    TOKENIZED becomes ANALYZED;  UN_TOKENIZED becomes NOT_ANALYZED;
+    NO_NORMS becomes NOT_ANALYZED_NO_NORMS and a new ANALYZED_NO_NORMS
+    is added.  (Mike McCandless)
+
+26. LUCENE-1131: Added numDeletedDocs method to IndexReader (Otis Gospodnetic)
+
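The encoding change in LUCENE-510 can be illustrated with the JDK alone. The sketch below is not Lucene code; Utf8Sketch and its method names are hypothetical. It compares the byte length of standard UTF-8 with Java's modified UTF-8 (as written by DataOutputStream.writeUTF) and shows one way unpaired surrogates could be replaced with U+FFFD.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class Utf8Sketch {

    /** Replaces every unpaired UTF-16 surrogate with U+FFFD. */
    static String replaceIllegalSurrogates(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c) && i + 1 < s.length()
                    && Character.isLowSurrogate(s.charAt(i + 1))) {
                sb.append(c).append(s.charAt(++i)); // valid pair: keep both halves
            } else if (Character.isSurrogate(c)) {
                sb.append('\uFFFD');                // lone surrogate: replace
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Byte length of the string in standard UTF-8. */
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Byte length in Java's modified UTF-8, excluding writeUTF's 2-byte length prefix. */
    static int modifiedUtf8Length(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size() - 2;
    }
}
```

For a supplementary character such as U+1D11E, standard UTF-8 uses one 4-byte sequence, while modified UTF-8 encodes each surrogate half separately, for 6 bytes in total.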
+Bug fixes
+    
+ 1. LUCENE-1134: Fixed BooleanQuery.rewrite to only optimize a single 
+    clause query if minNumShouldMatch<=0. (Shai Erera via Michael Busch)
+
+ 2. LUCENE-1169: Fixed bug in IndexSearcher.search(): searching with
+    a filter might miss some hits because scorer.skipTo() is called
+    without checking if the scorer is already at the right position.
+    scorer.skipTo(scorer.doc()) is not a NOOP, it behaves as 
+    scorer.next(). (Eks Dev, Michael Busch)
+
+ 3. LUCENE-1182: Added scorePayload to SimilarityDelegator (Andi Vajda via Grant Ingersoll)
+ 
+ 4. LUCENE-1213: MultiFieldQueryParser was ignoring slop in case
+    of a single field phrase. (Trejkaz via Doron Cohen)
+
+ 5. LUCENE-1228: IndexWriter.commit() was not updating the index version and as
+    a result IndexReader.reopen() failed to sense index changes. (Doron Cohen)
+
+ 6. LUCENE-1267: Added numDocs() and maxDoc() to IndexWriter;
+    deprecated docCount().  (Mike McCandless)
+
+ 7. LUCENE-1274: Added new prepareCommit() method to IndexWriter,
+    which does phase 1 of a 2-phase commit (commit() does phase 2).
+    This is needed when you want to update an index as part of a
+    transaction involving external resources (e.g. a database).  Also
+    deprecated abort(), renaming it to rollback().  (Mike McCandless)
+
+ 8. LUCENE-1003: Stop RussianAnalyzer from removing numbers.
+    (TUSUR OpenTeam, Dmitry Lihachev via Otis Gospodnetic)
+
+ 9. LUCENE-1152: SpellChecker fix around clearIndex and indexDictionary
+    methods, plus removal of IndexReader reference.
+    (Naveen Belkale via Otis Gospodnetic)
+
+10. LUCENE-1046: Removed dead code in SpellChecker
+    (Daniel Naber via Otis Gospodnetic)
+	
+11. LUCENE-1189: Fixed the QueryParser to handle escaped characters within 
+    quoted terms correctly. (Tomer Gabel via Michael Busch)
+
+12. LUCENE-1299: Fixed NPE in SpellChecker when IndexReader is not null and field is null (Grant Ingersoll)
+
+13. LUCENE-1303: Fixed BoostingTermQuery's explanation to be marked as a Match 
+    depending only upon the non-payload score part, regardless of the effect of 
+    the payload on the score. Prior to this, score of a query containing a BTQ 
+    differed from its explanation. (Doron Cohen)
+    
+14. LUCENE-1310: Fixed SloppyPhraseScorer to work also for terms repeating more 
+    than twice in the query. (Doron Cohen)
+
+15. LUCENE-1351: ISOLatin1AccentFilter now cleans additional ligatures (Cedrik Lime via Grant Ingersoll)
+
+16. LUCENE-1383: Work around a nasty "leak" in Java's builtin
+    ThreadLocal, to prevent Lucene from causing unexpected
+    OutOfMemoryError in certain situations (notably J2EE
+    applications).  (Chris Lu via Mike McCandless)
+
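The pitfall behind LUCENE-1169 is easier to see with a toy scorer. The sketch below is not Lucene code: SkipToSketch, ToyScorer and advanceIfBehind are hypothetical names, and the skipTo loop is only assumed to resemble the next()/skipTo()/doc() contract the entry describes. It shows why calling skipTo with the current document advances past it, and one possible guard.

```java
class SkipToSketch {

    /** Toy scorer over sorted doc ids. */
    static final class ToyScorer {
        private final int[] docs;
        private int pos = -1;

        ToyScorer(int... sortedDocs) {
            this.docs = sortedDocs;
        }

        /** Moves to the next doc; returns false when exhausted. */
        boolean next() {
            return ++pos < docs.length;
        }

        /** Current doc id; only valid after next() or skipTo() returned true. */
        int doc() {
            return docs[pos];
        }

        /** Always advances at least once, so skipTo(doc()) behaves like next(). */
        boolean skipTo(int target) {
            do {
                if (!next()) {
                    return false;
                }
            } while (doc() < target);
            return true;
        }
    }

    /** Guarded variant: skip only when the scorer is still behind the target. */
    static boolean advanceIfBehind(ToyScorer scorer, int target) {
        return scorer.doc() >= target || scorer.skipTo(target);
    }
}
```

With doc ids 3, 7, 9 and the scorer positioned on 3, an unguarded skipTo(3) lands on 7 and the hit on 3 is lost, whereas the guarded call leaves the scorer on 3.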
+New features
+
+ 1. LUCENE-1137: Added Token.set/getFlags() accessors for passing more information about a Token through the analysis
+    process.  The flag is not indexed/stored and is thus only used by analysis.
+
+ 2. LUCENE-1147: Add -segment option to CheckIndex tool so you can
+    check only a specific segment or segments in your index.  (Mike
+    McCandless)
+
+ 3. LUCENE-1045: Reopened this issue to add support for short and bytes. 
+ 
+ 4. LUCENE-584: Added new data structures to o.a.l.util, such as 
+    OpenBitSet and SortedVIntList. These extend DocIdSet and can 
+    directly be used for Filters with the new Filter API. Also changed
+    the core Filters to use OpenBitSet instead of java.util.BitSet.
+    (Paul Elschot, Michael Busch)
+
+ 5. LUCENE-494: Added QueryAutoStopWordAnalyzer to allow for the automatic removal, from a query, of frequently occurring terms.
+    This Analyzer is not intended for use during indexing. (Mark Harwood via Grant Ingersoll)
+
+ 6. LUCENE-1044: Change Lucene to properly "sync" files after
+    committing, to ensure on a machine or OS crash or power cut, even
+    with cached writes, the index remains consistent.  Also added
+    explicit commit() method to IndexWriter to force a commit without
+    having to close.  (Mike McCandless)
+    
+ 7. LUCENE-997: Add search timeout (partial) support.
+    A TimeLimitedCollector was added to allow limiting search time.
+    It is a partial solution since timeout is checked only when 
+    collecting a hit, and therefore a search for rare words in a 
+    huge index might not stop within the specified time.
+    (Sean Timm via Doron Cohen) 
+
+ 8. LUCENE-1184: Allow SnapshotDeletionPolicy to be re-used across
+    close/re-open of IndexWriter while still protecting an open
+    snapshot (Tim Brennan via Mike McCandless)
+
+ 9. LUCENE-1194: Added IndexWriter.deleteDocuments(Query) to delete
+    documents matching the specified query.  Also added static unlock
+    and isLocked methods (deprecating the ones in IndexReader).  (Mike
+    McCandless)
+
+10. LUCENE-1201: Add IndexReader.getIndexCommit() method. (Tim Brennan
+    via Mike McCandless)
+
+11. LUCENE-550:  Added InstantiatedIndex implementation.  Experimental 
+    Index store similar to MemoryIndex but allows for multiple documents 
+    in memory.  (Karl Wettin via Grant Ingersoll)
+
+12. LUCENE-400: Added word based n-gram filter (in contrib/analyzers) called ShingleFilter and an Analyzer wrapper
+    that wraps another Analyzer's token stream with a ShingleFilter (Sebastian Kirsch, Steve Rowe via Grant Ingersoll) 
+
+13. LUCENE-1166: Decomposition tokenfilter for languages like German and Swedish (Thomas Peuss via Grant Ingersoll)
+
+14. LUCENE-1187: ChainedFilter and BooleanFilter now work with new Filter API
+    and DocIdSetIterator-based filters. Backwards-compatibility with old 
+    BitSet-based filters is ensured. (Paul Elschot via Michael Busch)
+
+15. LUCENE-1295: Added new method to MoreLikeThis for retrieving interesting terms and made retrieveTerms(int) public. (Grant Ingersoll)
+
+16. LUCENE-1298: MoreLikeThis can now accept a custom Similarity (Grant Ingersoll)
+
+17. LUCENE-1297: Allow other string distance measures for the SpellChecker
+    (Thomas Morton via Otis Gospodnetic)
+
+18. LUCENE-1001: Provide access to Payloads via Spans.  All existing Span Query implementations in Lucene implement this. (Mark Miller, Grant Ingersoll)
+
+19. LUCENE-1354: Provide programmatic access to CheckIndex (Grant Ingersoll, Mike McCandless)
+
+20. LUCENE-1279: Add support for Collators to RangeFilter/Query and Query Parser.  (Steve Rowe via Grant Ingersoll) 
+
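The "partial" nature of the timeout in LUCENE-997 follows from where the check happens. The sketch below is not the TimeLimitedCollector API; TimeLimitSketch, TimeLimitedHits and TimeExceeded are hypothetical names, and the injectable clock is an assumption made so the behavior can be exercised deterministically. The deadline is consulted only inside collect(), so a search that rarely produces hits would rarely reach the check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

class TimeLimitSketch {

    /** Thrown when the time budget is exhausted. */
    static final class TimeExceeded extends RuntimeException {
        TimeExceeded(String message) {
            super(message);
        }
    }

    /** Hypothetical collector: the deadline is checked only when a hit arrives. */
    static final class TimeLimitedHits {
        final List<Integer> docs = new ArrayList<>();
        private final LongSupplier clockMillis;
        private final long deadline;

        TimeLimitedHits(LongSupplier clockMillis, long budgetMillis) {
            this.clockMillis = clockMillis;
            this.deadline = clockMillis.getAsLong() + budgetMillis;
        }

        void collect(int doc) {
            // No hit, no check: time spent between hits goes unnoticed.
            if (clockMillis.getAsLong() > deadline) {
                throw new TimeExceeded("budget exceeded before collecting doc " + doc);
            }
            docs.add(doc);
        }
    }
}
```

In real use the clock could be System::currentTimeMillis; a fake clock is passed here only to make the timing controllable.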
+Optimizations
+
+ 1. LUCENE-705: When building a compound file, use
+    RandomAccessFile.setLength() to tell the OS/filesystem to
+    pre-allocate space for the file.  This may improve fragmentation
+    in how the CFS file is stored, and allows us to detect an upcoming
+    disk full situation before actually filling up the disk.  (Mike
+    McCandless)
+
+ 2. LUCENE-1120: Speed up merging of term vectors by bulk-copying the
+    raw bytes for each contiguous range of non-deleted documents.
+    (Mike McCandless)
+	
+ 3. LUCENE-1185: Avoid checking if the TermBuffer 'scratch' in 
+    SegmentTermEnum is null for every call of scanTo().
+    (Christian Kohlschuetter via Michael Busch)
+
+ 4. LUCENE-1217: Internal to Field.java, use isBinary instead of
+    runtime type checking for possible speedup of binaryValue().
+    (Eks Dev via Mike McCandless)
+
+ 5. LUCENE-1183: Optimized TRStringDistance class (in contrib/spell) that uses
+    less memory than the previous version.  (Cédrik LIME via Otis Gospodnetic)
+
+ 6. LUCENE-1195: Improve term lookup performance by adding a LRU cache to the
+    TermInfosReader. In performance experiments the speedup was about 25% on 
+    average on mid-size indexes with ~500,000 documents for queries with 3 
+    terms and about 7% on larger indexes with ~4.3M documents. (Michael Busch)
+
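An LRU cache of the kind mentioned in LUCENE-1195 can be sketched with the JDK's LinkedHashMap. This is a generic illustration, not the cache used inside TermInfosReader; LruCacheSketch is a hypothetical name and the capacity is arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache: the least recently accessed entry is evicted first. */
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruCacheSketch(int capacity) {
        super(16, 0.75f, true); // accessOrder = true: get() refreshes an entry
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each put; evict once the cache grows past its capacity.
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting "a" and "b", reading "a", then inserting "c" evicts "b", because "a" was touched more recently. LinkedHashMap is not thread-safe, so a shared cache would need external synchronization.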
+Documentation
+
+  1. LUCENE-1236:  Added some clarifying remarks to EdgeNGram*.java (Hiroaki Kawai via Grant Ingersoll)
+  
+  2. LUCENE-1157 and LUCENE-1256: HTML changes log, created automatically 
+     from CHANGES.txt. This HTML file is currently visible only via the developers page.
+     (Steven Rowe via Doron Cohen)
+
+  3. LUCENE-1349: Fieldable can now be changed without breaking backward compatibility rules (within reason; see the note at
+  the top of this file and also on Fieldable.java).  (Grant Ingersoll)
+  
+  4. LUCENE-1873: Update documentation to reflect current Contrib area status.
+     (Steven Rowe, Mark Miller)
+
+Build
+
+  1. LUCENE-1153: Added JUnit JAR to new lib directory.  Updated build to rely on local JUnit instead of ANT/lib.
+  
+  2. LUCENE-1202: Small fixes to the way Clover is used to work better
+     with contribs.  Of particular note: a single clover db is used
+     regardless of whether tests are run globally or in the specific
+     contrib directories. 
+     
+  3. LUCENE-1353: Javacc target in contrib/miscellaneous for 
+     generating the precedence query parser. 
+
+Test Cases
+
+ 1. LUCENE-1238: Fixed intermittent failures of TestTimeLimitedCollector.testTimeoutMultiThreaded.
+    Within this fix, a "greedy" flag was added to TimeLimitedCollector, to allow the wrapped
+    collector to also collect the last doc after the allowed time has passed. (Doron Cohen)
+	
+ 2. LUCENE-1348: relax TestTimeLimitedCollector to not fail due to 
+    timeout exceeded (just because test machine is very busy).
+	
+======================= Release 2.3.2 2008-05-05 =======================
+
+Bug fixes
+
+ 1. LUCENE-1191: On hitting OutOfMemoryError in any index-modifying
+    methods in IndexWriter, do not commit any further changes to the
+    index to prevent risk of possible corruption.  (Mike McCandless)
+
+ 2. LUCENE-1197: Fixed issue whereby IndexWriter would flush by RAM
+    too early when TermVectors were in use.  (Mike McCandless)
+
+ 3. LUCENE-1198: Don't corrupt index if an exception happens inside
+    DocumentsWriter.init (Mike McCandless)
+
+ 4. LUCENE-1199: Added defensive check for null indexReader before
+    calling close in IndexModifier.close() (Mike McCandless)
+
+ 5. LUCENE-1200: Fix rare deadlock case in addIndexes* when
+    ConcurrentMergeScheduler is in use (Mike McCandless)
+
+ 6. LUCENE-1208: Fix deadlock case on hitting an exception while
+    processing a document that had triggered a flush (Mike McCandless)
+
+ 7. LUCENE-1210: Fix deadlock case on hitting an exception while
+    starting a merge when using ConcurrentMergeScheduler (Mike McCandless)
+
+ 8. LUCENE-1222: Fix IndexWriter.doAfterFlush to always be called on
+    flush (Mark Ferguson via Mike McCandless)
+	
+ 9. LUCENE-1226: Fixed IndexWriter.addIndexes(IndexReader[]) to commit
+    successfully created compound files. (Michael Busch)
+
+10. LUCENE-1150: Re-expose StandardTokenizer's constants publicly;
+    this was accidentally lost with LUCENE-966.  (Nicolas Lalevée via
+    Mike McCandless)
+
+11. LUCENE-1262: Fixed bug in BufferedIndexInput.refill whereby on
+    hitting an exception in readInternal, the buffer is incorrectly
+    filled with stale bytes such that subsequent calls to readByte()
+    return incorrect results.  (Trejkaz via Mike McCandless)
+
+12. LUCENE-1270: Fixed intermittent case where IndexWriter.close()
+    would hang after IndexWriter.addIndexesNoOptimize had been
+    called.  (Stu Hood via Mike McCandless)
+	
+Build
+
+ 1. LUCENE-1230: Include *pom.xml* in source release files. (Michael Busch)
+
+ 
+======================= Release 2.3.1 2008-02-22 =======================
+
+Bug fixes
+    
+ 1. LUCENE-1168: Fixed corruption cases when autoCommit=false and
+    documents have mixed term vectors (Suresh Guvvala via Mike
+    McCandless).
+
+ 2. LUCENE-1171: Fixed some cases where OOM errors could cause
+    deadlock in IndexWriter (Mike McCandless).
+
+ 3. LUCENE-1173: Fixed corruption case when autoCommit=false and bulk
+    merging of stored fields is used (Yonik via Mike McCandless).
+
+ 4. LUCENE-1163: Fixed bug in CharArraySet.contains(char[] buffer, int
+    offset, int len) that was ignoring offset and thus giving the
+    wrong answer.  (Thomas Peuss via Mike McCandless)
+	
+ 5. LUCENE-1177: Fix rare case where IndexWriter.optimize might do too
+    many merges at the end.  (Mike McCandless)
+	
+ 6. LUCENE-1176: Fix corruption case when documents with no term
+    vector fields are added before documents with term vector fields.
+    (Mike McCandless)
+	
+ 7. LUCENE-1179: Fixed assert statement that was incorrectly
+    preventing Fields with empty-string field name from working.
+    (Sergey Kabashnyuk via Mike McCandless)
+
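The class of bug fixed in LUCENE-1163, a lookup that ignores its offset argument, can be shown with a minimal region comparison. This is not the CharArraySet code; RegionMatchSketch and regionEquals are hypothetical names.

```java
class RegionMatchSketch {

    /** True if buffer[offset .. offset+len) holds exactly the characters of word. */
    static boolean regionEquals(char[] buffer, int offset, int len, String word) {
        if (len != word.length()) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            // Indexing buffer[i] here instead of buffer[offset + i] would
            // silently compare the wrong region.
            if (buffer[offset + i] != word.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

For the buffer "xxthe", the region starting at offset 2 with length 3 matches "the", while the region starting at offset 0 does not.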
+======================= Release 2.3.0 2008-01-21 =======================
+
+Changes in runtime behavior
+
+ 1. LUCENE-994: Defaults for IndexWriter have been changed to maximize
+    out-of-the-box indexing speed.  First, IndexWriter now flushes by
+    RAM usage (16 MB by default) instead of a fixed doc count (call
+    IndexWriter.setMaxBufferedDocs to get backwards compatible
+    behavior).  Second, ConcurrentMergeScheduler is used to run merges
+    using background threads (call IndexWriter.setMergeScheduler(new
+    SerialMergeScheduler()) to get backwards compatible behavior).
+    Third, merges are chosen based on size in bytes of each segment
+    rather than document count of each segment (call
+    IndexWriter.setMergePolicy(new LogDocMergePolicy()) to get
+    backwards compatible behavior).
+
+    NOTE: users of ParallelReader must change back all of these
+    defaults in order to ensure the docIDs "align" across all parallel
+    indices.
+
+    (Mike McCandless)
+
+ 2. LUCENE-1045: SortField.AUTO didn't work with long. When detecting
+    the field type for sorting automatically, numbers used to be
+    interpreted as int, then as float, if parsing the number as an int
+    failed. Now the detection checks for int, then for long,
+    then for float. (Daniel Naber)
+
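The detection order described in LUCENE-1045 (int, then long, then float) can be sketched as a chain of parse attempts. This is not the SortField.AUTO implementation; AutoTypeSketch, Kind and detect are hypothetical names, and the final STRING fallback is an assumption added to make the sketch total.

```java
class AutoTypeSketch {

    enum Kind { INT, LONG, FLOAT, STRING }

    /** Tries the narrowest numeric type first and widens only on failure. */
    static Kind detect(String text) {
        try {
            Integer.parseInt(text);
            return Kind.INT;
        } catch (NumberFormatException e) {
            // not an int, fall through
        }
        try {
            Long.parseLong(text);
            return Kind.LONG;
        } catch (NumberFormatException e) {
            // not a long, fall through
        }
        try {
            Float.parseFloat(text);
            return Kind.FLOAT;
        } catch (NumberFormatException e) {
            // not numeric at all
        }
        return Kind.STRING; // assumed fallback, not stated in the entry
    }
}
```

Without the long step, a value such as "3000000000" fails the int parse and would be treated as a float, which is the behavior the entry describes as the old bug.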
+API Changes
+
+ 1. LUCENE-843: Added IndexWriter.setRAMBufferSizeMB(...) to have
+    IndexWriter flush whenever the buffered documents are using more
+    than the specified amount of RAM.  Also added new APIs to Token
+    that allow one to set a char[] plus offset and length to specify a
+    token (to avoid creating a new String() for each Token).  (Mike
+    McCandless)
+
+ 2. LUCENE-963: Add setters to Field to allow for re-using a single
+    Field instance during indexing.  This is a sizable performance
+    gain, especially for small documents.  (Mike McCandless)
+
+ 3. LUCENE-969: Add new APIs to Token, TokenStream and Analyzer to
+    permit re-using of Token and TokenStream instances during
+    indexing.  Changed Token to use a char[] as the store for the
+    termText instead of String.  This gives faster tokenization
+    performance (~10-15%).  (Mike McCandless)
+
+ 4. LUCENE-847: Factored MergePolicy, which determines which merges
+    should take place and when, as well as MergeScheduler, which
+    determines when the selected merges should actually run, out of
+    IndexWriter.  The default merge policy is now
+    LogByteSizeMergePolicy (see LUCENE-845) and the default merge
+    scheduler is now ConcurrentMergeScheduler (see
+    LUCENE-870). (Steven Parkes via Mike McCandless)
+
+ 5. LUCENE-1052: Add IndexReader.setTermInfosIndexDivisor(int) method
+    that allows you to reduce memory usage of the termInfos by further
+    sub-sampling (over the termIndexInterval that was used during
+    indexing) which terms are loaded into memory.  (Chuck Williams,
+    Doug Cutting via Mike McCandless)
+    
+ 6. LUCENE-743: Add IndexReader.reopen() method that re-opens an
+    existing IndexReader (see New features -> 8.) (Michael Busch)
+
+ 7. LUCENE-1062: Add setData(byte[] data), 
+    setData(byte[] data, int offset, int length), getData(), getOffset()
+    and clone() methods to o.a.l.index.Payload. Also add the field name 
+    as arg to Similarity.scorePayload(). (Michael Busch)
+
+ 8. LUCENE-982: Add IndexWriter.optimize(int maxNumSegments) method to
+    "partially optimize" an index down to maxNumSegments segments.
+    (Mike McCandless)
+
+ 9. LUCENE-1080: Changed Token.DEFAULT_TYPE to be public.
+
+10. LUCENE-1064: Changed TopDocs constructor to be public. 
+     (Shai Erera via Michael Busch)
+
+11. LUCENE-1079: DocValues cleanup: constructor now has no params,
+    and getInnerArray() now throws UnsupportedOperationException (Doron Cohen)
+
+12. LUCENE-1089: Added PriorityQueue.insertWithOverflow, which returns
+    the Object (if any) that was bumped from the queue to allow
+    re-use.  (Shai Erera via Mike McCandless)
+    
+13. LUCENE-1101: Token reuse 'contract' (defined in LUCENE-969)
+    modified so it is token producer's responsibility
+    to call Token.clear(). (Doron Cohen)   
+
+14. LUCENE-1118: Changed StandardAnalyzer to skip too-long (default >
+    255 characters) tokens.  You can increase this limit by calling
+    StandardAnalyzer.setMaxTokenLength(...).  (Michael McCandless)
+
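The idea behind PriorityQueue.insertWithOverflow in LUCENE-1089 can be sketched on top of java.util.PriorityQueue. This is not Lucene's PriorityQueue; BoundedQueueSketch is a hypothetical name, and returning the new element itself when it does not qualify is an assumption of this sketch. The point is that the caller gets back whichever object was dropped and can reuse it instead of allocating a new one.

```java
import java.util.PriorityQueue;

/** Keeps the maxSize largest elements; the least one sits on top of the heap. */
class BoundedQueueSketch<T extends Comparable<T>> {
    private final PriorityQueue<T> heap = new PriorityQueue<>();
    private final int maxSize;

    BoundedQueueSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    /** Returns the element dropped by this insert, or null if nothing was dropped. */
    T insertWithOverflow(T element) {
        if (heap.size() < maxSize) {
            heap.add(element);
            return null;                 // room left, nothing bumped
        }
        T least = heap.peek();
        if (least != null && element.compareTo(least) > 0) {
            heap.poll();
            heap.add(element);
            return least;                // previous least element was bumped
        }
        return element;                  // new element did not make the cut
    }

    int size() {
        return heap.size();
    }
}
```

With a maximum size of 2, inserting 5 and 3 returns null both times, inserting 4 returns the bumped 3, and inserting 1 returns 1 itself because it is smaller than everything already kept.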
+
+Bug fixes
+
+ 1. LUCENE-933: QueryParser fixed to not produce empty sub 
+    BooleanQueries "()" even if the Analyzer produced no 
+    tokens for input. (Doron Cohen)
+
+ 2. LUCENE-955: Fixed SegmentTermPositions to work correctly with the
+    first term in the dictionary. (Michael Busch)
+
+ 3. LUCENE-951: Fixed NullPointerException in MultiLevelSkipListReader
+    that was thrown after a call of TermPositions.seek(). 
+    (Rich Johnson via Michael Busch)
+    
+ 4. LUCENE-938: Fixed cases where an unhandled exception in
+    IndexWriter's methods could cause deletes to be lost.
+    (Steven Parkes via Mike McCandless)
+      
+ 5. LUCENE-962: Fixed case where an unhandled exception in
+    IndexWriter.addDocument or IndexWriter.updateDocument could cause
+    unreferenced files in the index to not be deleted
+    (Steven Parkes via Mike McCandless)
+  
+ 6. LUCENE-957: RAMDirectory fixed to properly handle directories
+    larger than Integer.MAX_VALUE. (Doron Cohen)
+
+ 7. LUCENE-781: MultiReader fixed to not throw NPE if isCurrent(),
+    isOptimized() or getVersion() is called. Separated MultiReader
+    into two classes: MultiSegmentReader extends IndexReader, is
+    package-protected and is created automatically by IndexReader.open()
+    in case the index has multiple segments. The public MultiReader 
+    now extends MultiSegmentReader and is intended to be used by users
+    who want to add their own subreaders. (Daniel Naber, Michael Busch)
+
+ 8. LUCENE-970: FilterIndexReader now implements isOptimized().
+    Previously, a call to isOptimized() would throw an NPE. (Michael Busch)
+
+ 9. LUCENE-832: ParallelReader fixed to not throw NPE if isCurrent(),
+    isOptimized() or getVersion() is called. (Michael Busch)
+      
+10. LUCENE-948: Fix FNFE exception caused by stale NFS client
+    directory listing caches when writers on different machines are
+    sharing an index over NFS and using a custom deletion policy (Mike
+    McCandless)
+
+11. LUCENE-978: Ensure TermInfosReader, FieldsReader, and FieldsReader
+    close any streams they had opened if an exception is hit in the
+    constructor.  (Ning Li via Mike McCandless)
+
+12. LUCENE-985: If an extremely long term is in a doc (> 16383 chars),

[... 2220 lines stripped ...]

