lucene-commits mailing list archives

From uschind...@apache.org
Subject svn commit: r1379946 - /lucene/dev/trunk/lucene/MIGRATE.txt
Date Sun, 02 Sep 2012 11:36:22 GMT
Author: uschindler
Date: Sun Sep  2 11:36:22 2012
New Revision: 1379946

URL: http://svn.apache.org/viewvc?rev=1379946&view=rev
Log:
Lucene 5.0 currently needs no migration guide. A new one will start with LUCENE-3312.

Modified:
    lucene/dev/trunk/lucene/MIGRATE.txt

Modified: lucene/dev/trunk/lucene/MIGRATE.txt
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/lucene/MIGRATE.txt?rev=1379946&r1=1379945&r2=1379946&view=diff
==============================================================================
--- lucene/dev/trunk/lucene/MIGRATE.txt (original)
+++ lucene/dev/trunk/lucene/MIGRATE.txt Sun Sep  2 11:36:22 2012
@@ -1,633 +1,3 @@
 # Apache Lucene Migration Guide
 
-## Four-dimensional enumerations
-
-Flexible indexing changed the low level fields/terms/docs/positions
-enumeration APIs.  Here are the major changes:
-
-  * Terms are now binary in nature (arbitrary byte[]), represented
-    by the BytesRef class (which provides an offset + length "slice"
-    into an existing byte[]).
-
-  * Fields are separately enumerated (Fields.iterator()) from the terms
-    within each field (TermsEnum).  So instead of this:
-
-        TermEnum termsEnum = ...;
-        while(termsEnum.next()) {
-          Term t = termsEnum.term();
-          System.out.println("field=" + t.field() + "; text=" + t.text());
-        }
-
-    Do this:
-
-        for(String field : fields) {
-          Terms terms = fields.terms(field);
-          TermsEnum termsEnum = terms.iterator();
-          BytesRef text;
-          while((text = termsEnum.next()) != null) {
-            System.out.println("field=" + field + "; text=" + text.utf8ToString());
-          }
-        }
-
-  * TermDocs is renamed to DocsEnum.  Instead of this:
-
-        while(td.next()) {
-          int doc = td.doc();
-          ...
-        }
-
-    do this:
-
-        int doc;
-        while((doc = td.nextDoc()) != DocsEnum.NO_MORE_DOCS) {
-          ...
-        }
-
-    Instead of this:
-    
-        if (td.skipTo(target)) {
-          int doc = td.doc();
-          ...
-        }
-
-    do this:
-    
-        if ((doc = td.advance(target)) != DocsEnum.NO_MORE_DOCS) {
-          ...
-        }
-
-  * TermPositions is renamed to DocsAndPositionsEnum, and no longer
-    extends the docs only enumerator (DocsEnum).
-
-  * Deleted docs are no longer implicitly filtered from
-    docs/positions enums.  Instead, you pass a Bits instance
-    (docs whose bit is unset are skipped) when obtaining the enums.
-    Also, you can now ask a reader for its deleted docs.
-
-  * The docs/positions enums cannot seek to a term.  Instead,
-    TermsEnum is able to seek, and then you request the
-    docs/positions enum from that TermsEnum.
-
-  * TermsEnum's seek method returns more information.  So instead of
-    this:
-
-        Term t;
-        TermEnum termEnum = reader.terms(t);
-        if (t.equals(termEnum.term())) {
-          ...
-        }
-
-    do this:
-
-        TermsEnum termsEnum = ...;
-        BytesRef text;
-        if (termsEnum.seek(text) == TermsEnum.SeekStatus.FOUND) {
-          ...
-        }
-
-    SeekStatus also contains END (enumerator is done) and NOT_FOUND
-    (term was not found but enumerator is now positioned to the next
-    term).
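-
-    For illustration (reusing the termsEnum and text variables from the
-    snippet above, not a new API), all three states can be handled like
-    this:
-
-        switch(termsEnum.seek(text)) {
-          case FOUND:
-            // exact term found; enum is positioned on it
-            break;
-          case NOT_FOUND:
-            // enum is now positioned on the next term after text
-            BytesRef nextTerm = termsEnum.term();
-            break;
-          case END:
-            // no term at or after text; enum is exhausted
-            break;
-        }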
-
-  * TermsEnum has an ord() method, returning the long numeric
-    ordinal (ie, first term is 0, next is 1, and so on) for the term
-    it's currently positioned on.  There is also a corresponding seek(long
-    ord) method.  Note that these methods are optional; in
-    particular the MultiFields TermsEnum does not implement them.
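-
-    For illustration (a sketch; this assumes a TermsEnum that actually
-    supports ords, otherwise these calls may throw
-    UnsupportedOperationException):
-
-        long ord = termsEnum.ord();   // ordinal of the current term
-        termsEnum.seek(ord);          // later: reposition by ordinal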
-
-
-  * How you obtain the enums has changed.  The primary entry point is
-    the Fields class.  If you know your reader is a single segment
-    reader, do this:
-
-        Fields fields = reader.fields();
-        if (fields != null) {
-          ...
-        }
-
-    If the reader might be multi-segment, you must do this:
-    
-        Fields fields = MultiFields.getFields(reader);
-        if (fields != null) {
-          ...
-        }
-  
-    The fields may be null (eg if the reader has no fields).
-
-    Note that the MultiFields approach entails a performance hit on
-    MultiReaders, as it must merge terms/docs/positions on the fly. It's
-    generally better to instead get the sequential readers (use
-    oal.util.ReaderUtil) and then step through those readers yourself,
-    if you can (this is how Lucene drives searches).
-
-    If you pass a SegmentReader to MultiFields.fields it will simply
-    return reader.fields(), so there is no performance hit in that
-    case.
-
-    Once you have a non-null Fields you can do this:
-
-        Terms terms = fields.terms("field");
-        if (terms != null) {
-          ...
-        }
-
-    The terms may be null (eg if the field does not exist).
-
-    Once you have a non-null terms you can get an enum like this:
-
-        TermsEnum termsEnum = terms.iterator();
-
-    The returned TermsEnum will not be null.
-
-    You can then .next() through the TermsEnum, or seek.  If you want a
-    DocsEnum, do this:
-
-        Bits liveDocs = reader.getLiveDocs();
-        DocsEnum docsEnum = null;
-
-        docsEnum = termsEnum.docs(liveDocs, docsEnum, needsFreqs);
-
-    You can pass in a prior DocsEnum and it will be reused if possible.
-
-    Likewise for DocsAndPositionsEnum.
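-
-    For example (a sketch; the exact docsAndPositions signature changed
-    a few times during 4.0 development, and it returns null if the
-    field did not index positions):
-
-        DocsAndPositionsEnum postings = termsEnum.docsAndPositions(liveDocs, null);
-        if (postings != null) {
-          int doc;
-          while((doc = postings.nextDoc()) != DocsEnum.NO_MORE_DOCS) {
-            int freq = postings.freq();
-            for(int i = 0; i < freq; i++) {
-              int position = postings.nextPosition();
-            }
-          }
-        }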
-
-    IndexReader has several sugar methods (which just go through the
-    above steps, under the hood).  Instead of:
-
-        Term t;
-        TermDocs termDocs = reader.termDocs();
-        termDocs.seek(t);
-
-    do this:
-
-        String field;
-        BytesRef text;
-        DocsEnum docsEnum = reader.termDocsEnum(reader.getLiveDocs(), field, text, needsFreqs);
-
-    Likewise for DocsAndPositionsEnum.
-
-## LUCENE-2380: FieldCache.getStrings/Index --> FieldCache.getDocTerms/Index
-
-  * The field values returned when sorting by SortField.STRING are now
-    BytesRef.  You can call value.utf8ToString() to convert back to
-    string, if necessary.
-
-  * In FieldCache, getStrings (returning String[]) has been replaced
-    with getTerms (returning a FieldCache.DocTerms instance).
-    DocTerms provides a getTerm method, taking a docID and a BytesRef
-    to fill (which must not be null), and it fills it in with the
-    reference to the bytes for that term.
-
-    If you had code like this before:
-
-        String[] values = FieldCache.DEFAULT.getStrings(reader, field);
-        ...
-        String aValue = values[docID];
-
-    you can do this instead:
-
-        DocTerms values = FieldCache.DEFAULT.getTerms(reader, field);
-        ...
-        BytesRef term = new BytesRef();
-        String aValue = values.getTerm(docID, term).utf8ToString();
-
-    Note however that it can be costly to convert to String, so it's
-    better to work directly with the BytesRef.
-
-  * Similarly, in FieldCache, getStringIndex (returning a StringIndex
-    instance, with direct arrays int[] order and String[] lookup) has
-    been replaced with getTermsIndex (returning a
-    FieldCache.DocTermsIndex instance).  DocTermsIndex provides the
-    getOrd(int docID) method to lookup the int order for a document,
-    lookup(int ord, BytesRef reuse) to lookup the term from a given
-    order, and the sugar method getTerm(int docID, BytesRef reuse)
-    which internally calls getOrd and then lookup.
-
-    If you had code like this before:
-
-        StringIndex idx = FieldCache.DEFAULT.getStringIndex(reader, field);
-        ...
-        int ord = idx.order[docID];
-        String aValue = idx.lookup[ord];
-
-    you can do this instead:
-
-        DocTermsIndex idx = FieldCache.DEFAULT.getTermsIndex(reader, field);
-        ...
-        int ord = idx.getOrd(docID);
-        BytesRef term = new BytesRef();
-        String aValue = idx.lookup(ord, term).utf8ToString();
-
-    Note however that it can be costly to convert to String, so it's
-    better to work directly with the BytesRef.
-
-    DocTermsIndex also has a getTermsEnum() method, which returns an
-    iterator (TermsEnum) over the term values in the index (ie,
-    iterates ord = 0..numOrd()-1).
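-
-    For example (a sketch; te behaves like any other TermsEnum):
-
-        TermsEnum te = idx.getTermsEnum();
-        BytesRef term;
-        while((term = te.next()) != null) {
-          // terms arrive in ord order, 0..numOrd()-1
-        }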
-
-  * StringComparatorLocale is now more CPU costly than it was before
-    (it was already very CPU costly since it does not compare using
-    indexed collation keys; use CollationKeyFilter for better
-    performance), since it converts BytesRef -> String on the fly.
-    Also, the field values returned when sorting by SortField.STRING
-    are now BytesRef.
-
-  * FieldComparator.StringOrdValComparator has been renamed to
-    TermOrdValComparator, and now uses BytesRef for its values.
-    Likewise for StringValComparator, renamed to TermValComparator.
-    This means when sorting by SortField.STRING or
-    SortField.STRING_VAL (or directly invoking these comparators) the
-    values returned in the FieldDoc.fields array will be BytesRef not
-    String.  You can call the .utf8ToString() method on the BytesRef
-    instances, if necessary.
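-
-    For example, reading a sort value from a FieldDoc now looks like
-    this (a sketch; topDocs is assumed to come from a search sorted by
-    SortField.STRING):
-
-        FieldDoc fd = (FieldDoc) topDocs.scoreDocs[0];
-        BytesRef sortValue = (BytesRef) fd.fields[0];
-        String s = sortValue == null ? null : sortValue.utf8ToString();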
-
-## LUCENE-2600: IndexReaders are now read-only
-
-  Instead of IndexReader.isDeleted, do this:
-
-      import org.apache.lucene.util.Bits;
-      import org.apache.lucene.index.MultiFields;
-
-      Bits liveDocs = MultiFields.getLiveDocs(indexReader);
-      // liveDocs is null if the reader has no deletions
-      if (liveDocs != null && !liveDocs.get(docID)) {
-        // document is deleted...
-      }
-    
-## LUCENE-2858, LUCENE-3733: IndexReader --> AtomicReader/CompositeReader/DirectoryReader refactoring
-
-The abstract class IndexReader has been refactored to expose only
-essential methods for accessing stored fields during display of search
-results. It is no longer possible to retrieve terms or postings data
-from the underlying index; not even deletions are visible anymore. You
-can still pass IndexReader as a constructor parameter to IndexSearcher
-and execute your searches; Lucene will automatically delegate
-procedures like query rewriting and document collection to its atomic
-subreaders.
-
-If you want to dive deeper into the index and write your own queries,
-take a closer look at the new abstract subclasses AtomicReader and
-CompositeReader:
-
-AtomicReader instances are now the only source of Terms, Postings,
-DocValues and FieldCache. Queries are forced to execute on AtomicReaders
-on a per-segment basis, and FieldCaches are keyed by AtomicReaders.
-
-Its counterpart CompositeReader exposes a utility method to retrieve
-its composites. But watch out: composites are not necessarily atomic.
-In addition to the added type safety, we also removed the notion of
-index commits and version numbers from the abstract IndexReader; the
-associations with IndexWriter were pulled into a specialized
-DirectoryReader. To open Directory-based indexes use
-DirectoryReader.open(); the corresponding method in IndexReader is now
-deprecated for easier migration. Only DirectoryReader supports commits,
-versions, and reopening with openIfChanged(). Terms, postings,
-docvalues, and norms can now only be retrieved through AtomicReader;
-DirectoryReader and MultiReader extend CompositeReader, offering only
-stored fields and access to the sub-readers (which may be composite or
-atomic).
-
-If you have more advanced code dealing with custom Filters, you might 
-have noticed another new class hierarchy in Lucene (see LUCENE-2831): 
-IndexReaderContext with corresponding Atomic-/CompositeReaderContext. 
-
-The move towards per-segment search in Lucene 2.9 exposed lots of custom
-Queries and Filters that couldn't handle it. For example, some Filter
-implementations expected that the IndexReader passed in was identical to
-the IndexReader passed to IndexSearcher, with all its advantages like
-absolute document IDs etc. Obviously this "paradigm shift" broke lots of
-applications, especially those that utilized cross-segment data
-structures (like Apache Solr).
-
-In Lucene 4.0, we introduce IndexReaderContext, a "searcher-private"
-reader hierarchy. During Query or Filter execution Lucene no longer
-passes raw readers down to Queries, Filters or Collectors; instead
-components are provided an AtomicReaderContext (essentially a hierarchy
-leaf) holding relative properties like the document basis in relation to
-the top-level reader. This allows Queries & Filters to build up logic
-based on document IDs, despite the per-segment orientation.
-
-There are still valid use cases where top-level readers, ie. "atomic
-views" on the index, are desirable. Say you want to iterate all terms
-of a complete index for auto-completion or faceting; Lucene provides
-utility wrappers like SlowCompositeReaderWrapper (LUCENE-2597) that
-emulate an AtomicReader. Note: using such "atomicity emulators" can
-cause serious slowdowns due to the need to merge terms, postings,
-DocValues, and FieldCache; use them with care!
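-
-For example (a sketch; directoryReader is any composite reader and
-"body" is a placeholder field name):
-
-    AtomicReader atomic = SlowCompositeReaderWrapper.wrap(directoryReader);
-    Terms terms = atomic.terms("body"); // merged on the fly; can be slow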
-
-## LUCENE-4306: getSequentialSubReaders(), ReaderUtil.Gather
-
-The method IndexReader#getSequentialSubReaders() was moved to CompositeReader
-(see LUCENE-2858, LUCENE-3733) and made protected. It is solely used by
-CompositeReader itself to build its reader tree. To get all atomic leaves
-of a reader, use IndexReader#leaves(), which also provides the doc base
-of each leaf. Readers that are already atomic return themselves as the
-single leaf with doc base 0. To emulate Lucene 3.x getSequentialSubReaders(),
-use getContext().children().
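-
-For example (a minimal sketch of walking the leaves):
-
-    for (AtomicReaderContext ctx : reader.leaves()) {
-      AtomicReader leaf = ctx.reader();
-      int docBase = ctx.docBase; // first docID of this leaf within the top-level reader
-    }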
-
-## LUCENE-2413,LUCENE-3396: Analyzer package changes
-
-Lucene's core and contrib analyzers, along with Solr's analyzers,
-were consolidated into lucene/analysis. During the refactoring some
-package names have changed, and ReusableAnalyzerBase was renamed to
-Analyzer:
-
-  - o.a.l.analysis.KeywordAnalyzer -> o.a.l.analysis.core.KeywordAnalyzer
-  - o.a.l.analysis.KeywordTokenizer -> o.a.l.analysis.core.KeywordTokenizer
-  - o.a.l.analysis.LetterTokenizer -> o.a.l.analysis.core.LetterTokenizer
-  - o.a.l.analysis.LowerCaseFilter -> o.a.l.analysis.core.LowerCaseFilter
-  - o.a.l.analysis.LowerCaseTokenizer -> o.a.l.analysis.core.LowerCaseTokenizer
-  - o.a.l.analysis.SimpleAnalyzer -> o.a.l.analysis.core.SimpleAnalyzer
-  - o.a.l.analysis.StopAnalyzer -> o.a.l.analysis.core.StopAnalyzer
-  - o.a.l.analysis.StopFilter -> o.a.l.analysis.core.StopFilter
-  - o.a.l.analysis.WhitespaceAnalyzer -> o.a.l.analysis.core.WhitespaceAnalyzer
-  - o.a.l.analysis.WhitespaceTokenizer -> o.a.l.analysis.core.WhitespaceTokenizer
-  - o.a.l.analysis.PorterStemFilter -> o.a.l.analysis.en.PorterStemFilter
-  - o.a.l.analysis.ASCIIFoldingFilter -> o.a.l.analysis.miscellaneous.ASCIIFoldingFilter
-  - o.a.l.analysis.ISOLatin1AccentFilter -> o.a.l.analysis.miscellaneous.ISOLatin1AccentFilter
-  - o.a.l.analysis.KeywordMarkerFilter -> o.a.l.analysis.miscellaneous.KeywordMarkerFilter
-  - o.a.l.analysis.LengthFilter -> o.a.l.analysis.miscellaneous.LengthFilter
-  - o.a.l.analysis.PerFieldAnalyzerWrapper -> o.a.l.analysis.miscellaneous.PerFieldAnalyzerWrapper
-  - o.a.l.analysis.TeeSinkTokenFilter -> o.a.l.analysis.sinks.TeeSinkTokenFilter
-  - o.a.l.analysis.CharFilter -> o.a.l.analysis.charfilter.CharFilter
-  - o.a.l.analysis.BaseCharFilter -> o.a.l.analysis.charfilter.BaseCharFilter
-  - o.a.l.analysis.MappingCharFilter -> o.a.l.analysis.charfilter.MappingCharFilter
-  - o.a.l.analysis.NormalizeCharMap -> o.a.l.analysis.charfilter.NormalizeCharMap
-  - o.a.l.analysis.CharArraySet -> o.a.l.analysis.util.CharArraySet
-  - o.a.l.analysis.CharArrayMap -> o.a.l.analysis.util.CharArrayMap
-  - o.a.l.analysis.ReusableAnalyzerBase -> o.a.l.analysis.Analyzer
-  - o.a.l.analysis.StopwordAnalyzerBase -> o.a.l.analysis.util.StopwordAnalyzerBase
-  - o.a.l.analysis.WordListLoader -> o.a.l.analysis.util.WordListLoader
-  - o.a.l.analysis.CharTokenizer -> o.a.l.analysis.util.CharTokenizer
-  - o.a.l.util.CharacterUtils -> o.a.l.analysis.util.CharacterUtils
-
-## LUCENE-2514: Collators
-
-The option to use a Collator's order (instead of binary order) for
-sorting and range queries has been moved to lucene/queries.
-The Collated TermRangeQuery/Filter has been moved to SlowCollatedTermRangeQuery/Filter, 
-and the collated sorting has been moved to SlowCollatedStringComparator.
-
-Note: this functionality isn't very scalable and if you are using it, consider 
-indexing collation keys with the collation support in the analysis module instead.
-
-To perform collated range queries, use a suitable collating analyzer:
-CollationKeyAnalyzer or ICUCollationKeyAnalyzer, and set
-qp.setAnalyzeRangeTerms(true).
-
-TermRangeQuery and TermRangeFilter now work purely on bytes. Both have helper factory methods
-(newStringRange) similar to the NumericRange API, to easily perform range queries on Strings.
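-
-For example (a sketch with placeholder bounds; both ends inclusive):
-
-    TermRangeQuery q = TermRangeQuery.newStringRange("field", "lower", "upper",
-                                                     true, true);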
-
-## LUCENE-2883: ValueSource changes
-
-Lucene's o.a.l.search.function ValueSource based functionality was
-consolidated into lucene/queries along with Solr's similar functionality.
-The following classes were moved:
-
- - o.a.l.search.function.CustomScoreQuery -> o.a.l.queries.CustomScoreQuery
- - o.a.l.search.function.CustomScoreProvider -> o.a.l.queries.CustomScoreProvider
- - o.a.l.search.function.NumericIndexDocValueSource -> o.a.l.queries.function.valuesource.NumericIndexDocValueSource
-
-The following lists the replacement classes for those removed:
-
- - o.a.l.search.function.ByteFieldSource -> o.a.l.queries.function.valuesource.ByteFieldSource
- - o.a.l.search.function.DocValues -> o.a.l.queries.function.DocValues
- - o.a.l.search.function.FieldCacheSource -> o.a.l.queries.function.valuesource.FieldCacheSource
 - o.a.l.search.function.FieldScoreQuery -> o.a.l.queries.function.FunctionQuery
- - o.a.l.search.function.FloatFieldSource -> o.a.l.queries.function.valuesource.FloatFieldSource
- - o.a.l.search.function.IntFieldSource -> o.a.l.queries.function.valuesource.IntFieldSource
- - o.a.l.search.function.OrdFieldSource -> o.a.l.queries.function.valuesource.OrdFieldSource
- - o.a.l.search.function.ReverseOrdFieldSource -> o.a.l.queries.function.valuesource.ReverseOrdFieldSource
- - o.a.l.search.function.ShortFieldSource -> o.a.l.queries.function.valuesource.ShortFieldSource
- - o.a.l.search.function.ValueSource -> o.a.l.queries.function.ValueSource
- - o.a.l.search.function.ValueSourceQuery -> o.a.l.queries.function.FunctionQuery
-
-DocValues are now named FunctionValues, to avoid confusion with Lucene's per-document values.
-
-## LUCENE-2392: Enable flexible scoring
-
-The existing "Similarity" api is now TFIDFSimilarity, if you were extending
-Similarity before, you should likely extend this instead.
-
-Weight.normalize no longer takes a norm value that incorporates the top-level
-boost from outer queries such as BooleanQuery, instead it takes 2 parameters,
-the outer boost (topLevelBoost) and the norm. Weight.sumOfSquaredWeights has
-been renamed to Weight.getValueForNormalization().
-
-The scorePayload method now takes a BytesRef. It is never null.
-
-## LUCENE-3283: Query parsers moved to separate module
-
-Lucene's core o.a.l.queryParser QueryParsers have been consolidated into lucene/queryparser,
-where other QueryParsers from the codebase will also be placed.  The following
-classes were moved:
-
-  - o.a.l.queryParser.CharStream -> o.a.l.queryparser.classic.CharStream
-  - o.a.l.queryParser.FastCharStream -> o.a.l.queryparser.classic.FastCharStream
-  - o.a.l.queryParser.MultiFieldQueryParser -> o.a.l.queryparser.classic.MultiFieldQueryParser
-  - o.a.l.queryParser.ParseException -> o.a.l.queryparser.classic.ParseException
-  - o.a.l.queryParser.QueryParser -> o.a.l.queryparser.classic.QueryParser
-  - o.a.l.queryParser.QueryParserBase -> o.a.l.queryparser.classic.QueryParserBase
-  - o.a.l.queryParser.QueryParserConstants -> o.a.l.queryparser.classic.QueryParserConstants
-  - o.a.l.queryParser.QueryParserTokenManager -> o.a.l.queryparser.classic.QueryParserTokenManager
-  - o.a.l.queryParser.QueryParserToken -> o.a.l.queryparser.classic.Token
-  - o.a.l.queryParser.QueryParserTokenMgrError -> o.a.l.queryparser.classic.TokenMgrError
-
-## LUCENE-2308, LUCENE-3453: Separate IndexableFieldType from Field instances
-
-With this change, the indexing details (indexed, tokenized, norms,
-indexOptions, stored, etc.) are moved into a separate FieldType
-instance (rather than being stored directly on the Field).
-
-This means you can create the FieldType instance once, up front,
-for a given field, and then re-use that instance whenever you instantiate
-the Field.
-
-Certain field types are pre-defined since they are common cases:
-
-  * StringField: indexes a String value as a single token (ie, does
-    not tokenize).  This field turns off norms and indexes only doc
-    IDs (does not index term frequency nor positions).  This field
-    does not store its value, but exposes TYPE_STORED as well.
-  * TextField: indexes and tokenizes a String, Reader or TokenStream
-    value, without term vectors.  This field does not store its value,
-    but exposes TYPE_STORED as well.
-  * StoredField: field that stores its value
-  * DocValuesField: indexes the value as a DocValues field
-  * NumericField: indexes the numeric value so that NumericRangeQuery
-    can be used at search-time.
-
-If your usage fits one of those common cases you can simply
-instantiate the above class.  If you need to store the value, you can
-add a separate StoredField to the document, or you can use
-TYPE_STORED for the field:
-
-    Field f = new Field("field", "value", StringField.TYPE_STORED);
-
-Alternatively, if an existing type is close to what you want but you
-need to make a few changes, you can copy that type and make changes:
-
-    FieldType bodyType = new FieldType(TextField.TYPE_STORED);
-    bodyType.setStoreTermVectors(true);
-
-You can of course also create your own FieldType from scratch:
-
-    FieldType t = new FieldType();
-    t.setIndexed(true);
-    t.setStored(true);
-    t.setOmitNorms(true);
-    t.setIndexOptions(IndexOptions.DOCS_AND_FREQS);
-    t.freeze();
-
-FieldType has a freeze() method to prevent further changes.
-
-There is also a deprecated transition API, providing the same Index,
-Store, TermVector enums from 3.x, and Field constructors taking these
-enums.
-
-When migrating from the 3.x API, if you did this before:
-
-    new Field("field", "value", Field.Store.NO, Field.Indexed.NOT_ANALYZED_NO_NORMS)
-
-you can now do this:
-
-    new StringField("field", "value")
-
-(though note that StringField indexes DOCS_ONLY).
-
-If instead the value was stored:
-
-    new Field("field", "value", Field.Store.YES, Field.Indexed.NOT_ANALYZED_NO_NORMS)
-
-you can now do this:
-
-    new Field("field", "value", StringField.TYPE_STORED)
-
-If you didn't omit norms:
-
-    new Field("field", "value", Field.Store.YES, Field.Indexed.NOT_ANALYZED)
-
-you can now do this:
-
-    FieldType ft = new FieldType(StringField.TYPE_STORED);
-    ft.setOmitNorms(false);
-    new Field("field", "value", ft)
-
-If you did this before (value can be String or Reader):
-
-    new Field("field", value, Field.Store.NO, Field.Indexed.ANALYZED)
-
-you can now do this:
-
-    new TextField("field", value)
-
-If instead the value was stored:
-
-    new Field("field", value, Field.Store.YES, Field.Indexed.ANALYZED)
-
-you can now do this:
-
-    new Field("field", value, TextField.TYPE_STORED)
-
-If in addition you omit norms:
-
-    new Field("field", value, Field.Store.YES, Field.Indexed.ANALYZED_NO_NORMS)
-
-you can now do this:
-
-    FieldType ft = new FieldType(TextField.TYPE_STORED);
-    ft.setOmitNorms(true);
-    new Field("field", value, ft)
-
-If you did this before (bytes is a byte[]):
-
-    new Field("field", bytes)
-
-you can now do this:
-
-    new StoredField("field", bytes)
-
-## Other changes
-
-* LUCENE-2674:
-  A new idfExplain method was added to Similarity that
-  accepts an incoming docFreq.  If you subclass Similarity, make sure
-  you also override this method on upgrade; otherwise your
-  customizations won't run for certain MultiTermQuerys.
-
-* LUCENE-2691: The near-real-time API has moved from IndexWriter to
-  DirectoryReader.  Instead of IndexWriter.getReader(), call
-  DirectoryReader.open(IndexWriter) or DirectoryReader.openIfChanged(IndexWriter).
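-
-  For example (a sketch; the boolean argument controls whether deletes
-  are applied to the NRT reader):
-
-      DirectoryReader reader = DirectoryReader.open(writer, true);
-      // ... index more documents with writer ...
-      DirectoryReader newReader = DirectoryReader.openIfChanged(reader, writer, true);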
-
-* LUCENE-2690: MultiTermQuery boolean rewrites now operate per segment.
-  Also MultiTermQuery.getTermsEnum() now takes an AttributeSource. FuzzyTermsEnum
-  is both consumer and producer of attributes: MTQ.BoostAttribute is
-  added to the FuzzyTermsEnum and MTQ's rewrite mode consumes it.
-  Conversely, MTQ.TopTermsBooleanQueryRewrite supplies a global
-  AttributeSource to each segment's TermsEnum. The TermsEnum is the consumer
-  and gets the current minimum competitive boosts (MTQ.MaxNonCompetitiveBoostAttribute).
-
-* LUCENE-2374: The backwards layer in AttributeImpl was removed. To support correct
-  reflection of AttributeImpl instances, where the reflection was previously done
-  by parsing the deprecated toString() output, you now have to override
-  reflectWith() to customize output. toString() is no longer implemented by
-  AttributeImpl, so if you have overridden toString(), port your customization
-  over to reflectWith(). reflectAsString() then returns what toString() did before.
-
-* LUCENE-2236, LUCENE-2912: DefaultSimilarity can no longer be set statically
-  (and dangerously) for the entire JVM.
-  Similarity can now be configured on a per-field basis (via PerFieldSimilarityWrapper).
-  Similarity has a lower-level API; if you want the higher-level vector-space API
-  like in previous Lucene releases, then look at TFIDFSimilarity.
-
-* LUCENE-1076: TieredMergePolicy is now the default merge policy.
-  It's able to merge non-contiguous segments; this may cause problems
-  for applications that rely on Lucene's internal document ID
-  assignment.  If so, you should instead use LogByteSize/DocMergePolicy
-  during indexing.
-
-* LUCENE-3722: Similarity methods and collection/term statistics now take
-  long instead of int (to enable distributed scoring of > 2B docs). 
-  For example, in TFIDFSimilarity idf(int, int) is now idf(long, long). 
-
-* LUCENE-3559: The methods "docFreq" and "maxDoc" on IndexSearcher were removed,
-  as these are no longer used by the scoring system.
-  If you were using these casually in your code for reasons unrelated to scoring,
-  call them on the IndexSearcher's reader instead: getIndexReader().
-  If you were subclassing IndexSearcher and overriding these methods to alter
-  scoring, override IndexSearcher's termStatistics() and collectionStatistics()
-  methods instead.
-
-* LUCENE-3396: Analyzer.tokenStream() and .reusableTokenStream() have been made final.
-  It is now necessary to use Analyzer.TokenStreamComponents to define an analysis process.
-  Analyzer also has its own way of managing the reuse of TokenStreamComponents (either
-  globally, or per-field).  To define another Strategy, implement Analyzer.ReuseStrategy.
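-
-  For example (a sketch using core analysis classes; the Version
-  constant is illustrative):
-
-      Analyzer analyzer = new Analyzer() {
-        @Override
-        protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
-          Tokenizer source = new WhitespaceTokenizer(Version.LUCENE_40, reader);
-          TokenStream sink = new LowerCaseFilter(Version.LUCENE_40, source);
-          return new TokenStreamComponents(source, sink);
-        }
-      };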
-
-* LUCENE-3464: IndexReader.reopen has been renamed to
-  DirectoryReader.openIfChanged (a static method), and now returns null
-  (instead of the old reader) if there are no changes to the index, to
-  prevent the common pitfall of accidentally closing the old reader.
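-
-  The usual reopen pattern therefore becomes (a sketch):
-
-      DirectoryReader newReader = DirectoryReader.openIfChanged(reader);
-      if (newReader != null) { // null means nothing changed; keep the old reader
-        reader.close();
-        reader = newReader;
-      }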
-  
-* LUCENE-3687: Similarity#computeNorm() now expects a Norm object to set the computed 
-  norm value instead of returning a fixed single byte value. Custom similarities can now
-  set integer, float and byte values if a single byte is not sufficient.
-
-* LUCENE-2621: Term vectors are now accessed via the flexible indexing API.
-  If you used IndexReader.getTermFreqVector/s before, you should now
-  use IndexReader.getTermVectors.  The new method returns a Fields
-  instance exposing the inverted index of that one document.  From
-  Fields you can enumerate all fields, terms, positions, and offsets.
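-
-  For example (a sketch; "body" is a placeholder field indexed with
-  term vectors):
-
-      Fields vectors = reader.getTermVectors(docID);
-      if (vectors != null) { // null if the document has no term vectors
-        Terms terms = vectors.terms("body");
-        ...
-      }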
-
-* LUCENE-4227: If you were previously using Instantiated index, you
-  may want to use DirectPostingsFormat after upgrading: it stores all
-  postings in simple arrays (byte[] for terms, int[] for docs, freqs,
-  positions, offsets).  Note that this only covers postings, whereas
-  Instantiated covered all other parts of the index as well.
-
-* LUCENE-3309: The expert FieldSelector API has been replaced with
-  StoredFieldVisitor.  The idea is the same (you have full control
-  over which fields should be loaded).  Instead of a single accept
-  method, StoredFieldVisitor has a needsField method: if that method
-  returns true then the field will be loaded and the appropriate
-  type-specific method will be invoked with that field's value.
-
-* LUCENE-4122: Removed the Payload class and replaced with BytesRef.
-  PayloadAttribute's name is unchanged, it just uses the BytesRef
-  class to refer to the payload bytes/start offset/end offset 
-  (or null if there is no payload).
+TODO: Lucene 5.0 currently has no migration guide.
\ No newline at end of file


