Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The "ReleaseNote80" page has been changed by jpountz:
https://wiki.apache.org/lucene-java/ReleaseNote80?action=diff&rev1=10&rev2=11

  https://lucene.apache.org/core/8_0_0/changes/Changes.html

- Lucene 8.0.0 Release Highlights:
+ == Lucene 8.0.0 Release Highlights: ==
-  * Indices that were created before the previous major version will now fail to open even if they have been merged with the previous major version.
-  * Codecs now have the ability to index score impacts.
-  * Queries now support faster collection of top hits when the total hit count is not required.
+
+ === Query execution ===
+
+ Term queries, phrase queries and boolean queries introduced a new optimization that enables efficient skipping over non-competitive documents when the total hit count is not needed. Depending on the exact query and data distribution, queries might run between a few percent slower and many times faster, especially term queries and pure disjunctions.
+
+ In order to support this enhancement, some API changes have been made:
+  * `TopDocs.totalHits` is no longer a long but an object that gives a lower bound of the actual hit count.
-  * IndexSearcher's search and searchAfter methods now only compute total hit counts accurately up to 1,000 in order to enable top-hits optimizations.
+  * `IndexSearcher`'s `search` and `searchAfter` methods now only compute total hit counts accurately up to 1,000 in order to enable this optimization by default.
-  * Queries are now required to produce positive scores.
+  * Queries are now required to produce non-negative scores.
-  * FSTs can now remain off-heap, accessed via IndexInput, and the default codec's term dictionary will now leave the FST for the terms index off-heap for non-primary-key fields using MMapDirectory, reducing heap usage for such fields.
-  * Index-time jump-tables for DocValues, for O(1) advance when retrieving doc values.
+
+ === Codecs ===
+
+  * Postings now index score impacts alongside skip data. This is how term queries optimize collection of top hits when hit counts are not needed.
+  * Doc values introduced jump tables, so that advancing runs in constant time. This is especially helpful on sparse fields.
+  * The terms index `FST` is now loaded off-heap for non-primary-key fields using `MMapDirectory`, reducing heap usage for such fields.
+
+ === Custom scoring ===
+
+ The new `FeatureField` allows efficient integration of static features such as pagerank into the score. Furthermore, the new `LongPoint#newDistanceFeatureQuery` and `LatLonPoint#newDistanceFeatureQuery` methods allow boosting by recency and geo-distance, respectively. These new helpers are optimized for the case when total hit counts are not needed. For instance, if the pagerank has a significant weight in your scores, then Lucene might be able to skip over documents that have a low pagerank value.
+
  Further details of changes are available in the change log at: http://lucene.apache.org/core/8_0_0/changes/Changes.html
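
The `TopDocs.totalHits` change noted above means that callers can no longer treat the hit count as an exact number. The following is a minimal, standalone sketch of how calling code might interpret such a value. The `HitCountSketch`, `HitCount` and `describe` names are hypothetical stand-ins written for illustration, not Lucene classes; they are loosely modeled on Lucene's `TotalHits`, which exposes a `value` and a `relation`. With the real API you would read these from `TopDocs.totalHits` instead.

```java
// Standalone illustration only: these types are hypothetical stand-ins,
// not Lucene classes. They model a hit count that may be a lower bound.
class HitCountSketch {

    /** Whether the reported count is exact or only a lower bound. */
    enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

    /** Hypothetical stand-in for a hit count object with a value and a relation. */
    static final class HitCount {
        final long value;
        final Relation relation;

        HitCount(long value, Relation relation) {
            this.value = value;
            this.relation = relation;
        }
    }

    /** Formats a hit count for display without overstating its accuracy. */
    static String describe(HitCount hits) {
        if (hits.relation == Relation.EQUAL_TO) {
            return hits.value + " hits";
        }
        // The count is only a lower bound, so say so explicitly.
        return "at least " + hits.value + " hits";
    }

    public static void main(String[] args) {
        // A small result set: counting finished, so the value is exact.
        System.out.println(describe(new HitCount(42, Relation.EQUAL_TO)));
        // prints "42 hits"

        // A large result set: counting stopped being accurate at the threshold.
        System.out.println(describe(new HitCount(1000, Relation.GREATER_THAN_OR_EQUAL_TO)));
        // prints "at least 1000 hits"
    }
}
```

Code that previously displayed `totalHits` directly, or used it for pagination, would likely need a branch of this kind, or would need to request exact counts explicitly when they matter more than query speed.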