lucene-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-java Wiki] Update of "ReleaseNote80" by jpountz
Date Wed, 13 Mar 2019 13:59:54 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The "ReleaseNote80" page has been changed by jpountz:
https://wiki.apache.org/lucene-java/ReleaseNote80?action=diff&rev1=10&rev2=11

  
  https://lucene.apache.org/core/8_0_0/changes/Changes.html
  
- Lucene 8.0.0 Release Highlights:
+ == Lucene 8.0.0 Release Highlights: ==
-  * Indices that were created before the previous major version will now fail to open even
if they have been merged with the previous major version.
-  * Codecs now have the ability to index score impacts.
-  * Queries now support faster collection of top hits when the total hit count is not required.
+ 
+ === Query execution ===
+ 
+ Term queries, phrase queries and boolean queries now use a new optimization that
efficiently skips over non-competitive documents when the total hit count is not needed.
Depending on the exact query and data distribution, queries might run anywhere from a few percent
slower to many times faster, with the largest speedups on term queries and pure disjunctions.
+ 
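As a rough illustration of the skipping idea, the following self-contained Java sketch models postings as blocks that each record the maximum score any of their documents can reach, and skips blocks that cannot beat the current k-th best score. This is a simplified model under stated assumptions, not Lucene's implementation; the names `TopKSketch`, `Block`, `topK` and `visited` are hypothetical.

```java
import java.util.PriorityQueue;

public class TopKSketch {
    /** Hypothetical block of postings: per-document scores plus the block's maximum score. */
    public static final class Block {
        public final float[] scores;
        public final float maxScore;

        public Block(float... scores) {
            this.scores = scores;
            float max = 0f;
            for (float s : scores) {
                max = Math.max(max, s);
            }
            this.maxScore = max;
        }
    }

    /** Number of documents actually scored by the last call to topK. */
    public static int visited;

    /** Returns the k best scores, skipping blocks that cannot be competitive. */
    public static PriorityQueue<Float> topK(Block[] blocks, int k) {
        // Min-heap: once it holds k entries, its head is the k-th best score so far.
        PriorityQueue<Float> heap = new PriorityQueue<>();
        visited = 0;
        for (Block block : blocks) {
            // A block whose best possible score does not beat the k-th best is skipped entirely.
            if (heap.size() == k && block.maxScore <= heap.peek()) {
                continue;
            }
            for (float score : block.scores) {
                visited++;
                if (heap.size() < k) {
                    heap.add(score);
                } else if (score > heap.peek()) {
                    heap.poll();
                    heap.add(score);
                }
            }
        }
        return heap;
    }
}
```

Documents in skipped blocks are never visited, which is why a search run this way cannot report an exact total hit count.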
+ In order to support this enhancement, some API changes have been made:
+  * `TopDocs.totalHits` is no longer a long but an object that gives a lower bound of the
actual hit count.
-  * IndexSearcher's search and searchAfter methods now only compute total hit counts accurately
up to 1,000 in order to enable top-hits optimizations.
+  * `IndexSearcher`'s `search` and `searchAfter` methods now only compute total hit counts
accurately up to 1,000 in order to enable this optimization by default.
-  * Queries are now required to produce positive scores.
+  * Queries are now required to produce non-negative scores.
-  * FSTs can now remain off-heap, accessed via IndexInput, and the default codec's term dictionary
will now leave the FST for the terms index off-heap for non-primary-key fields using MMapDirectory,
reducing heap usage for such fields.
-  * Index-time jump-tables for DocValues, for O(1) advance when retrieving doc values.
+ 
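To make the "lower bound" semantics concrete, here is a minimal, self-contained Java sketch of a hit count that is accurate up to a threshold and only a lower bound beyond it. The class and method names (`HitCountSketch`, `HitCount`, `count`) are hypothetical stand-ins for illustration and are not the Lucene classes themselves.

```java
public class HitCountSketch {
    /** Whether the reported count is exact or only a lower bound. */
    public enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

    /** Hypothetical result object: a count plus how to interpret it. */
    public static final class HitCount {
        public final long value;
        public final Relation relation;

        public HitCount(long value, Relation relation) {
            this.value = value;
            this.relation = relation;
        }
    }

    /** Counts matches accurately up to threshold, then stops counting. */
    public static HitCount count(boolean[] matches, int threshold) {
        long count = 0;
        for (boolean match : matches) {
            if (!match) {
                continue;
            }
            if (count == threshold) {
                // One more match exists beyond the threshold: report a lower bound only.
                return new HitCount(count, Relation.GREATER_THAN_OR_EQUAL_TO);
            }
            count++;
        }
        return new HitCount(count, Relation.EQUAL_TO);
    }
}
```

Callers that display a hit count would check the relation first, showing for example "1,000+" rather than "1,000" when the value is only a lower bound.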
+ === Codecs ===
+ 
+  * Postings now index score impacts alongside skip data. This is how term queries optimize
collection of top hits when hit counts are not needed.
+  * Doc values now use jump tables, so that advancing runs in constant time. This is especially
helpful for sparse fields.
+  * The terms index `FST` is now loaded off-heap for non-primary-key fields using `MMapDirectory`,
reducing heap usage for such fields.
+ 
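The jump-table idea can be sketched in a few lines of self-contained Java. The model below assumes doc IDs are grouped into fixed-size blocks and stores, for each block, the position of its first entry, so that `advance` goes straight to the right block and then scans at most one block's worth of entries. The block size, class name and layout are assumptions for illustration and do not reflect the on-disk format.

```java
public class JumpTableSketch {
    private static final int BLOCK_SHIFT = 4; // hypothetical block size of 16 doc IDs

    private final int[] docs;  // sorted doc IDs that have a value, all < maxDoc
    private final int[] jumps; // jumps[b] = index in docs of the first doc ID >= (b << BLOCK_SHIFT)

    public JumpTableSketch(int[] sortedDocs, int maxDoc) {
        this.docs = sortedDocs;
        int numBlocks = ((maxDoc - 1) >>> BLOCK_SHIFT) + 1;
        this.jumps = new int[numBlocks + 1];
        int index = 0;
        for (int b = 0; b <= numBlocks; b++) {
            while (index < docs.length && docs[index] < (b << BLOCK_SHIFT)) {
                index++;
            }
            jumps[b] = index;
        }
    }

    /** Returns the first doc ID >= target that has a value, or -1 if there is none. */
    public int advance(int target) {
        int block = target >>> BLOCK_SHIFT;
        if (block >= jumps.length - 1) {
            return -1; // target is at or beyond maxDoc
        }
        // Jump straight to the block's first entry; at most one block of entries
        // can precede the answer, so the scan length does not grow with the index.
        for (int i = jumps[block]; i < docs.length; i++) {
            if (docs[i] >= target) {
                return docs[i];
            }
        }
        return -1;
    }
}
```

Without the table, advancing over a sparse field would have to walk past every intervening entry or block header. With it, the cost is bounded by the block size regardless of how far the target is.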
+ === Custom scoring ===
+ 
+ The new `FeatureField` allows efficient integration of static features, such as a pagerank,
into the score. Furthermore, the new `LongPoint#newDistanceFeatureQuery` and `LatLonPoint#newDistanceFeatureQuery`
methods allow boosting by recency and geo-distance respectively. These new helpers are optimized
for the case when total hit counts are not needed. For instance, if the pagerank has a significant
weight in your scores, then Lucene might be able to skip over documents that have a low pagerank
value.
+ 
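To see why a static feature can drive skipping, consider the self-contained Java sketch below. It assumes the feature is turned into a score with a saturation function, `weight * value / (value + pivot)`, which increases with the feature value. Because of that, the largest feature value in a block of documents gives an upper bound on the block's score contribution. The class and method names, and the notion of a "text score upper bound", are hypothetical and only serve the illustration.

```java
public class FeatureScoreSketch {
    /**
     * Saturation function: increases with the feature value and stays below weight.
     * Assumes value >= 0 and pivot > 0.
     */
    public static float saturation(float weight, float pivot, float value) {
        return weight * value / (value + pivot);
    }

    /**
     * Returns true when no document in a block can reach minCompetitiveScore,
     * given an upper bound on the rest of the score and the block's largest feature value.
     */
    public static boolean canSkipBlock(float textScoreUpperBound, float weight, float pivot,
                                       float blockMaxValue, float minCompetitiveScore) {
        float upperBound = textScoreUpperBound + saturation(weight, pivot, blockMaxValue);
        return upperBound < minCompetitiveScore;
    }
}
```

With a weight of 2 and a pivot of 10, a block whose highest pagerank is 10 contributes at most 1.0, so it can be skipped whenever the remaining score plus 1.0 falls short of the lowest score still able to enter the top hits.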
  
  Further details of changes are available in the change log at: http://lucene.apache.org/core/8_0_0/changes/Changes.html
  
