lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Uwe Schindler" <>
Subject RE: [ANNOUNCE] Apache Lucene 5.0.0 released
Date Fri, 20 Feb 2015 21:10:03 GMT
Many thanks! :-) Nice work!

I found a small typo in the announcement text on the mail and web page: " Those indexes can
then be read (see next section) with Lucene 5..."
The "see next section" should not be there, it's only relevant in the migration guide (because
there is a section following). Maybe fix this on the web page, for the mail it's too late.


Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: Anshum Gupta []
> Sent: Friday, February 20, 2015 9:55 PM
> To:;; java-
> Subject: [ANNOUNCE] Apache Lucene 5.0.0 released
> 20 February 2015, Apache Lucene™ 5.0.0 available
> The Lucene PMC is pleased to announce the release of Apache Lucene 5.0.
> Apache Lucene is a high-performance, full-featured text search engine
> library written entirely in Java. It is a technology suitable for nearly any
> application that requires full-text search, especially cross-platform.
> This release contains numerous bug fixes, optimizations, and improvements,
> some of which are highlighted below. The release is available for immediate
> download at:
> See the CHANGES.txt file included with the release for a full list of details.
> Lucene 5.0 Release Highlights:
> Stronger index safety
>  * All file access now uses Java’s NIO.2 APIs which give Lucene stronger index
> safety in terms of better error handling and safer commits.
>  * Every Lucene segment now stores a unique id per-segment and per-
> commit to aid in accurate replication of index files.
>  * During merging, IndexWriter now always checks the incoming segments
> for corruption before merging. This can mean, on upgrading to 5.0.0, that
> merging may uncover long-standing latent corruption in an older 4.x index.
> Reduced heap usage
>  * Lucene now supports random-writable and advance-able sparse bitsets
> (RoaringDocIdSet and SparseFixedBitSet), so the heap required is in
> proportion to how many bits are set, not how many total documents exist in
> the index.
>  * Heap usage during IndexWriter merging is also much lower with the new
> Lucene50Codec, since doc values and norms for the segments being merged
> are no longer fully loaded into heap for all fields; now they are loaded for the
> one field currently being merged, and then dropped.
>  * The default norms format now uses sparse encoding when appropriate, so
> indices that enable norms for many sparse fields will see a large reduction in
> required heap at search time.
>  * 5.0 has a new API to print a tree structure showing a recursive breakdown
> of which parts are using how much heap.
> Other features
>  * FieldCache is gone (moved to a dedicated UninvertingReader in the misc
> module). This means when you intend to sort on a field, you should index
> that field using doc values, which is much faster and less heap consuming
> than FieldCache.
>  * Tokenizers and Analyzers no longer require Reader on init.
>  * NormsFormat now gets its own dedicated NormsConsumer/Producer
>  * SortedSetSortField, used to sort on a multi-valued field, is promoted from
> sandbox to Lucene's core.
>  * PostingsFormat now uses a "pull" API when writing postings, just like doc
> values. This is powerful because you can do things in your postings format
> that require making more than one pass through the postings such as
> iterating over all postings for each term to decide which compression format
> it should use.
>  * New DateRangeField type enables Indexing and searching of date ranges,
> particularly multi-valued ones.
>  * A new ExitableDirectoryReader extends FilterDirectoryReader and enables
> exiting requests that take too long to enumerate over terms.
>  * Suggesters from multi-valued field can now be built as
> DocumentDictionary now enumerates each value separately in a multi-
> valued field.
>  * ConcurrentMergeScheduler detects whether the index is on SSD or not
> and does a better job defaulting its settings. This only works on Linux for
> now; other OS's will continue to use the previous defaults (tuned for
> spinning disks).
>  * Auto-IO-throttling has been added to ConcurrentMergeScheduler, to rate
> limit IO writes for each merge depending on incoming merge rate.
>  * CustomAnalyzer has been added that allows to configure analyzers like
> you do in Solr's index schema. This class has a builder API to configure
> Tokenizers, TokenFilters, and CharFilters based on their SPI names and
> parameters as documented by the corresponding factories.
>  * Memory index now supports payloads.
>  * Added a filter cache with a usage tracking policy that caches filters based
> on frequency of use.
>  * The default codec has an option to control BEST_SPEED or
> BEST_COMPRESSION for stored fields.
>  * Stored fields are merged more efficiently, especially when upgrading from
> previous versions or using SortingMergePolicy
> NOTE: Lucene 5 no longer supports the Lucene 3.x index format. Opening
> indexes will result in IndexFormatTooOldException. It is recommended to
> either reindex all your data, or upgrade the old indexes with the
> IndexUpgrader tool of latest Lucene 4 version (4.10.x). Those indexes can
> then be read (see next section) with Lucene 5.
> To read more about the changes, also see:
> coming.html
> Please read CHANGES.txt (
> and
> MIGRATE.txt for a full list of new features and notes on upgrading.
> Please report any feedback to the mailing lists (
> --
> Anshum Gupta

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message