lucene-java-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Lucene-java Wiki] Update of "ReleaseNote40beta" by RobertMuir
Date Mon, 13 Aug 2012 13:23:46 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The "ReleaseNote40beta" page has been changed by RobertMuir:
http://wiki.apache.org/lucene-java/ReleaseNote40beta?action=diff&rev1=4&rev2=5

Comment:
group related changes together and flesh out a bit

  
  Highlights of changes since 4.0-alpha:
  
-   * BloomFilteringPostingsFormat uses a bloom filter to sometimes
-     avoid disk seeks when looking up terms.  Performance gains (if any)
-     depend heavily on the use case.
- 
-   * JapaneseIterationMarkCharFilter normalizes Japanese iteration
-     marks.
- 
-   * DirectPostingsFormat holds all postings as simple byte[] and int[]
-     in memory, for very fast performance but also very high RAM
-     consumption.
- 
-   * Factories for creating Tokenizer, TokenFilter and CharFilter have
-     been moved from Solr to Lucene's analysis module.
- 
    * IndexWriter.tryDeleteDocument can sometimes delete by document
      ID, for higher performance in some applications.
+ 
+   * New experimental postings formats: BloomFilteringPostingsFormat uses
+     a bloom filter to sometimes avoid disk seeks when looking up terms;
+     DirectPostingsFormat holds all postings as simple byte[] and int[]
+     for very fast performance at the cost of very high RAM consumption.
+ 
+   * CJK analysis improvements: JapaneseIterationMarkCharFilter normalizes
+     Japanese iteration marks; CJKBigramFilter now supports unigram+bigram
+     output.
+ 
+   * Improvements to Scorer navigation API (Scorer.getChildren) to support
+     all queries, useful for determining which portions of the query matched.
+ 
+   * Analysis improvements: factories for creating Tokenizer, TokenFilter
+     and CharFilter have been moved from Solr to Lucene's analysis module;
+     StandardTokenizer and Snowball filters have less memory overhead.
+ 
+   * Improved highlighting for multi-valued fields.
  
    * Various other API changes, optimizations and bug fixes.
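
To illustrate why a bloom filter can "sometimes" avoid a disk seek during term lookup, here is a minimal, self-contained sketch in plain Java. It is not Lucene's BloomFilteringPostingsFormat; the class name BloomSketch, the double-hashing scheme and the sizes are assumptions made for illustration only. The idea it shows: a "no" answer from the filter is definitive, so the term dictionary need not be consulted, while a "maybe" answer still requires the real lookup.

```java
import java.util.BitSet;

/** Toy bloom filter for illustration; hypothetical, not Lucene's implementation. */
public class BloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public BloomSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Derive the i-th probe position from one hash via double hashing
    // (an assumption for brevity; real filters use stronger hash functions).
    private int position(String term, int i) {
        int h1 = term.hashCode();
        int h2 = (h1 >>> 16) | 1; // force an odd step
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Record a term that exists in the (imagined) on-disk term dictionary. */
    public void add(String term) {
        for (int i = 0; i < numHashes; i++) {
            bits.set(position(term, i));
        }
    }

    /**
     * Returns false only if the term was definitely never added, so the
     * disk lookup can be skipped. A true result may be a false positive,
     * in which case the caller still has to check the term dictionary.
     */
    public boolean mightContain(String term) {
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(position(term, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomSketch filter = new BloomSketch(1 << 16, 3);
        filter.add("lucene");
        // An added term is always reported as possibly present.
        System.out.println(filter.mightContain("lucene"));
        // A term that was never added is usually, but not always, rejected.
        System.out.println(filter.mightContain("solr"));
    }
}
```

How often the filter answers "no" depends on how many lookups are for absent terms, which is consistent with the note that performance gains depend heavily on the use case.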
  
