lucene-java-user mailing list archives

From Erik Hatcher <>
Subject Re: Lucene 1.4 - lobby for final release
Date Fri, 26 Mar 2004 22:19:19 GMT
On Mar 26, 2004, at 3:32 PM, Stephane James Vaucher wrote:
> I'm personally a fan of a release-small-but-often approach, but what are
> the new features available in 1.4 (a list would be nice, on the wiki
> perhaps)? Will there be interim builds available to try these new
> features out soon?

There is a CHANGES.txt in the root of the jakarta-lucene CVS repository 
that stays pretty much current and accurate.  I'm pasting it below for 
the 1.3 -> CVS HEAD changes.

> There seem to be no nightly builds on:

I guess at this time you will have to build it yourself from CVS.  
There is one show-stopper before we can release an RC1.  We must fully 
convert to ASL 2.0 (meaning every single source file needs the license 
header as well as any other files that can be tagged with it).  I know 
Otis has changed some files, but we need a full sweep.  There have been 
some utilities posted in a committers area to facilitate this change 
more automatically if we want to use them.
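
For anyone who wants to check a checkout by hand, here is a minimal sketch of the "full sweep" idea: it walks a source tree and lists the .java files that do not yet contain the license text. It is not one of the utilities mentioned above. The class name, the method name, and the choice of marker phrase (one line of the standard ASL 2.0 header boilerplate) are assumptions for illustration, and it uses the modern java.nio.file API.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene or of any committers' tooling.
public class LicenseHeaderCheck {
    // Assumption: a file counts as converted if it contains this phrase
    // from the standard ASL 2.0 header boilerplate.
    public static final String MARKER =
        "Licensed under the Apache License, Version 2.0";

    /** Returns the .java files under root whose text does not contain MARKER. */
    public static List<Path> missingHeader(Path root) throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk.filter(Files::isRegularFile)
                          .filter(p -> p.toString().endsWith(".java"))
                          .sorted()
                          .collect(Collectors.toList());
        }
        List<Path> missing = new ArrayList<>();
        for (Path p : sources) {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            String text = new String(Files.readAllBytes(p), StandardCharsets.ISO_8859_1);
            if (!text.contains(MARKER)) {
                missing.add(p);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : missingHeader(root)) {
            System.out.println(p);
        }
    }
}
```

Run it with the root of the checkout as the only argument. It only covers .java files; build files and other files that can carry the header would need their own check.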


Excerpt from CHANGES.txt:

1.4 RC1

  1. Changed the format of the .tis file, so that:

     - it has a format version number, which makes it easier to
       back-compatibly change file formats in the future.

     - the term count is now stored as a long.  This was the one aspect
       of Lucene's file formats that limited index size.

     - a few internal index parameters are now stored in the index, so
       that they can (in theory) now be changed from index to index,
       although there is not yet an API to do so.

     These changes are back compatible.  The new code can read old
     indexes.  But old code will not be able to read new indexes. (cutting)

  2. Added an optimized implementation of TermDocs.skipTo().  A skip
     table is now stored for each term in the .frq file.  This only
     adds a percent or two to overall index size, but can substantially
     speed up many searches.  (cutting)

  3. Restructured the Scorer API and all Scorer implementations to take
     advantage of an optimized TermDocs.skipTo() implementation.  In
     particular, PhraseQuerys and conjunctive BooleanQuerys are
     faster when one clause has substantially fewer matches than the
     others.  (A conjunctive BooleanQuery is a BooleanQuery where all
     clauses are required.)  (cutting)

  4. Added new class ParallelMultiSearcher.  Combined with
     RemoteSearchable this makes it easy to implement distributed
     search systems.  (Jean-Francois Halleux via cutting)

  5. Added support for hit sorting.  Results may now be sorted by any
     indexed field.  For details see the javadoc for
     Searcher#search(Query, Sort).  (Tim Jones via Cutting)

  6. Changed FSDirectory to auto-create a full directory tree that it
     needs by using mkdirs() instead of mkdir().  (Mladen Turk via Otis)

  7. Added a new span-based query API.  This implements, among other
     things, nested phrases.  See javadocs for details.  (Doug Cutting)

  8. Added new method Query.getSimilarity(Searcher), and changed
     scorers to use it.  This permits one to subclass a Query class so
     that it can specify its own Similarity implementation, perhaps
     one that delegates through that of the Searcher.  (Julien Nioche
     via Cutting)

  9. Added MultiReader, an IndexReader that combines multiple other
     IndexReaders.  (Cutting)

 10. Added support for term vectors.  See Field#isTermVectorStored().
     (Grant Ingersoll, Cutting & Dmitry)

 11. Fixed the old bug with escaping of special characters in queries.
     (Jean-Francois Halleux via Otis)

 12. Added support for overriding default values for the following,
     using system properties:
       - default commit lock timeout
       - default maxFieldLength
       - default maxMergeDocs
       - default mergeFactor
       - default minMergeDocs
       - default write lock timeout

 13. Changed QueryParser.jj to allow '-' and '+' within tokens.
     (Morus Walter via Otis)

 14. Changed so that the compound index format is used by default.
     This makes indexing a bit slower, but vastly reduces the chances
     of file handle problems.  (Cutting)
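
To make items 2 and 3 a little more concrete, here is a rough, in-memory sketch of why a skip table helps conjunctive queries. It is not Lucene's code and says nothing about the actual .frq layout: the Postings class, the fixed skip interval, and the steps counter are all assumptions for illustration. The idea is that skipTo(target) can hop over whole blocks of postings whose first document is still at or below the target, so intersecting a rare term with a common one touches only a small part of the long list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are not taken from Lucene.
public class SkipToSketch {

    /** A sorted posting list with one skip point every `interval` entries. */
    public static final class Postings {
        private final int[] docs;    // strictly ascending document numbers
        private final int interval;  // distance between skip points
        private int pos = -1;        // index of the current entry, -1 before the first
        public int steps = 0;        // entries or skip points examined (illustration only)

        public Postings(int[] docs, int interval) {
            this.docs = docs;
            this.interval = interval;
        }

        public int doc() {
            return docs[pos];
        }

        /**
         * Moves to the first entry beyond the current one whose document
         * number is >= target.  Returns false if the list is exhausted.
         */
        public boolean skipTo(int target) {
            int i = pos + 1;
            // Hop block by block while the next skip point is still <= target:
            // everything before that skip point must be smaller than target.
            int next = (i / interval + 1) * interval;
            while (next < docs.length && docs[next] <= target) {
                i = next;
                next += interval;
                steps++;
            }
            // Finish with a short linear scan inside the final block.
            while (i < docs.length && docs[i] < target) {
                i++;
                steps++;
            }
            pos = i;
            return i < docs.length;
        }
    }

    /** Documents present in both lists; each list is advanced with skipTo(). */
    public static List<Integer> intersect(Postings a, Postings b) {
        List<Integer> hits = new ArrayList<>();
        if (!a.skipTo(0) || !b.skipTo(0)) {
            return hits;
        }
        while (true) {
            if (a.doc() == b.doc()) {
                hits.add(a.doc());
                // Advance a past the match, then bring b up to a's new position.
                if (!a.skipTo(a.doc() + 1) || !b.skipTo(a.doc())) {
                    return hits;
                }
            } else if (a.doc() < b.doc()) {
                if (!a.skipTo(b.doc())) {
                    return hits;
                }
            } else {
                if (!b.skipTo(a.doc())) {
                    return hits;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] all = new int[1000];
        for (int i = 0; i < all.length; i++) {
            all[i] = i;
        }
        Postings rare = new Postings(new int[] {3, 500, 900}, 16);
        Postings common = new Postings(all, 16);
        System.out.println(intersect(rare, common)); // prints [3, 500, 900]
        System.out.println("entries examined in the common list: " + common.steps);
    }
}
```

Without the skip points the common list would be walked entry by entry up to document 900; with them, most of the distance is covered one block at a time, which is the case item 3 describes (one clause with far fewer matches than the others).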
