lucy-user mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: Release date and language bindings
Date Wed, 25 Nov 2009 18:36:42 GMT
On Wed, Nov 25, 2009 at 11:44:53AM +0200, fgl wrote:

> I'm interested in trying out Lucy when it becomes available.  

:)

> I've had a look at the dev list and there doesn't appear to be a firm
> release date (which is understandable, considering only one or two devs).
> 
> Just to get an idea, are we talking about a year or two, a few months,
> weeks...?

Months.  It depends somewhat on how many new features get fast-tracked ahead
of finishing the port, but getting a larger community involved with the
software we're using is important to the people sponsoring my work on Lucy, so
I don't anticipate that slipping too much.

> In terms of the index format:  is it going to be Lucene compatible or
> completely new (with similarities)?

Completely new, with similarities.  The Lucene index format has many quirks
and elaborate optimizations and is impractical to implement unless you're
writing a nearly line-by-line port a la Lucene.NET or CLucene.  If anything,
there will be modules for Java Lucene to read Lucy indexes first.  We're
trying to emphasize simplicity in the file format design to aid such
interchange, for instance by encoding all metadata as JSON.
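
To illustrate why JSON metadata helps interchange, here is a minimal Python
sketch. The field names (`format`, `doc_count`, `fields`) are hypothetical and
are not Lucy's actual schema; the point is only that any language with a JSON
parser can read such metadata without a custom binary decoder.

```python
import json

# Hypothetical segment metadata. These keys are illustrative assumptions,
# not the real Lucy file format.
segment_meta = {
    "format": 1,
    "doc_count": 1200,
    "fields": {"title": "text", "body": "text"},
}


def dump_metadata(meta):
    """Serialize metadata to a JSON string with stable key order."""
    return json.dumps(meta, sort_keys=True, indent=2)


def load_metadata(text):
    """Parse metadata back from its JSON form."""
    return json.loads(text)


encoded = dump_metadata(segment_meta)
decoded = load_metadata(encoded)
```

A reader written in another language would only need its own JSON library to
recover the same structure from `encoded`.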

> Also, are there plans to implement language bindings so that (eg) Perl can
> be used to index with and PHP used to search with (as with other engines
> like www.xapian.org)?

Yes, that will be possible.  

Caveat: you'll need to check the documentation for your Analyzer to ensure
that it works independently of the host language.  For instance, some
tokenizers will use the host's regex engine, and there will be differences
between regex engines which will make indexes incompatible.  (Java Lucene has
similar issues when transitioning between JRE versions.)
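
A rough sketch of the kind of mismatch meant here, in Python rather than
Lucy code: the same `\w+` pattern is run in Unicode mode and in ASCII mode,
which stand in (as an assumption, for illustration) for two host regex
engines that disagree about what counts as a word character.

```python
import re

# The same tokenizer pattern, compiled two ways. The two modes are a
# stand-in for two different host-language regex engines.
PATTERN = r"\w+"
unicode_engine = re.compile(PATTERN)          # \w matches Unicode letters
ascii_engine = re.compile(PATTERN, re.ASCII)  # \w matches [A-Za-z0-9_] only


def tokenize(engine, text):
    """Split text into tokens using the given compiled pattern."""
    return engine.findall(text)


text = "na\u00efve caf\u00e9"  # "naive cafe" with accented i and e

unicode_tokens = tokenize(unicode_engine, text)
ascii_tokens = tokenize(ascii_engine, text)

# Terms written at index time by one engine will not match terms
# produced at search time by the other.
compatible = unicode_tokens == ascii_tokens
```

Here the Unicode-aware pass keeps each word whole, while the ASCII pass
breaks the words apart at the accented characters, so an index built with one
could not be searched reliably with the other.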

Marvin Humphrey

