lucene-ruby-dev mailing list archives

From Thomas Dudziak <tom...@gmail.com>
Subject Re: Ruby & Lucene & ApacheCon
Date Mon, 08 Aug 2005 19:46:35 GMT
On 8/8/05, Erik Hatcher <erik@ehatchersolutions.com> wrote:
> Thomas - have you had a look at PyLucene and how they do the gcj/SWIG
> wizardry?  What kinds of issues did you encounter with gcj?  Perhaps
> Andi Vajda from PyLucene could offer some advice?
> 
> I'd rather see the gcj/SWIG approach moving forward so that SWIG
> Lucene doesn't lag behind Java Lucene where all the innovation happens.

Yep, I tried to compile PyLucene on my Mac, but it failed because of
the Python version that comes with Mac OS X 10.4 (which is 2.3). To be
fair to PyLucene, I only tried for a couple of hours, as I don't really
have an interest in Python; I actually only wanted to see how they use
gcj.
But aside from that, I tried the PyLucene way first for a whole week.
First there was the issue of getting gcj to run on Mac OS X, which
ain't easy at all - I had to install DarwinPorts with a fresh gcc.
Getting gcj to run over Lucene is easy; it works out of the box. But
linking Ruby with SWIG-wrapped, gcj-compiled Lucene is not: all I got
was a gcj internal compiler error (with both gcc/gcj 3.4.3 and 4.0.1).
This bug is in the gcc bug list, marked as a regression.
On Windows I had a similar amount of trouble using both MinGW and
Cygwin; I wasn't able to compile & link the stuff against Ruby.

So to summarize, while there is definitely a strong argument for using
gcj to create other-language bindings from the Java version, there are
a few issues that IMO make a strong case for CLucene:

* gcj is difficult to use at the best of times, and on Windows & Mac
OS X it is even more involved. For me it was nearly impossible as I'm
no gcc/gcj expert.

* it prevents, or at least makes extremely difficult, the creation of
certain bindings such as COM and C# (except perhaps Mono), as MinGW is
not easily combined with Visual C++ AFAIK. And I don't think that there
is any chance of debugging such a combination when a problem arises.

* the amount of work necessary to SWIG-wrap the gcj-compiled Lucene for
a given target language is immense - just have a look at the SWIG file
of PyLucene and the Makefile that makes the magic happen; I think this
must be a nightmare to maintain. I cannot really tell what amount of
work would be necessary for CLucene, but since it is a straight C++
library and built with SWIG in mind, I would be surprised if it is not
a lot less.

So from a technical point of view, it is my opinion that a pure C++
version is easier to maintain and evolve right now. I also think that
most of the innovation in Lucene is not Java-specific, so while it
would mean duplicated implementation work, the algorithms are the same
(or near enough). Also, a pure C++ version of Lucene gives it more
momentum IMO in both the Linux world (mbox_lucene or something similar
comes to mind) and the Microsoft world (.NET etc.).

> As for Lucene4C versus CLucene and moving CLucene to Apache - I'll
> let the c-dev@lucene list discuss it.  I'm happy to have CLucene at
> Apache too, though it seems simpler for us to only house a single
> implementation in C.  The gcj version would be ideal in my mind, but
> I'm also not skilled in gcj (and haven't touched C in decades,
> practically) - so it certainly is up to the actual coders where to go
> with it.

I don't know whether it is a "Lucene4C vs. CLucene" question anyway.
From what I understand, Lucene4C tries to create a simpler API for
Lucene, and while they are building on top of a gcj-compiled version
of Java Lucene, that is likely not a requirement (I don't think that
they want to expose any of the gcj-generated classes).
Besides, CLucene is quite far along, so from a practical point of view
it would make sense to use/maintain it. Being the practical guy that I
am, I think that any issues between Lucene4C, PyLucene, and CLucene can
be worked out if the developers work together. After all, for all I
know it might even be possible to use a mixture of the Lucene4C API
(for plain C) and the CLucene API (for C++) in front of a gcj-compiled
Java Lucene, and all SWIG wrappers could then be built on top of this
API. At least technically this is possible and perhaps even feasible.

regards,
Tom
