lucene-dev mailing list archives

From Otis Gospodnetic <>
Subject Re: SnowballAnalyzer
Date Sun, 12 Oct 2003 14:10:55 GMT
I don't know if I replied to this or not.  My opinion below.

--- Erik Hatcher <> wrote:
> On Tuesday, October 7, 2003, at 05:25  AM, Otis Gospodnetic wrote:
> > My vote goes to leaving it in the Sandbox, for the same reasons I
> > mentioned the other day for some other similar component.
> >
> > As a matter of fact, I have been wondering if we should move Russian
> > and German code out of the core into the Sandbox.
> I would be +1 on moving it out too.  But where do you draw the line
> on what Analyzers go in the core, then?

I would keep only the 'core' ones (Whitespace/Simple/Standard), even if
they have English-specific code in them.  I hate assuming English as
THE language, even though it is THE language in practice, but I don't
see a better way of keeping language-specific code out of the core.  In
my opinion, an ideal setup would be to keep the W/S/S in the core, and
all others in a Sandbox.  Analyzers in the Sandbox would be nicely
organized and would be in a stable state, so a simple 'ant jar' can
package everything up and let the developer just move the created Jar
to the appropriate directory in his environment.
I would keep those contributed Analyzers separate from the Snowball
ones, so their origins, etc. are clear.

I have a Brazilian Portuguese Analyzer sitting in the queue (read: my
email account's inbox), and I think I even have some code that somebody
sent for Chinese, other than CJK support from Che Dong.  This has been
waiting for my free time for months now... :(

