lucene-dev mailing list archives

From Robert Muir
Subject snowball discussion on LUCENE-2285
Date Sat, 27 Feb 2010 15:35:59 GMT
i wanted to continue this here to not clog up the issue!

Shai Erera commented on LUCENE-2285:

> bq. I'd be curious to know what you did
> Ok, now you've made me compare the two :). I'm happy to see we both did the
> same thing, only you call your buffer 'current' while I call it 'buf'.
> Besides that I've included a static final EMPTY_ARGS instead of all the
> places where 'new Object[0]' is passed. Nothing too fancy.

hmm, i didn't think of this second optimization. does it affect the generated
code, or is it in Among/SnowballProgram?
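
For illustration, a minimal sketch of the EMPTY_ARGS idea mentioned above. It assumes the 'new Object[0]' arrays are arguments to reflective Method.invoke calls; the class and method names here (EmptyArgsSketch, rule, invoke) are hypothetical and are not taken from the Snowball or Lucene sources.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; not part of Snowball or Lucene.
class EmptyArgsSketch {
    // One shared zero-length array. It has no elements to mutate, so reuse is safe.
    static final Object[] EMPTY_ARGS = new Object[0];

    // Hypothetical no-argument routine, standing in for a method called reflectively.
    public boolean rule() {
        return true;
    }

    // Invokes a public no-argument boolean method by name, reusing EMPTY_ARGS
    // instead of allocating 'new Object[0]' on every call.
    static boolean invoke(Object target, String name) {
        try {
            Method m = target.getClass().getMethod(name);
            return (Boolean) m.invoke(target, EMPTY_ARGS);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(invoke(new EmptyArgsSketch(), "rule"));
    }
}
```

The saving is one small allocation per reflective call, which only matters if such calls sit on a hot path.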

> Another thing is that I wrote an Arabic and Hebrew stemmer, and combined
> them w/ the Snowball ones by introducing a stemmer class which can be either
> Snowball or anything else. I'll check if we're allowed to contribute the
> Hebrew stemmer to Lucene ...

please do.  as far as integration goes, i guess we took a different approach
with LUCENE-2055 (from the Analyzer perspective, the user does not care if
it uses snowball or something else behind the scenes, etc).
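
A rough sketch of the "stemmer class which can be either Snowball or anything else" idea, assuming a single small interface that callers depend on. All names here (Stemmer, ToySuffixStemmer, IdentityStemmer, StemmingPipeline) are hypothetical, and the toy implementations only stand in for a generated Snowball stemmer and a hand-written one; this is not the design used in either code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: callers see only this interface.
interface Stemmer {
    String stem(String term);
}

// Toy stand-in for a generated Snowball stemmer: strips one trailing "s"
// from terms longer than three characters.
final class ToySuffixStemmer implements Stemmer {
    public String stem(String term) {
        if (term.length() > 3 && term.endsWith("s")) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }
}

// Toy stand-in for a hand-written stemmer: returns the term unchanged.
final class IdentityStemmer implements Stemmer {
    public String stem(String term) {
        return term;
    }
}

// The consumer does not know or care which kind of stemmer it was given.
final class StemmingPipeline {
    static List<String> stemAll(Stemmer stemmer, List<String> terms) {
        List<String> out = new ArrayList<>();
        for (String t : terms) {
            out.add(stemmer.stem(t));
        }
        return out;
    }
}
```

Either implementation can be passed to StemmingPipeline.stemAll, which is the same point made above from the Analyzer side: the user does not need to know what runs behind the scenes.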

> BTW FYI - our legal department forbade us to use the Hungarian stemmer
> because of licensing/legal issues. Besides the stemmers that were originally
> provided, the Snowball project accepted additional ones like the Hungarian
> stemmer. However, for that one we weren't able to get a confirmation from
> the contributor that his University indeed gave him permission to contribute the
> code. Don't know if it matters to anyone here (I've notified Martin Porter
> as well), but FYI. Our legal department does not permit us to use it (which
> is not surprising - they are legal ...). I don't want to derail this issue
> into Snowball discussion, so if you want to talk about it, I suggest we move
> it to the list.

this is concerning to me. i mean, the thing is sitting there on the
university's website: :)
but if apache is concerned about this situation too, someone let me know and
i can take savoy's (clearly marked BSD) and we can add that instead, and
remove the ambiguous snowball one, even if it's temporary:

Robert Muir
