lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject snowball discussion on LUCENE-2285
Date Sat, 27 Feb 2010 15:35:59 GMT
i wanted to continue this here to not clog up the issue!

Shai Erera commented on LUCENE-2285:

> bq. I'd be curious to know what you did
>
> Ok, now you've made me compare the two :). I'm happy to see we both did the
> same thing, only you call your buffer 'current' while I call it 'buf'.
> Besides that I've included a static final EMPTY_ARGS instead of all the
> places where 'new Object[0]' is passed. Nothing too fancy.
>

hmm, i didn't think of this second optimization. does it affect the generated
code, or is it only in Among/SnowballProgram?
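
for anyone following along, the EMPTY_ARGS change presumably amounts to
something like the sketch below. this is only an illustration of the pattern,
assuming the 'new Object[0]' calls are the argument arrays passed to
reflective Method.invoke calls; ReflectiveCaller and invokeNoArgs are made-up
names, not the actual Among/SnowballProgram code.

```java
import java.lang.reflect.Method;

// Hypothetical illustration of the pattern; not the real Snowball/Lucene code.
final class ReflectiveCaller {
    // One shared zero-length array. Reuse is safe because an empty array
    // has no elements that a callee could modify.
    static final Object[] EMPTY_ARGS = new Object[0];

    // Invokes a public no-argument method by name on the given target.
    static Object invokeNoArgs(Object target, String methodName) throws Exception {
        Method m = target.getClass().getMethod(methodName);
        // Previously: m.invoke(target, new Object[0]), one allocation per call.
        return m.invoke(target, EMPTY_ARGS);
    }
}
```

the saving is one small array allocation per reflective call, which only
matters if that call sits on a hot path.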

>
> Another thing is that I wrote an Arabic and Hebrew stemmer, and combined
> them w/ the Snowball ones by introducing a stemmer class which can be either
> Snowball or anything else. I'll check if we're allowed to contribute the
> Hebrew stemmer to Lucene ...
>

please do.  as far as integration goes, i guess we took a different approach
with LUCENE-2055 (from the Analyzer perspective, the user does not care if
it uses snowball or something else behind the scenes, etc).
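
to make the "a stemmer class which can be either Snowball or anything else"
idea concrete, here is a minimal sketch of that kind of abstraction. every
name in it (Stemmer, SuffixStrippingStemmer, IdentityStemmer,
StemmingPipeline) is hypothetical; it is neither Shai's code nor the
LUCENE-2055 design, and the suffix stripper is only a stand-in for a real
Snowball-backed implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical abstraction: callers depend only on this interface.
interface Stemmer {
    String stem(String word);
}

// Stand-in for a Snowball-backed stemmer; a real one would delegate to a
// generated Snowball class instead of stripping a fixed suffix.
final class SuffixStrippingStemmer implements Stemmer {
    private final String suffix;

    SuffixStrippingStemmer(String suffix) {
        this.suffix = suffix;
    }

    @Override
    public String stem(String word) {
        // Strip the suffix only if something is left over afterwards.
        if (word.endsWith(suffix) && word.length() > suffix.length()) {
            return word.substring(0, word.length() - suffix.length());
        }
        return word;
    }
}

// Stand-in for a hand-written, non-Snowball stemmer.
final class IdentityStemmer implements Stemmer {
    @Override
    public String stem(String word) {
        return word;
    }
}

// The consumer never needs to know which kind of stemmer it was given.
final class StemmingPipeline {
    private final Stemmer stemmer;

    StemmingPipeline(Stemmer stemmer) {
        this.stemmer = stemmer;
    }

    List<String> process(List<String> tokens) {
        List<String> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            out.add(stemmer.stem(token));
        }
        return out;
    }
}
```

either way the point is the same: the code that consumes stems should not
care what produced them.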


> BTW FYI - our legal department forbade us to use the Hungarian stemmer
> because of licensing/legal issues. Besides the stemmers that were originally
> provided, the Snowball project accepted additional ones like the Hungarian
> stemmer. However, for that one we weren't able to get a confirmation from
> the contributor that his University indeed gave him permission to contribute
> the code. Don't know if it matters to anyone here (I've notified Martin Porter
> as well), but FYI. Our legal department does not permit us to use it (which
> is not surprising - they are legal ...). I don't want to derail this issue
> into Snowball discussion, so if you want to talk about it, I suggest we move
> it to the list.


this is concerning to me. i mean, the thing is sitting right there on the
university's website: http://ilps.science.uva.nl/resources/snowball-hun :)
but if apache is concerned about this situation too, someone let me know and
i can take savoy's stemmer (clearly marked BSD) so we can add that instead and
remove the ambiguous snowball one, even if it's only temporary:
http://members.unine.ch/jacques.savoy/clef/index.html



-- 
Robert Muir
rcmuir@gmail.com
