lucene-java-user mailing list archives

From "Jan Tosovsky"
Subject A new Snowball stemmer
Date Sun, 01 Oct 2017 20:53:51 GMT
Dear All,

I'd like to integrate a new Snowball stemmer [1] into Lucene for my
experiments, but I can see some incompatibilities between the original
Snowball stemmers (produced by the Snowball compiler) and Lucene's current
Snowball stemmers [2].

* different constructor of the Among class: new Among("ce", -1, 1) vs.
  new Among("ce", -1, -1, "", methodObject)
* the find_among_b() method accepts only two parameters
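
As a rough illustration of the first difference above, here is a minimal sketch of how the three-argument constructor calls could be rewritten textually into the five-argument form. AmongCallRewriter is a hypothetical helper, not part of Lucene or Snowball. The empty method name and the methodObject identifier are simply copied from the five-argument call quoted above. Whether such a purely textual rewrite is enough to get a working Lucene stemmer is an assumption, and the find_among_b() difference is not handled here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Lucene or Snowball. */
final class AmongCallRewriter {

    // Matches the three-argument form: new Among("text", <int>, <int>)
    // Group 1 = quoted string literal, groups 2 and 3 = the two integers.
    private static final Pattern THREE_ARG = Pattern.compile(
        "new\\s+Among\\s*\\(\\s*(\"(?:[^\"\\\\]|\\\\.)*\")\\s*,\\s*(-?\\d+)\\s*,\\s*(-?\\d+)\\s*\\)");

    /** Appends the two extra arguments ("" and methodObject) to every three-argument call. */
    static String rewrite(String source) {
        Matcher m = THREE_ARG.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "new Among(" + m.group(1) + ", " + m.group(2)
                + ", " + m.group(3) + ", \"\", methodObject)";
            // quoteReplacement keeps any '$' or '\' in the literal from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: new Among("ce", -1, 1, "", methodObject)
        System.out.println(rewrite("new Among(\"ce\", -1, 1)"));
    }
}
```

Calls that already have five arguments are left untouched, because the pattern requires the closing parenthesis directly after the third argument.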

What is the procedure for producing Lucene-compatible stemmers from an SBL
file? Is there any automation, or should I modify the original compiled file?


[1] It is actually a Czech stemmer, see, even though the original
author has stated in LUCENE-3883: "I wouldn't recommend the aggressive mode,
and I regret that I left it uncommented. If you really think an alternative
would be welcome, it would be quite easy to get the best of both (in fact, I
spent roughly half the time on that trying to beat Snowball into
overstemming to match the original)."

[2] Lucene stemmers can be found here:
