lucene-java-user mailing list archives

From "Jan Tosovsky" <j.tosov...@email.cz>
Subject A new Snowball stemmer
Date Sun, 01 Oct 2017 20:53:51 GMT
Dear All,

I'd like to integrate a new Snowball stemmer [1] into Lucene for my
experiments, but I see some incompatibilities between the original Snowball
stemmers (as generated by the Snowball compiler) and the Snowball stemmers
currently shipped with Lucene [2].

In particular:
* the Among class has a different constructor: new Among("ce", -1, 1) vs.
  new Among("ce", -1, -1, "", methodObject)
* the find_among_b() method accepts only two parameters
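
To make the first difference concrete, here is a minimal sketch of how the manual edit might be automated. It rewrites three-argument Among constructor calls in generated source text into the five-argument form quoted above. The class name AmongCallRewriter and the choice to append "" and methodObject unchanged to every entry are assumptions for illustration only. This is not an existing Lucene or Snowball tool, and it does not address the find_among_b() difference.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or Snowball.
final class AmongCallRewriter {

    // Matches only the three-argument form: new Among("text", int, int)
    private static final Pattern THREE_ARG = Pattern.compile(
        "new\\s+Among\\s*\\(\\s*(\"(?:[^\"\\\\]|\\\\.)*\")\\s*,\\s*(-?\\d+)\\s*,\\s*(-?\\d+)\\s*\\)");

    // Returns the source with every three-argument call extended by
    // an empty method name and a methodObject reference (assumed values).
    static String rewrite(String source) {
        Matcher m = THREE_ARG.matcher(source);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = "new Among(" + m.group(1) + ", " + m.group(2)
                + ", " + m.group(3) + ", \"\", methodObject)";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: new Among("ce", -1, 1, "", methodObject)
        System.out.println(rewrite("new Among(\"ce\", -1, 1)"));
    }
}
```

Calls that already use the five-argument form are left untouched, because the pattern requires a closing parenthesis right after the third argument.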

What is the procedure for producing Lucene-compatible stemmers from an SBL
file? Is there any automation, or should I modify the file generated by the
Snowball compiler manually?

Thanks,
Jan

_________
[1] It is actually a Czech stemmer, see
https://issues.apache.org/jira/browse/LUCENE-4042, even though the original
author stated in LUCENE-3883: "I wouldn't recommend the aggressive mode,
and I regret that I left it uncommented. If you really think an alternative
would be welcome, it would be quite easy to get the best of both (in fact, I
spent roughly half the time on that trying to beat Snowball into
overstemming to match the original)."

[2] Lucene stemmers can be found here:
https://github.com/apache/lucene-solr/tree/master/lucene/analysis/common/src
/java/org/tartarus/snowball/ext

