lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: A new Snowball stemmer
Date Sun, 01 Oct 2017 21:27:51 GMT
Hi,

There is an Ant task for patching the Snowball compiler output:

lucene/analysis/common/ $ ant patch-snowball

I am not 100% sure whether this still works with the latest Snowball compiler, but at the
time it was used to convert the files. You may need to use an older Snowball version so that
the regexes still match.
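
To illustrate the kind of rewrite such a regex-based patch performs, here is a minimal
sketch. It is not the actual Ant task and does not use its regexes: the class name
AmongPatchSketch, the method patch() and the pattern are hypothetical. It only turns the
three-argument Among constructor calls from the question below into the five-argument form,
and it assumes that "" and methodObject are the right extra arguments for every entry.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or of the Snowball compiler.
class AmongPatchSketch {

    // Matches a three-argument call such as: new Among("ce", -1, 1)
    // Group 1: the quoted string literal, groups 2 and 3: the two integers.
    private static final Pattern THREE_ARG_AMONG = Pattern.compile(
        "new\\s+Among\\s*\\(\\s*(\"(?:[^\"\\\\]|\\\\.)*\")\\s*,\\s*(-?\\d+)\\s*,\\s*(-?\\d+)\\s*\\)");

    // Appends the two extra arguments (assumed here to be "" and methodObject)
    // to every three-argument Among constructor call found in the source text.
    static String patch(String source) {
        Matcher m = THREE_ARG_AMONG.matcher(source);
        return m.replaceAll("new Among($1, $2, $3, \"\", methodObject)");
    }

    public static void main(String[] args) {
        // prints: new Among("ce", -1, 1, "", methodObject)
        System.out.println(patch("new Among(\"ce\", -1, 1)"));
    }
}
```

Calls that already have five arguments are left untouched, because the pattern requires the
closing parenthesis right after the third argument. The find_among_b() difference is not
covered by this sketch.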

Uwe

-----
Uwe Schindler
Achterdiek 19, D-28357 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

> -----Original Message-----
> From: Jan Tosovsky [mailto:j.tosovsky@email.cz]
> Sent: Sunday, October 1, 2017 10:54 PM
> To: java-user@lucene.apache.org
> Subject: A new Snowball stemmer
> 
> Dear All,
> 
> I'd like to integrate a new Snowball stemmer [1] into Lucene for my
> experiments, but I can see some incompatibilities between the original Snowball
> stemmers (produced by the Snowball compiler) and Lucene's current Snowball
> stemmers [2].
> 
> Especially:
> * different constructor of Among class: new Among("ce", -1, 1) vs.
>   new Among("ce", -1, -1, "", methodObject)
> * the find_among_b() method accepts only two params
> 
> What is the procedure for producing Lucene-compatible stemmers from an SBL
> file? Is there any automation, or should I modify the original compiled file
> manually?
> 
> Thanks,
> Jan
> 
> _________
> [1] It is actually a Czech stemmer, see
> https://issues.apache.org/jira/browse/LUCENE-4042, even though the original
> author has stated in LUCENE-3883: "I wouldn't recommend the aggressive mode,
> and I regret that I left it uncommented. If you really think an alternative
> would be welcome, it would be quite easy to get the best of both (in fact, I
> spent roughly half the time on that trying to beat Snowball into
> overstemming to match the original)."
> 
> [2] Lucene stemmers can be found here:
> https://github.com/apache/lucene-solr/tree/master/lucene/analysis/common/src/java/org/tartarus/snowball/ext
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org