lucene-dev mailing list archives

From Michał Dybizbański (JIRA) <>
Subject [jira] [Updated] (LUCENE-2341) explore morfologik integration
Date Mon, 27 Jun 2011 20:19:47 GMT


Michał Dybizbański updated LUCENE-2341:

    Attachment: morfologik-polish-1.5.2.jar

Dawid, as you suggested, I've changed the interface of MorfologikAnalyzer and MorfologikFilter
to account for the changes in Morfologik 1.5.2, namely the multiple dictionaries.
The constructors of both classes now accept a PolishStemmer.DICTIONARY (instead of the languageCode
String used in the previous patch). MorfologikFilter instantiates a PolishStemmer object,
so each invocation of MorfologikAnalyzer.createComponents (which instantiates MorfologikFilter)
gets its own PolishStemmer instance.
This way, sharing a MorfologikAnalyzer between threads is safe (even though MorfologikFilter
itself isn't thread-safe), provided each thread obtains its own TokenStreamComponents through
ReusableAnalyzerBase.createComponents. (Is this always the case? Looking at other filters,
they don't look thread-safe either.)
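
To make the thread-safety argument concrete, here is a minimal, Lucene-free sketch of the pattern I have in mind. It is not the patch code: ToyStemmer, ToyComponents and SharedAnalyzer are hypothetical stand-ins for PolishStemmer, TokenStreamComponents and MorfologikAnalyzer, and the per-thread caching via ThreadLocal is my assumption about how the analyzer base class hands out components.

```java
import java.util.Locale;
import java.util.concurrent.atomic.AtomicInteger;

public class PerThreadComponentsSketch {

    /** Hypothetical stand-in for a stateful, non-thread-safe stemmer. */
    static final class ToyStemmer {
        // Mutable scratch buffer: this is what makes sharing across threads unsafe.
        private final StringBuilder scratch = new StringBuilder();

        String stem(String word) {
            scratch.setLength(0);
            scratch.append(word.toLowerCase(Locale.ROOT));
            int last = scratch.length() - 1;
            // Toy rule for illustration only: strip one trailing 's'.
            if (scratch.length() > 3 && scratch.charAt(last) == 's') {
                scratch.setLength(last);
            }
            return scratch.toString();
        }
    }

    /** Hypothetical stand-in for the per-stream components; owns one stemmer. */
    static final class ToyComponents {
        final ToyStemmer stemmer = new ToyStemmer();
    }

    /** Shared analyzer: holds no stemmer itself, only a per-thread cache. */
    static final class SharedAnalyzer {
        private final AtomicInteger created = new AtomicInteger();
        private final ThreadLocal<ToyComponents> perThread =
                ThreadLocal.withInitial(this::createComponents);

        // Each call builds fresh components, hence a fresh stemmer.
        ToyComponents createComponents() {
            created.incrementAndGet();
            return new ToyComponents();
        }

        String analyze(String word) {
            return perThread.get().stemmer.stem(word);
        }

        int componentsCreated() {
            return created.get();
        }

        // Helper: run analyze on a separate thread and wait for the result.
        String analyzeOnNewThread(String word) {
            String[] out = new String[1];
            Thread t = new Thread(() -> out[0] = analyze(word));
            t.start();
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            return out[0];
        }
    }

    public static void main(String[] args) {
        SharedAnalyzer analyzer = new SharedAnalyzer();
        System.out.println(analyzer.analyze("Cats"));
        System.out.println(analyzer.analyzeOnNewThread("Dogs"));
        System.out.println(analyzer.componentsCreated());
    }
}
```

The analyzer can be shared because the only mutable state lives in the components, and every thread that touches the analyzer ends up with its own. If a caller bypassed the per-thread cache and passed one components object between threads, the guarantee would no longer hold.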

> explore morfologik integration
> ------------------------------
>                 Key: LUCENE-2341
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: Robert Muir
>            Assignee: Dawid Weiss
>         Attachments: LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, morfologik-fsa-1.5.2.jar,
> morfologik-polish-1.5.2.jar, morfologik-stemming-1.5.0.jar, morfologik-stemming-1.5.2.jar
> Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer available:
> This works differently than LUCENE-2298, and ideally would be another option for users.

