lucenenet-user mailing list archives

From Shubhanshu Pathak <shubhanshupatha...@gmail.com>
Subject Using French Analyzer
Date Thu, 05 Mar 2015 18:36:46 GMT
Dear Group Members,

I am using Lucene.Net 3.0.3

In one of my projects I have to do language based analysis.

When I tried to use the existing analyzer for the French language,
FrenchAnalyzer, I found that it internally uses FrenchStemFilter.
The documentation for this filter effectively says "Don't use me":

        This stemmer does not implement the Snowball algorithm correctly,
        especially involving case problems.
        It is recommended that you consider using the "French" stemmer in the
        snowball package instead.
        This stemmer will likely be deprecated in a future release.

This means I should not use this Analyzer.

Then I tried SnowballAnalyzer, which provides a way to do linguistic
analysis through:

Analyzer analyzer = new SnowballAnalyzer(Version.LUCENE_30, "French");

Now when I look at the code of the SnowballAnalyzer -

In its constructor it invokes the method

SetOverridesTokenStreamMethod<SnowballAnalyzer>();

of the base class Analyzer.
This method is already marked as obsolete.

[Obsolete("This is only present to preserve back-compat of classes that subclass a core analyzer and override tokenStream but not reusableTokenStream ")]
protected internal virtual void SetOverridesTokenStreamMethod<TClass>()

This means we cannot rely on SnowballAnalyzer in the long run either.


Could you please let me know how to achieve language-specific analysis in
this case, other than by building our own Analyzer?

Thanks & Regards,
Shubhanshu
