lucene-java-user mailing list archives

From Steve Rowe <sar...@gmail.com>
Subject Re: Looking for example code: Tokenizer + Analyzer for Russian stemming
Date Wed, 19 Dec 2012 14:22:01 GMT
Hi Dima,

The example code you mentioned in your other recent email is pretty close.

The only thing you'd probably want to add is access to the CharTermAttribute:

    CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);

Then, in the loop over ts.incrementToken(), you can read each output token from termAtt.buffer()
(only the first termAtt.length() chars are valid), or, if you're going to turn the tokens into
Strings anyway, just call termAtt.toString().
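
For reference, here is a minimal sketch of the whole thing put together. It assumes Lucene 4.0 with the lucene-analyzers-common jar on the classpath (RussianAnalyzer lives there); the class name RussianStemDemo, the helper stem(), the field name "dummy" and the sample text are made up for illustration. Later Lucene versions changed the RussianAnalyzer constructor, so adjust if you are not on 4.x.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.ru.RussianAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.util.Version;

// Hypothetical demo class: runs text through an Analyzer and collects the output terms.
public class RussianStemDemo {

    // Returns the analyzed (lowercased, stop-filtered, stemmed) tokens for the given text.
    static List<String> stem(Analyzer analyzer, String text) throws IOException {
        List<String> tokens = new ArrayList<String>();
        // The field name is arbitrary here; RussianAnalyzer does not vary by field.
        TokenStream ts = analyzer.tokenStream("dummy", new StringReader(text));
        try {
            // Register the attribute before consuming the stream.
            CharTermAttribute termAtt = ts.addAttribute(CharTermAttribute.class);
            ts.reset();                          // required before the first incrementToken()
            while (ts.incrementToken()) {
                tokens.add(termAtt.toString());  // copy the term out; the buffer is reused
            }
            ts.end();                            // finalize end-of-stream state
        } finally {
            ts.close();                          // release resources so the analyzer can be reused
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Lucene 4.0, where the constructor takes a Version argument.
        Analyzer analyzer = new RussianAnalyzer(Version.LUCENE_40);
        System.out.println(stem(analyzer, "Я читаю интересные книги"));
    }
}
```

Note that RussianAnalyzer also lowercases and removes stop words before stemming, so the list you get back can be shorter than the number of words in the input.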

Steve

On Dec 18, 2012, at 1:16 PM, dokondr <dokondr@gmail.com> wrote:

> Hello,
> I am looking for an example of using Tokenizer + Analyzer (in particular
> org.apache.lucene.analysis.ru.RussianAnalyzer) for standalone stemming.
> Can't find such an example here:
> http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/analysis/package-summary.html?is-external=true#package_description
> 
> Thanks!



