lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Tokenizer and Filter Factory to index Chinese characters
Date Thu, 25 Jun 2015 09:17:08 GMT
Hello - you can use HMMChineseTokenizerFactory instead.
http://lucene.apache.org/core/5_2_0/analyzers-smartcn/org/apache/lucene/analysis/cn/smart/HMMChineseTokenizerFactory.html
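A schema sketch of the replacement (untested; assumes the lucene-analyzers-smartcn jar from the analysis-extras contrib is on the classpath). HMMChineseTokenizer performs word segmentation itself, so the separate sentence tokenizer + word token filter pair collapses into a single tokenizer:

```xml
<!-- Sketch: HMMChineseTokenizerFactory replaces both deprecated smartcn factories -->
<fieldType name="text_smartcn" class="solr.TextField" positionIncrementGap="0">
  <analyzer>
    <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Since the same analysis applies at index and query time, one unified <analyzer> element is enough here.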

-----Original message-----
> From:Zheng Lin Edwin Yeo <edwinyeozl@gmail.com>
> Sent: Thursday 25th June 2015 11:02
> To: solr-user@lucene.apache.org
> Subject: Tokenizer and Filter Factory to index Chinese characters
> 
> Hi,
> 
> Does anyone know the correct replacement for these two tokenizer and
> filter factories to index Chinese into Solr?
> - SmartChineseSentenceTokenizerFactory
> - SmartChineseWordTokenFilterFactory
> 
> I understand that these two factories are already deprecated in Solr 5.1,
> but I can't seem to find the correct replacement.
> 
> 
> <fieldType name="text_smartcn" class="solr.TextField" positionIncrementGap="0">
>   <analyzer type="index">
>     <tokenizer class="org.apache.lucene.analysis.cn.smart.SmartChineseSentenceTokenizerFactory"/>
>     <filter class="org.apache.lucene.analysis.cn.smart.SmartChineseWordTokenFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.apache.lucene.analysis.cn.smart.SmartChineseSentenceTokenizerFactory"/>
>     <filter class="org.apache.lucene.analysis.cn.smart.SmartChineseWordTokenFilterFactory"/>
>   </analyzer>
> </fieldType>
> 
> Thank you.
> 
> 
> Regards,
> Edwin
> 
