lucene-solr-user mailing list archives

From Alexandre Rafalovitch <arafa...@gmail.com>
Subject Re: Preferred Schema/Config for Chinese Language Cores?
Date Fri, 05 Dec 2014 03:14:46 GMT
I have a couple of links that may be useful, though I have not tried
Chinese indexing myself:
http://discovery-grindstone.blogspot.ca/ (12 articles on CJK!)
http://java.dzone.com/articles/indexing-chinese-solr

Also, it may be worth checking out the commercial offering from
http://www.basistech.com/ - one of the big issues with Chinese is that
tokenization rules are mostly dictionary-based, and commercial
dictionaries could be significantly better than free ones :-)
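
To make the dictionary point concrete, here is a toy sketch of greedy
forward maximum matching, one simple dictionary-based segmentation
approach. It is not what Solr or any particular product does internally;
the function name max_match, the max_len parameter, and both word lists
are made up for illustration only.

```python
def max_match(text, dictionary, max_len=4):
    """Greedy forward maximum matching over a set of known words."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Two hypothetical dictionaries: the larger one knows more compound words.
small_dict = {"北京", "大学"}
large_dict = small_dict | {"北京大学", "学生"}

sentence = "北京大学学生"
tokens_small = max_match(sentence, small_dict)
tokens_large = max_match(sentence, large_dict)

print(tokens_small)
print(tokens_large)
```

With the smaller word list the last two characters fall back to
single-character tokens, while the larger list keeps them together as
one word, so the same text is indexed quite differently depending on
dictionary coverage.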

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 4 December 2014 at 22:07, Tom Zimmermann <zimm.tom.j@gmail.com> wrote:
> Hi,
>
> We are setting up our first Chinese language index and our team has found
> multiple conflicting bits of information regarding the proper configuration
> for tokenizing, filtering, etc. Does anyone out there have a good
> functioning example we could work from, or some links with guidance?
>
> Thanks,
> Tom
