lucene-solr-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] Created: (SOLR-1336) Add support for lucene's SmartChineseAnalyzer
Date Wed, 05 Aug 2009 10:37:14 GMT
Add support for lucene's SmartChineseAnalyzer
---------------------------------------------

                 Key: SOLR-1336
                 URL: https://issues.apache.org/jira/browse/SOLR-1336
             Project: Solr
          Issue Type: New Feature
          Components: Analysis
            Reporter: Robert Muir


SmartChineseAnalyzer was contributed to Lucene; it indexes Simplified Chinese text as words.

If factories for its tokenizer and word token filter are added to Solr, it can be used there.
There should also be a sample config or wiki entry showing how to apply the built-in stopwords
list. This matters because the list does not contain actual stopwords; it must be applied to
prevent punctuation from being indexed.
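
As a rough illustration of the kind of sample config meant here, a schema.xml field type
might look something like the sketch below. Everything in it is an assumption: the factory
class names (solr.SmartChineseSentenceTokenizerFactory, solr.SmartChineseWordTokenFilterFactory)
are what such factories could be called once added, the field type name text_zh is made up,
and smartcn_stopwords.txt is a hypothetical copy of the analyzer's built-in list placed in the
Solr conf directory.

```xml
<!-- Sketch only: factory names, field type name, and stopwords file are assumed. -->
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the text into sentences, then segment each sentence into words. -->
    <tokenizer class="solr.SmartChineseSentenceTokenizerFactory"/>
    <filter class="solr.SmartChineseWordTokenFilterFactory"/>
    <!-- The "stopwords" list holds punctuation, so this keeps punctuation out of the index. -->
    <filter class="solr.StopFilterFactory" words="smartcn_stopwords.txt"/>
  </analyzer>
</fieldType>
```

The stop filter comes last so that punctuation tokens produced by word segmentation are
dropped before indexing.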

Note: we did some refactoring/cleanup on this analyzer recently, so it would be much easier
to do this after the next Lucene update.
It has also been moved out of -analyzers.jar due to size and now builds in its own smartcn
jar file, so that jar would need to be added if this feature is desired.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

