lucene-dev mailing list archives

From "SooMyung Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4956) the korean analyzer that has a korean morphological analyzer and dictionaries
Date Fri, 10 May 2013 01:08:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13653453#comment-13653453 ]

SooMyung Lee commented on LUCENE-4956:
--------------------------------------

Hi Christian,
Thanks for your great work.

I'd like to ask you to modify the text_kr field type definition in schema.xml as follows:
{noformat}
    <fieldType name="text_kr" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.KoreanTokenizerFactory"/>
        <filter class="solr.KoreanFilterFactory" hasOrigin="true" hasCNoun="true" bigrammable="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_kr.txt"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.KoreanTokenizerFactory"/>
        <filter class="solr.KoreanFilterFactory" hasOrigin="false" hasCNoun="false" bigrammable="false"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_kr.txt"/>
      </analyzer>      
    </fieldType>
{noformat}
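
For context, a field that uses this type might be declared as in the sketch below. The field name content_kr and the indexed/stored settings are placeholders assumed for illustration; they are not part of the patch.
{noformat}
    <!-- hypothetical field declaration; the name and attribute values are illustrative only -->
    <field name="content_kr" type="text_kr" indexed="true" stored="true"/>
{noformat}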
                
> the korean analyzer that has a korean morphological analyzer and dictionaries
> -----------------------------------------------------------------------------
>
>                 Key: LUCENE-4956
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4956
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>    Affects Versions: 4.2
>            Reporter: SooMyung Lee
>            Assignee: Christian Moen
>              Labels: newbie
>         Attachments: kr.analyzer.4x.tar
>
>
> The Korean language has specific characteristics. When developing a search service with Lucene
> and Solr in Korean, there are some problems in searching and indexing. The Korean analyzer
> solves these problems with a Korean morphological analyzer. It consists of a Korean morphological
> analyzer, dictionaries, a Korean tokenizer and a Korean filter. The Korean analyzer is made
> for Lucene and Solr. If you develop a search service with Lucene in Korean, it is best
> to choose the Korean analyzer.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira