lucene-dev mailing list archives

From "Christian Moen (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3915) Add Japanese filter to replace term attribute with readings
Date Sun, 25 Mar 2012 10:59:27 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237826#comment-13237826 ]

Christian Moen commented on LUCENE-3915:
----------------------------------------

Thanks, Robert.

I think it would be useful to extend this filter with an option that controls how the
reading is used.  I see two primary cases for this:

# Use the reading as the term attribute
# Use the reading as a synonym

The latter option can be useful for certain applications where we'd like to be able to search
by reading and get kanji matches.
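
To make the two cases concrete, here is a minimal, library-free sketch. It is not the
attached patch and it does not use the Lucene API; the names ReadingModeSketch, Mode, Token
and apply are hypothetical and exist only for illustration. It assumes the usual Lucene
convention that a synonym is a token stacked on the original with a position increment of 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from Lucene or the patch.
class ReadingModeSketch {

    // The two cases discussed above.
    enum Mode { REPLACE, SYNONYM }

    // Simplified token: term text plus its position increment
    // relative to the previously emitted token.
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    // Returns the tokens emitted for one input token, given its reading.
    static List<Token> apply(String surface, String reading, Mode mode) {
        List<Token> out = new ArrayList<>();
        if (reading == null) {
            // No reading available: pass the surface form through unchanged.
            out.add(new Token(surface, 1));
        } else if (mode == Mode.REPLACE) {
            // Case 1: the reading becomes the term.
            out.add(new Token(reading, 1));
        } else {
            // Case 2: keep the surface form and stack the reading at the
            // same position (increment 0), so either one matches.
            out.add(new Token(surface, 1));
            out.add(new Token(reading, 0));
        }
        return out;
    }

    public static void main(String[] args) {
        String surface = "\u5BFF\u53F8"; // "sushi" written in kanji
        String reading = "\u30B9\u30B7"; // "sushi" written in katakana
        System.out.println(apply(surface, reading, Mode.REPLACE));
        System.out.println(apply(surface, reading, Mode.SYNONYM));
    }
}
```

In the synonym case a query by reading and a query by kanji both hit the same position,
which is what makes "search by reading, get kanji matches" work.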

Expanding further on this scenario, we would then probably want to support readings in several
scripts:

* Romaji (Hepburn)
* Hiragana
* Katakana
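
As a rough sketch of the simplest of these conversions, the code below maps a katakana
reading to hiragana. It assumes the reading arrives in katakana, which is an assumption
about the dictionary data rather than something stated in this issue. The class and method
names are hypothetical. Romaji output is not covered, since Hepburn romanization needs
rules for long vowels, small kana and the moraic "n" that a per-character mapping cannot
express.

```java
// Hypothetical illustration only; not part of Lucene or the attached patch.
class ReadingScriptSketch {

    // Converts katakana to hiragana by shifting code points.
    // Katakana U+30A1..U+30F6 line up with hiragana U+3041..U+3096,
    // exactly 0x60 apart. Everything else, including the prolonged
    // sound mark U+30FC, is left unchanged.
    static String katakanaToHiragana(String reading) {
        StringBuilder sb = new StringBuilder(reading.length());
        for (int i = 0; i < reading.length(); i++) {
            char c = reading.charAt(i);
            if (c >= '\u30A1' && c <= '\u30F6') {
                sb.append((char) (c - 0x60));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "sushi" in katakana, printed as hiragana
        System.out.println(katakanaToHiragana("\u30B9\u30B7"));
    }
}
```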

This filter should be optional and available as a tool to support these applications -- and
of course the ongoing Japanese spell-check work.

Thoughts?
                
> Add Japanese filter to replace term attribute with readings
> -----------------------------------------------------------
>
>                 Key: LUCENE-3915
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3915
>             Project: Lucene - Java
>          Issue Type: New Feature
>            Reporter: Christian Moen
>            Priority: Minor
>         Attachments: LUCENE-3915.patch, LUCENE-3915.patch
>
>
> Koji and Robert are working on LUCENE-3888, which allows spell-checkers to do their similarity
> matching using a form of the word other than its surface form.
> This approach is very useful for languages such as Japanese, where the surface form and
> the form we'd like to use for similarity matching are very different.  For Japanese, it's useful
> to use readings for this -- probably with some normalization.
