Date: Tue, 20 Mar 2012 22:31:39 +0000 (UTC)
From: "Koji Sekiguchi (Assigned) (JIRA)"
To: dev@lucene.apache.org
Reply-To: dev@lucene.apache.org
Message-ID: <2094624596.39052.1332282699267.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <156597766.35525.1332227384272.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Assigned] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary
Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm

[ 
https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Sekiguchi reassigned LUCENE-3888:
--------------------------------------

    Assignee: Koji Sekiguchi

> split off the spell check word and surface form in spell check dictionary
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-3888
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3888
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/spellchecker
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch
>
>
> The "did you mean?" feature built on Lucene's spell checker does not work well for Japanese, and this is a longstanding problem. The logic needs comparatively long text to check spelling, but in some languages (e.g. Japanese) most words are too short for the spell checker to use.
> I think that, at least for Japanese, things can be improved if we split the spell check word from the surface form in the spell check dictionary. Then we could, for example, use ReadingAttribute for spell checking and CharTermAttribute for suggesting.
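
The idea in the description (check spelling against one form of a word, suggest another) can be illustrated with a minimal, Lucene-free sketch. Everything here is an assumption made for illustration and is not taken from the attached patches: the class `ReadingSpellDictionary`, its `add` and `suggest` methods, and the use of plain Levenshtein distance in place of Lucene's own matching logic are all hypothetical. Readings are romanized for readability; a real analyzer would supply them through ReadingAttribute, typically in katakana.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical dictionary that stores the spell-check key apart from the surface form. */
class ReadingSpellDictionary {

    /** One entry: the key used for comparison and the text shown to the user. */
    private static final class Entry {
        final String checkKey; // e.g. a reading, longer than the surface form
        final String surface;  // e.g. the original term text

        Entry(String checkKey, String surface) {
            this.checkKey = checkKey;
            this.surface = surface;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void add(String checkKey, String surface) {
        entries.add(new Entry(checkKey, surface));
    }

    /** Matches on the check key, but returns surface forms, closest match first. */
    List<String> suggest(String queryKey, int maxEdits) {
        List<Entry> hits = new ArrayList<>();
        for (Entry e : entries) {
            if (editDistance(queryKey, e.checkKey) <= maxEdits) {
                hits.add(e);
            }
        }
        hits.sort(Comparator.comparingInt((Entry e) -> editDistance(queryKey, e.checkKey)));
        List<String> surfaces = new ArrayList<>();
        for (Entry e : hits) {
            surfaces.add(e.surface);
        }
        return surfaces;
    }

    /** Plain Levenshtein distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        ReadingSpellDictionary dict = new ReadingSpellDictionary();
        dict.add("toukyou", "\u6771\u4EAC"); // two-character surface form for Tokyo
        dict.add("kyouto", "\u4EAC\u90FD");  // two-character surface form for Kyoto
        dict.add("oosaka", "\u5927\u962A");  // two-character surface form for Osaka

        // The misspelled reading is compared with the stored readings,
        // while the suggestion handed back is the surface form.
        System.out.println(dict.suggest("tokyou", 1));
    }
}
```

The point of the split is visible in the data: each surface form is only two characters long, which leaves little to compare, while the reading gives the matcher six or seven characters to work with.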