From: Erik Hatcher
Subject: Re: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
Date: Wed, 5 Oct 2005 06:35:54 -0400
To: java-dev@lucene.apache.org

Never mind.... I see Otis beat me to it.

    Erik

On Oct 4, 2005, at 10:38 PM, Youngho Cho wrote:

> Hello,
>
> Is there any plan to add this patch into lucene core ?
> I am using CJKAnalyzer but I hope to switch to the StandardAnalyzer.
>
> Thanks,
>
> Youngho
>
> ----- Original Message -----
> From: "Cheolgoo Kang (JIRA)"
> To:
> Sent: Tuesday, October 04, 2005 11:26 PM
> Subject: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
>
>
>> StandardTokenizer loses Korean characters
>> -----------------------------------------
>>
>> Key: LUCENE-444
>> URL: http://issues.apache.org/jira/browse/LUCENE-444
>> Project: Lucene - Java
>> Type: Bug
>> Components: Analysis
>> Reporter: Cheolgoo Kang
>> Priority: Minor
>>
>>
>> While using StandardAnalyzer, esp. StandardTokenizer, with a Korean
>> text stream, StandardTokenizer ignores the Korean characters. This
>> is because the definition of the CJK token in the StandardTokenizer.jj
>> JavaCC file doesn't have a range covering the Korean syllables
>> described in the Unicode character map.
>> This patch adds one line of 0xAC00~0xD7AF, the Korean syllables
>> range, to the StandardTokenizer.jj code.
>>
>> --
>> This message is automatically generated by JIRA.
>> -
>> If you think it was sent incorrectly contact one of the
>> administrators:
>> http://issues.apache.org/jira/secure/Administrators.jspa
>> -
>> For more information on JIRA, see:
>> http://www.atlassian.com/software/jira
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org