lucene-dev mailing list archives

From "Youngho Cho" <youn...@nannet.co.kr>
Subject Re: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
Date Wed, 05 Oct 2005 04:58:05 GMT
Great!


Thanks a lot Otis and Cheolgoo.

Youngho

----- Original Message ----- 
From: "Otis Gospodnetic" <otis_gospodnetic@yahoo.com>
To: <java-dev@lucene.apache.org>; "Youngho Cho" <youngho@nannet.co.kr>
Sent: Wednesday, October 05, 2005 12:55 PM
Subject: Re: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters


> Try the version from SVN, I just applied Cheolgoo's patch.
> 
> Otis
> 
> --- Youngho Cho <youngho@nannet.co.kr> wrote:
> 
> > Hello,
> > 
> > Is there any plan to add this patch to the Lucene core?
> > I am using CJKAnalyzer but I hope to switch to the StandardAnalyzer.
> > 
> > Thanks,
> > 
> > Youngho
> > 
> > ----- Original Message ----- 
> > From: "Cheolgoo Kang (JIRA)" <jira@apache.org>
> > To: <java-dev@lucene.apache.org>
> > Sent: Tuesday, October 04, 2005 11:26 PM
> > Subject: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
> > 
> > 
> > > StandardTokenizer loses Korean characters
> > > -----------------------------------------
> > > 
> > >          Key: LUCENE-444
> > >          URL: http://issues.apache.org/jira/browse/LUCENE-444
> > >      Project: Lucene - Java
> > >         Type: Bug
> > >   Components: Analysis  
> > >     Reporter: Cheolgoo Kang
> > >     Priority: Minor
> > > 
> > > 
> > > When using StandardAnalyzer, especially StandardTokenizer, on a Korean
> > > text stream, StandardTokenizer ignores the Korean characters. This is
> > > because the CJK token definition in the StandardTokenizer.jj JavaCC
> > > file does not cover the range of Korean syllables defined in the
> > > Unicode character map.
> > > This patch adds one line to StandardTokenizer.jj covering 0xAC00~0xD7AF,
> > > the Korean syllables range.
> > > 
> > > -- 
> > > This message is automatically generated by JIRA.
> > > -
> > > If you think it was sent incorrectly contact one of the administrators:
> > >    http://issues.apache.org/jira/secure/Administrators.jspa
> > > -
> > > For more information on JIRA, see:
> > >    http://www.atlassian.com/software/jira
> > > 
> > > 
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: java-dev-help@lucene.apache.org
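
For readers who want to see what the range quoted in the issue (0xAC00~0xD7AF) covers, here is a minimal Java sketch. It is not the patch itself and does not touch StandardTokenizer.jj. The class name HangulRangeCheck and both method names are made up for illustration. It only tests whether a char falls inside that range, which is the condition the issue says the CJK token definition was missing.

```java
public class HangulRangeCheck {
    // Bounds taken from the range quoted in LUCENE-444 (0xAC00~0xD7AF).
    // The names below are hypothetical and exist only for this sketch.
    static final char HANGUL_START = (char) 0xAC00;
    static final char HANGUL_END = (char) 0xD7AF;

    /** Returns true if c lies inside the range named in the issue. */
    public static boolean isHangulSyllable(char c) {
        return c >= HANGUL_START && c <= HANGUL_END;
    }

    /** Counts how many chars of text fall inside that range. */
    public static int countHangulSyllables(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isHangulSyllable(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Three Hangul syllables followed by a space and ASCII letters.
        String sample = "\uD55C\uAD6D\uC5B4 text";
        System.out.println(countHangulSyllables(sample)); // prints 3
    }
}
```

A tokenizer whose CJK definition stops short of this range would treat those three characters as non-token input, which matches the symptom described in the issue.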