lucene-dev mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
Date Wed, 05 Oct 2005 03:55:11 GMT
Try the version from SVN, I just applied Cheolgoo's patch.

Otis

--- Youngho Cho <youngho@nannet.co.kr> wrote:

> Hello,
> 
> Is there any plan to add this patch to the Lucene core?
> I am using CJKAnalyzer, but I hope to switch to StandardAnalyzer.
> 
> Thanks,
> 
> Youngho
> 
> ----- Original Message ----- 
> From: "Cheolgoo Kang (JIRA)" <jira@apache.org>
> To: <java-dev@lucene.apache.org>
> Sent: Tuesday, October 04, 2005 11:26 PM
> Subject: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean
> characters
> 
> 
> > StandardTokenizer loses Korean characters
> > -----------------------------------------
> > 
> >          Key: LUCENE-444
> >          URL: http://issues.apache.org/jira/browse/LUCENE-444
> >      Project: Lucene - Java
> >         Type: Bug
> >   Components: Analysis  
> >     Reporter: Cheolgoo Kang
> >     Priority: Minor
> > 
> > 
> > While using StandardAnalyzer, esp. StandardTokenizer, with a Korean
> > text stream, StandardTokenizer ignores the Korean characters. This is
> > because the definition of the CJK token in the StandardTokenizer.jj
> > JavaCC file does not include a range covering the Korean syllables
> > described in the Unicode character map.
> > This patch adds one line, 0xAC00~0xD7AF (the Korean syllables
> > range), to the StandardTokenizer.jj code.
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-dev-help@lucene.apache.org
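
For readers who want to see what the range in the quoted report covers, here is a minimal Java sketch. It is not the StandardTokenizer.jj grammar and not Cheolgoo's patch; the class name HangulRangeSketch and both method names are hypothetical and exist only for illustration. The sketch assumes the 0xAC00~0xD7AF range quoted above and collects runs of characters that fall inside it, which is roughly the text a tokenizer without that range would drop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of Lucene.
public class HangulRangeSketch {
    // Range taken from the quoted report (0xAC00~0xD7AF).
    static final char HANGUL_START = '\uAC00';
    static final char HANGUL_END = '\uD7AF';

    // True if the character lies inside the Korean syllables range.
    static boolean isHangulSyllable(char c) {
        return c >= HANGUL_START && c <= HANGUL_END;
    }

    // Collects maximal runs of consecutive characters inside the range.
    static List<String> hangulRuns(String text) {
        List<String> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isHangulSyllable(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                runs.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            runs.add(current.toString());
        }
        return runs;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        String sample = "Lucene \uD55C\uAD6D\uC5B4 \uAC80\uC0C9";
        System.out.println(hangulRuns(sample));
    }
}
```

The explicit range check mirrors the numbers in the report; the JDK's Character.UnicodeBlock.HANGUL_SYLLABLES constant could be used instead if matching the Unicode block by name is preferred.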



