lucene-dev mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: [jira] Created: (LUCENE-444) StandardTokenizer loses Korean characters
Date Wed, 05 Oct 2005 10:35:54 GMT
Never mind.... I see Otis beat me to it.

     Erik


On Oct 4, 2005, at 10:38 PM, Youngho Cho wrote:

> Hello,
>
> Is there any plan to add this patch to the Lucene core?
> I am using CJKAnalyzer, but I hope to switch to StandardAnalyzer.
>
> Thanks,
>
> Youngho
>
> ----- Original Message -----
> From: "Cheolgoo Kang (JIRA)" <jira@apache.org>
> To: <java-dev@lucene.apache.org>
> Sent: Tuesday, October 04, 2005 11:26 PM
> Subject: [jira] Created: (LUCENE-444) StandardTokenizer loses  
> Korean characters
>
>
>
>> StandardTokenizer loses Korean characters
>> -----------------------------------------
>>
>>          Key: LUCENE-444
>>          URL: http://issues.apache.org/jira/browse/LUCENE-444
>>      Project: Lucene - Java
>>         Type: Bug
>>   Components: Analysis
>>     Reporter: Cheolgoo Kang
>>     Priority: Minor
>>
>>
>> When using StandardAnalyzer, and StandardTokenizer in particular, on a
>> Korean text stream, StandardTokenizer ignores the Korean characters. This
>> is because the definition of the CJK token in the StandardTokenizer.jj
>> JavaCC file does not include a range covering the Korean syllables
>> described in the Unicode character map.
>> This patch adds one line, 0xAC00~0xD7AF, the Korean syllables
>> range, to the StandardTokenizer.jj code.
>>
>> -- 
>> This message is automatically generated by JIRA.
>> -
>> If you think it was sent incorrectly contact one of the  
>> administrators:
>>    http://issues.apache.org/jira/secure/Administrators.jspa
>> -
>> For more information on JIRA, see:
>>    http://www.atlassian.com/software/jira
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-dev-help@lucene.apache.org
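
For readers who want to check which characters the range quoted in the
issue (0xAC00~0xD7AF) covers, here is a minimal sketch in plain Java. It
does not use Lucene or the StandardTokenizer.jj grammar at all; the class
and method names (HangulRangeSketch, inPatchedRange, countHangulSyllables)
are hypothetical and exist only for illustration. It only tests whether
each code point of a string falls inside that range.

```java
// Hypothetical helper, not part of Lucene: checks code points against the
// range named in the issue (0xAC00~0xD7AF, the Hangul Syllables block).
final class HangulRangeSketch {
    static final int HANGUL_START = 0xAC00;
    static final int HANGUL_END = 0xD7AF;

    // True if the code point lies inside the range quoted in the issue.
    static boolean inPatchedRange(int codePoint) {
        return codePoint >= HANGUL_START && codePoint <= HANGUL_END;
    }

    // Counts how many code points of the text fall inside the range.
    static int countHangulSyllables(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (inPatchedRange(cp)) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        // Two Korean syllables (U+D55C, U+AE00) followed by Latin text.
        String sample = "\uD55C\uAE00 Lucene";
        System.out.println(countHangulSyllables(sample));
    }
}
```

A check like this only shows which characters the added range would
cover; whether the tokenizer then emits them as tokens depends on the
grammar itself.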



