harmony-dev mailing list archives

From Richard Liang <richard.lian...@gmail.com>
Subject Re: [jira] Commented: (HARMONY-688) java.util.regex.Matcher does not support Unicode supplementary characters
Date Thu, 29 Jun 2006 08:40:26 GMT


Nikolay Kuznetsov (JIRA) wrote:
>     [ http://issues.apache.org/jira/browse/HARMONY-688?page=comments#action_12418290 ]
>
> Nikolay Kuznetsov commented on HARMONY-688:
> -------------------------------------------
>
> Yes, we do not support supplementary characters. The main reason is that such
> support breaks the quantifier optimizations over character classes of fixed
> length (we support length 1 :-)). Now I think I can support two different types
> of character classes: one for a fixed width of 1 (or 2), and a second for an
> unknown width (1 or 2; \\p{javaLowerCase}, for instance).
>
>   
Great! I'm eager to have this feature. Thanks a lot. ;-)
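
To make the width distinction above concrete, here is a minimal sketch in modern Java (not Harmony code, and not necessarily how Nikolay plans to implement it). The class name `ClassWidth`, the enum `Width`, and the helper `widthOf` are hypothetical names chosen for illustration. It classifies a character class by whether its members take one char, two chars, or either in UTF-16.

```java
import java.util.function.IntPredicate;

// Hypothetical illustration: how many UTF-16 chars can one match of a class consume?
class ClassWidth {
    enum Width { EMPTY, ONE, TWO, ONE_OR_TWO }

    // Scans every code point; charCount is 1 for the BMP and 2 for supplementary code points.
    static Width widthOf(IntPredicate characterClass) {
        boolean hasBmp = false;
        boolean hasSupplementary = false;
        for (int cp = 0; cp <= Character.MAX_CODE_POINT; cp++) {
            if (characterClass.test(cp)) {
                if (Character.charCount(cp) == 1) {
                    hasBmp = true;
                } else {
                    hasSupplementary = true;
                }
            }
        }
        if (hasBmp && hasSupplementary) return Width.ONE_OR_TWO;
        if (hasBmp) return Width.ONE;
        if (hasSupplementary) return Width.TWO;
        return Width.EMPTY;
    }

    public static void main(String[] args) {
        // [a-z]: every member fits in one char.
        System.out.println(widthOf(cp -> cp >= 'a' && cp <= 'z'));
        // A range entirely outside the BMP (the Deseret block): always two chars.
        System.out.println(widthOf(cp -> cp >= 0x10400 && cp <= 0x1044F));
        // \p{javaLowerCase}: contains both 'a' and U+10428, so the width is unknown.
        System.out.println(widthOf(Character::isLowerCase));
    }
}
```

A fixed-width class would let a quantifier step through the input by a constant number of chars, while the unknown-width case has to inspect each position for a surrogate pair.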
> BTW, am I right that, if we do not take Unicode normalization support into account,
> this problem affects only the behaviour of character classes and ranges?
Yes, I think so.
> In all other cases it is impossible to construct a pattern that will work incorrectly;
> if that is not so, could you please give me an example?
>   
I'm not sure. At least, I cannot give an example. ;-)
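
One way to probe the question is to run a few patterns that are neither bracketed classes nor ranges against a code-point-aware engine and compare. The sketch below uses only `java.util.regex` calls; the class name `SupplementaryProbes` and the helper `fullMatch` are hypothetical. Whether an engine that works char by char would disagree on these is an assumption to be checked, not something established in this thread.

```java
import java.util.regex.Pattern;

// Hypothetical probes: patterns without explicit character classes or ranges.
class SupplementaryProbes {
    // U+10428 written as a surrogate pair, as in the HARMONY-688 test case.
    static final String CP = "\uD801\uDC28";

    static boolean fullMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Dot: does it consume the whole code point or a single char?
        System.out.println(fullMatch(".", CP));
        // Quantifier after a literal supplementary character:
        // does '+' bind to the code point or only to the low surrogate?
        System.out.println(fullMatch(CP + "+", CP + CP));
        // Optional literal: if '?' bound only to the low surrogate,
        // the high surrogate would still be required and "ab" could not match.
        System.out.println(fullMatch("a" + CP + "?b", "ab"));
    }
}
```

An engine that treats the pattern and input as sequences of code points should report a match for all three probes.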
> Thanks.
>    Nik.
>
>   
>> java.util.regex.Matcher does not support Unicode supplementary characters
>> -------------------------------------------------------------------------
>>
>>          Key: HARMONY-688
>>          URL: http://issues.apache.org/jira/browse/HARMONY-688
>>      Project: Harmony
>>         Type: Bug
>>   Components: Classlib
>>     Reporter: Richard Liang
>> Hello Nikolay,
>> The following test case passes on the RI but fails on Harmony. Would you please
>> have a look at this issue? Thanks a lot.
>>     public void test_matcher() {
>>         Pattern p = Pattern.compile("\\p{javaLowerCase}");
>>         Matcher matcher = p.matcher("\uD801\uDC28");
>>         assertTrue(matcher.find());
>>     }
>> Best regards,
>> Richard
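
For readers who want to run the quoted test case outside a JUnit harness, here is a standalone sketch. The class name `Harmony688Check` and the helper `firstLowerCaseSpan` are hypothetical; everything else is standard `java.lang` and `java.util.regex`. It also shows why the input is interesting: the two chars decode to a single lower-case code point, while neither char is lower-case on its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone version of the HARMONY-688 test case.
class Harmony688Check {
    // Returns {start, end} of the first \p{javaLowerCase} match, or null if none.
    static int[] firstLowerCaseSpan(String input) {
        Matcher m = Pattern.compile("\\p{javaLowerCase}").matcher(input);
        return m.find() ? new int[] { m.start(), m.end() } : null;
    }

    public static void main(String[] args) {
        String s = "\uD801\uDC28";
        int cp = s.codePointAt(0);
        System.out.println(Integer.toHexString(cp));              // 10428
        System.out.println(Character.isLowerCase(cp));            // true
        System.out.println(Character.isLowerCase(s.charAt(0)));   // false (lone high surrogate)
        int[] span = firstLowerCaseSpan(s);
        // On an engine that supports supplementary characters the match covers both chars.
        System.out.println(span == null ? "no match" : span[0] + ".." + span[1]);
    }
}
```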

-- 
Richard Liang
China Software Development Lab, IBM 

