jakarta-oro-dev mailing list archives

From "Takashi Okamoto" <tokam...@rd.nttdata.co.jp>
Subject Re: bug with CP1252 characters and Perl5Matcher
Date Fri, 01 Dec 2000 01:49:38 GMT
>Special characters in the CP1252 character set (but not in ISO Latin-1)
>cause an ArrayIndexOutOfBoundsException deep within Perl5Matcher.  The
>problem occurs if I use "[^.]*\." as the pattern but not if I use
>"Test" as the pattern.  The characters are fancy forms of apostrophe
>and double-quotes (decimal 146, 147, and 148).  Use IE 5 to view the
>test files to see what they look like.  I am running Jakarta ORO 2.0,
>JDK 1.2.2, WinNT 4.0sp5.

Could you read the TODO file in Jakarta ORO 2.0.1?

It says,

o Make Perl5 character classes (e.g., [abcde...]) fully support Unicode
  input.  Currently character classes only match 8-bit characters.
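
To illustrate why those three characters fall outside an 8-bit character
class, here is a minimal sketch using only the JDK. It assumes the
"windows-1252" charset is available in the runtime. The 256-entry lookup
table and the names Cp1252Demo, decode and inClass are hypothetical
stand-ins for illustration, not ORO's actual internals.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class Cp1252Demo {
    // Decode a single CP1252 byte value (0..255) into a Java char.
    public static char decode(int b) {
        Charset cp1252 = Charset.forName("windows-1252");
        return new String(new byte[] {(byte) b}, cp1252).charAt(0);
    }

    // Hypothetical 8-bit character class: one flag per value 0..255.
    // Indexing with a char above 255 throws ArrayIndexOutOfBoundsException.
    public static boolean inClass(boolean[] table, char c) {
        return table[c];
    }

    public static void main(String[] args) {
        // Table for the negated class [^.]: everything except '.'.
        boolean[] notDot = new boolean[256];
        Arrays.fill(notDot, true);
        notDot['.'] = false;

        for (int b : new int[] {146, 147, 148}) {
            char c = decode(b);
            System.out.printf("byte %d decodes to U+%04X%n", b, (int) c);
            try {
                inClass(notDot, c);
            } catch (ArrayIndexOutOfBoundsException e) {
                System.out.println("  outside the 256-entry table");
            }
        }
    }
}
```

Read as ISO Latin-1, the same bytes would stay below 256, which would be
consistent with the problem appearing only with CP1252 input.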

I posted a patch for this problem.
You can use it as a temporary workaround.
The patch may consume a fair amount of memory (about 8 KB).
See the attached file.
Takashi Okamoto
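
The attached patch is not reproduced here. As one plausible reading of the
"about 8 KB" figure, a membership bitmap with one bit for each of the 65536
possible char values takes exactly 8192 bytes. The sketch below shows that
idea only; the class CharBitmap and its methods are hypothetical and are
not taken from the patch or from ORO.

```java
import java.util.Arrays;

class CharBitmap {
    // One bit per 16-bit char value: 65536 bits = 8192 bytes.
    private final byte[] bits = new byte[65536 / 8];

    // Mark a character as a member of the class.
    public void add(char c) {
        bits[c >>> 3] |= (byte) (1 << (c & 7));
    }

    // Test membership; any char value is a valid index, so no range check is needed.
    public boolean contains(char c) {
        return (bits[c >>> 3] & (1 << (c & 7))) != 0;
    }

    public int sizeInBytes() {
        return bits.length;
    }

    // Build a negated single-character class such as [^.].
    public static CharBitmap allExcept(char excluded) {
        CharBitmap m = new CharBitmap();
        Arrays.fill(m.bits, (byte) 0xFF);
        m.bits[excluded >>> 3] &= (byte) ~(1 << (excluded & 7));
        return m;
    }

    public static void main(String[] args) {
        CharBitmap notDot = CharBitmap.allExcept('.');
        System.out.println(notDot.sizeInBytes());            // table size in bytes
        System.out.println(notDot.contains((char) 0x2019));  // CP1252 byte 146
        System.out.println(notDot.contains('.'));
    }
}
```

The trade-off is a fixed table per character class in exchange for
constant-time lookups over the full char range.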
