harmony-dev mailing list archives

From "Anton Ivanov" <ant...@gmail.com>
Subject Re: [classlib][regex|luni] build break
Date Thu, 12 Oct 2006 13:33:39 GMT
I documented the details on both JIRA issues:
http://issues.apache.org/jira/browse/HARMONY-688
http://issues.apache.org/jira/browse/HARMONY-933
So, please mark these issues as non-bug-differences if needed.

Thanks,
Anton

On 10/12/06, Paulex Yang <paulex.yang@gmail.com> wrote:
>
> Anton Ivanov wrote:
> > The problem is in the RI. These failures are RI bugs.
> >
> > The test failures on the RI you pointed out can be grouped into the two
> I guess you meant three ;-)
> > categories:
> Is category 2, the supplementary character issue, included in
> HARMONY-933? How about documenting the details below on that JIRA
> and marking it as a non-bug difference?
> >
> > 1. Canonical equivalence related.
> >
> > java.util.regex.PatternSyntaxException: Unclosed group near index 59
> > (?:ǠI|ǠI|ǠI|ȦĪ|ȦĪ|ȦĪ|ǠI|ǠI|Aİ̄(?:Ìc|Ìc|Ic̀)db(ac)
> > ^
> > at java.util.regex.Pattern.error(Pattern.java:1650)
> > at java.util.regex.Pattern.accept(Pattern.java:1508)
> > at java.util.regex.Pattern.group0(Pattern.java:2460)
> > at java.util.regex.Pattern.sequence(Pattern.java:1715)
> > at java.util.regex.Pattern.expr(Pattern.java:1687)
> > at java.util.regex.Pattern.compile(Pattern.java:1397)
> > at java.util.regex.Pattern.<init>(Pattern.java:1124)
> > at java.util.regex.Pattern.compile(Pattern.java:840)
> > at org.apache.harmony.tests.java.util.regex.PatternTest.testCanonEqFlag(PatternTest.java:1060)
> >
> > The RI fails to compile the following pattern when the CANON_EQ flag
> > is specified:
> >       "\u01E0\u00CCcdb(ac)"
> > This happens because the RI tries to build alternations to take
> > canonical equivalence into account.
> > It does so correctly in simple cases, but if the pattern is a little
> > more complex, the RI builds these alternations incorrectly and fails
> > to compile the pattern.
> > You can see the following bug:
> > http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4867170
> >
> > I wrote about these test failures on the RI here:
> > http://issues.apache.org/jira/browse/HARMONY-933
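
For readers who want to try category 1 themselves, here is a minimal sketch. The class and method names (CanonEqSketch, matchesCanonEq, compileProblem) are made up for illustration. The simple case is the example given in the Pattern Javadoc; the complex pattern is the one from this thread, and whether it compiles depends on the runtime, so the sketch only reports the outcome.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CanonEqSketch {
    // True if the regex, compiled with CANON_EQ, matches the whole input.
    static boolean matchesCanonEq(String regex, String input) {
        return Pattern.compile(regex, Pattern.CANON_EQ).matcher(input).matches();
    }

    // Returns null if the pattern compiles with CANON_EQ,
    // otherwise the description carried by the PatternSyntaxException.
    static String compileProblem(String regex) {
        try {
            Pattern.compile(regex, Pattern.CANON_EQ);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // Simple case: "a" followed by COMBINING RING ABOVE is canonically
        // equivalent to U+00E5; the Pattern Javadoc documents this as a match.
        System.out.println(matchesCanonEq("a\u030A", "\u00E5"));

        // The pattern from the thread. The result is runtime dependent,
        // so it is only printed, not assumed.
        String problem = compileProblem("\u01E0\u00CCcdb(ac)");
        System.out.println(problem == null ? "compiled" : "failed: " + problem);
    }
}
```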
> >
> > 2. Supplementary Unicode codepoints related.
> >
> > For example, let's look at:
> >
> > testPredefinedClassesWithSurrogatesSupplementary
> > junit.framework.AssertionFailedError: null
> > at junit.framework.Assert.fail(Assert.java:47)
> > at junit.framework.Assert.assertTrue(Assert.java:20)
> > at junit.framework.Assert.assertFalse(Assert.java:34)
> > at junit.framework.Assert.assertFalse(Assert.java:41)
> > at org.apache.harmony.tests.java.util.regex.PatternTest.testPredefinedClassesWithSurrogatesSupplementary(PatternTest.java:1477)
> >
> > Here we try to find a surrogate character in the code point \uD916\uDE27.
> > The technical report says:
> > http://www.unicode.org/reports/tr18/#Supplementary_Characters
> >
> > "Surrogate pairs (or their equivalents in other encoding forms) are to be
> > handled internally as single code point values"
> >
> > So we have to treat text as code points, not code units.
> > Here \uD916\uDE27 is a single code point consisting of
> > two code units (two surrogate characters), so we find nothing.
> > (I added a comment with this explanation to
> > testPredefinedClassesWithSurrogatesSupplementary().)
> > But the RI doesn't treat this code point as a single unit. This is an RI
> > bug: the behavior is wrong according to the technical report.
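
The code unit versus code point distinction in category 2 can be shown without the regex engine at all. This is only a sketch using standard java.lang methods; the class and method names are invented for illustration. Scanning by code unit sees two surrogates in \uD916\uDE27, while scanning by code point sees one supplementary character and no lone surrogate.

```java
class SurrogateScanSketch {
    // Counts chars (UTF-16 code units) that are surrogates.
    static int surrogateCodeUnits(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c) || Character.isLowSurrogate(c)) {
                count++;
            }
        }
        return count;
    }

    // Counts code points that are surrogates, i.e. unpaired surrogates only.
    // A well-formed pair is read as one supplementary code point and skipped.
    static int loneSurrogateCodePoints(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (cp >= 0xD800 && cp <= 0xDFFF) {
                count++;
            }
            i += Character.charCount(cp);
        }
        return count;
    }

    public static void main(String[] args) {
        String pair = "\uD916\uDE27";
        System.out.println(pair.length());                          // code units
        System.out.println(pair.codePointCount(0, pair.length()));  // code points
        System.out.println(Integer.toHexString(pair.codePointAt(0)));
        System.out.println(surrogateCodeUnits(pair));
        System.out.println(loneSurrogateCodePoints(pair));
    }
}
```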
> >
> > 3. Error messages
> > java.util.regex.PatternSyntaxException: unmatched ) near index: 1
> > b)a
> > ^
> > java.util.regex.PatternSyntaxException: unmatched ) near index: 4
> > bcde)a
> > ^
> > java.util.regex.PatternSyntaxException: unmatched ) near index: 5
> > bbg())a
> > ^
> > java.util.regex.PatternSyntaxException: unmatched ) near index: 7
> > cdb(?i))a
> > ^
> > are printed by testCompileStringint().
> > This test is needed to verify that the appropriate exceptions are thrown
> > when we compile a malformed regular expression.
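
As a sketch of how this check could be expressed with a boolean result instead of console output, assuming only the standard Pattern API: the helper below (a made-up name, not code from PatternTest) reports whether a regex is rejected with a PatternSyntaxException. The four inputs are the ones shown in the messages above.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class UnmatchedParenSketch {
    // True if compiling the regex throws PatternSyntaxException.
    static boolean failsToCompile(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] malformed = { "b)a", "bcde)a", "bbg())a", "cdb(?i))a" };
        for (String regex : malformed) {
            // Each pattern has a closing parenthesis with no matching opener.
            System.out.println(regex + " rejected: " + failsToCompile(regex));
        }
    }
}
```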
> >
> > Thanks,
> > Anton
> >
> > On 10/12/06, Spark Shen <smallsmallorgan@gmail.com> wrote:
> >>
> >> Anton Ivanov wrote:
> >> > On 10/10/06, Anton Ivanov <antiva@gmail.com> wrote:
> >> >>
> >> >>
> >> >>
> >> >> On 10/10/06, Tim Ellison <t.p.ellison@gmail.com> wrote:
> >> >> >
> >> >> > So I checked in a patch for HARMONY-688's regex fix, and it passed the
> >> >> > regex unit tests, but causes the existing luni tests to fail in
> >> >> > java.util.Scanner. I've not figured out the base cause of the failure
> >> >> > so I've backed out the changes.
> >> >> >
> >> >> > Regards,
> >> >> > Tim
> >> >> >
> >> >> > --
> >> >> >
> >> >> > Tim Ellison (t.p.ellison@gmail.com )
> >> >> > IBM Java technology centre, UK.
> >> >> >
> >> >> >
> >> >> > ---------------------------------------------------------------------
> >> >> > Terms of use : http://incubator.apache.org/harmony/mailing.html
> >> >> > To unsubscribe, e-mail: harmony-dev-unsubscribe@incubator.apache.org
> >> >> > For additional commands, e-mail: harmony-dev-help@incubator.apache.org
> >> >>
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> This is my patch.
> >> >> I'll look into this problem and try to correct the patch.
> >> >>
> >> >> Thanks,
> >> >> Anton
> >> >>
> >> > There was a bug in the newly created class SupplRangeSet.java.
> >> > There was the following code in the method matches() of
> >> > SupplRangeSet.java:
> >> > ...
> >> > if (stringIndex < strLength) {
> >> > char high = testString.charAt(stringIndex++);
> >> >
> >> > if (contains(high) &&
> >> > next.matches(stringIndex, testString, matchResult) > 0)
> >> > {
> >> > return 1;
> >> > }
> >> > ...
> >> > But simply returning 1 is wrong, even though the comment on method
> >> > matches() in AbstractSet.java reads:
> >> >
> >> > "Checks if this node matches in given position and recursively call
> >> > next node matches on positive self match. Returns positive integer if
> >> > entire match succeed, negative otherwise
> >> > return -1 if match fails or n > 0;"
> >> > In fact, matches() returns more than just some positive n > 0: on a
> >> > successful match attempt, n is an offset. All the older classes in
> >> > java.util.regex take this into account, but I forgot it in
> >> > SupplRangeSet.java.
> >> > So I corrected the matches() method of the SupplRangeSet class as
> >> > follows:
> >> > ...
> >> > int offset = -1;
> >> > if (stringIndex < strLength) {
> >> > char high = testString.charAt(stringIndex++);
> >> >
> >> > if (contains(high) &&
> >> > (offset = next.matches(stringIndex, testString,
> >> > matchResult)) > 0) {
> >> > return offset;
> >> > }
> >> > ...
> >> > I corrected the patch and attached it to the issue.
> >> > I verified that regex and luni tests pass normally with the patch
> >> > applied.
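
To illustrate why the returned value matters, here is a toy sketch. It is not Harmony's AbstractSet or SupplRangeSet: the Node interface, the END node and the chain() helper are hypothetical, and the sketch assumes the offset is the index at which the overall match ended. A node that returns 1 on success discards that index; a node that returns whatever the next node reported passes it back to the caller.

```java
class OffsetChainSketch {
    // Hypothetical, simplified stand-in for a regex node.
    interface Node {
        // Returns -1 on failure, otherwise a positive value from the rest of the chain.
        int matches(int index, CharSequence text);
    }

    // Terminal node: always succeeds and reports where the match ended.
    static final Node END = (index, text) -> index;

    // Matches one literal char, then delegates to the next node.
    // propagate == false reproduces the "return 1" mistake described above.
    static Node literal(char c, Node next, boolean propagate) {
        return (index, text) -> {
            if (index < text.length() && text.charAt(index) == c) {
                int offset = next.matches(index + 1, text);
                if (offset > 0) {
                    return propagate ? offset : 1;
                }
            }
            return -1;
        };
    }

    // Builds a chain of literal nodes for every char of s, ending in END.
    static Node chain(String s, boolean propagate) {
        Node node = END;
        for (int i = s.length() - 1; i >= 0; i--) {
            node = literal(s.charAt(i), node, propagate);
        }
        return node;
    }

    public static void main(String[] args) {
        System.out.println(chain("abc", true).matches(0, "abcd"));   // end offset survives
        System.out.println(chain("abc", false).matches(0, "abcd"));  // end offset is lost
    }
}
```

A caller that needs to know where the match ended, as a scanner-like consumer would, gets a usable answer only from the propagating variant.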
> >> >
> >> > Thanks,
> >> > Anton
> >> >
> >> Hi Anton:
> >> It must be very exciting to handle such a complex problem. :-)
> >>
> >> But after applying the new patch (with the test patch applied as well),
> >> I still have problems:
> >> In test class org.apache.harmony.tests.java.util.regex.PatternTest, 4
> >> test methods fail on the RI:
> >> testCanonEqFlag:
> >> java.util.regex.PatternSyntaxException: Unclosed group near index 59
> >> (?:ǠI|ǠI|ǠI|ȦĪ|ȦĪ|ȦĪ|ǠI|ǠI|Aİ̄(?:Ìc|Ìc|Ic̀)db(ac)
> >> ^
> >> at java.util.regex.Pattern.error(Pattern.java:1650)
> >> at java.util.regex.Pattern.accept(Pattern.java:1508)
> >> at java.util.regex.Pattern.group0(Pattern.java:2460)
> >> at java.util.regex.Pattern.sequence(Pattern.java:1715)
> >> at java.util.regex.Pattern.expr(Pattern.java:1687)
> >> at java.util.regex.Pattern.compile(Pattern.java:1397)
> >> at java.util.regex.Pattern.<init>(Pattern.java:1124)
> >> at java.util.regex.Pattern.compile(Pattern.java:840)
> >> at org.apache.harmony.tests.java.util.regex.PatternTest.testCanonEqFlag(PatternTest.java:1060)
> >>
> >> testIndexesCanonicalEq:
> >> junit.framework.AssertionFailedError: null
> >> at junit.framework.Assert.fail(Assert.java:47)
> >> at junit.framework.Assert.assertTrue(Assert.java:20)
> >> at junit.framework.Assert.assertTrue(Assert.java:27)
> >> at org.apache.harmony.tests.java.util.regex.PatternTest.testIndexesCanonicalEq(PatternTest.java:1247)
> >>
> >> testCanonEqFlagWithSupplementaryCharacters:
> >> junit.framework.AssertionFailedError: null
> >> at junit.framework.Assert.fail(Assert.java:47)
> >> at junit.framework.Assert.assertTrue(Assert.java:20)
> >> at junit.framework.Assert.assertTrue(Assert.java:27)
> >> at org.apache.harmony.tests.java.util.regex.PatternTest.testCanonEqFlagWithSupplementaryCharacters(PatternTest.java:1275)
> >>
> >> testPredefinedClassesWithSurrogatesSupplementary
> >> junit.framework.AssertionFailedError: null
> >> at junit.framework.Assert.fail(Assert.java:47)
> >> at junit.framework.Assert.assertTrue(Assert.java:20)
> >> at junit.framework.Assert.assertFalse(Assert.java:34)
> >> at junit.framework.Assert.assertFalse(Assert.java:41)
> >> at org.apache.harmony.tests.java.util.regex.PatternTest.testPredefinedClassesWithSurrogatesSupplementary(PatternTest.java:1477)
> >> If these are RI bugs, shall we add comments about them in the test
> >> case?
> >>
> >> Also, error messages are printed to the console on Harmony. Since some
> >> test cases use System.out instead of asserts, I could not locate where
> >> these error messages come from:
> >> java.util.regex.PatternSyntaxException: unmatched ) near index: 1
> >> b)a
> >> ^
> >> java.util.regex.PatternSyntaxException: unmatched ) near index: 4
> >> bcde)a
> >> ^
> >> java.util.regex.PatternSyntaxException: unmatched ) near index: 5
> >> bbg())a
> >> ^
> >> java.util.regex.PatternSyntaxException: unmatched ) near index: 7
> >> cdb(?i))a
> >> ^
> >> And last, the good news is luni tests do pass. :-)
> >>
> >> Best regards
> >>
> >> --
> >> Spark Shen
> >> China Software Development Lab, IBM
> >>
> >>
> >>
> >>
>
>
> --
> Paulex Yang
> China Software Development Lab
> IBM
>
>
>
>
>