harmony-dev mailing list archives

From Jim Yu <junjie0...@gmail.com>
Subject [classlib][util]Scanner behaves differently with RI while parsing specific pattern
Date Tue, 10 Feb 2009 11:43:35 GMT
Hi all,
I found a behavior difference in Scanner between Harmony and the RI. Here
is a simple testcase[1].

The RI returns a successful match, "   *", while Harmony fails to find a
match and returns null. I looked into the code, and the root cause is that
Harmony's Scanner ignores the next line terminator completely while it
tries to find a match.

According to the spec for the findInLine(Pattern) method, this method
"Attempts to find the next occurrence of the specified pattern ignoring
delimiters." So our behavior of ignoring the delimiter appears to comply
with the spec. However, the pattern in this testcase contains the special
construct '(?=X)', a zero-width positive lookahead, and for such a pattern
the RI's behavior indicates that it does not ignore the delimiter
completely. Judging from the testcase result, the RI takes the delimiter
into consideration while it searches for a match, but excludes it from the
match result.

So the spec seems ambiguous about what "ignore" means: ignore the delimiter
entirely, even while scanning, as Harmony does, or only leave it out of the
match result? The RI's behavior indicates the latter. Do we need to follow
the RI's behavior?
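
To illustrate the two readings of "ignore", here is a minimal sketch that
uses plain java.util.regex rather than Scanner. It is only an analogy and
not a claim about how either implementation works internally: the class
name LookaheadBoundsDemo and the helper findInRegion are made up for this
example. The search region is limited to the text before the line
terminator. With opaque bounds the lookahead cannot see the terminator
(roughly the "ignore it completely" reading); with transparent bounds the
lookahead can see it, but the terminator is still not part of the match
(roughly the "exclude it from the result" reading).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaheadBoundsDemo {

    // Same pattern as in the testcase.
    static final Pattern PATTERN = Pattern.compile("^\\s*(?:\\*(?=[^/]))");

    // Hypothetical helper: searches only input[0, end). The 'transparent'
    // flag controls whether lookahead may inspect characters past 'end'.
    static String findInRegion(String input, int end, boolean transparent) {
        Matcher m = PATTERN.matcher(input);
        m.region(0, end);
        m.useTransparentBounds(transparent);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = " *\n";
        int lineEnd = input.indexOf('\n'); // index of the line terminator

        // Opaque bounds: the lookahead finds no character after '*'.
        System.out.println(findInRegion(input, lineEnd, false)); // null

        // Transparent bounds: the lookahead sees '\n', which is not '/',
        // so the match succeeds and covers only the text before it.
        System.out.println(findInRegion(input, lineEnd, true));  // " *"
    }
}
```

The second call returns only the text before the terminator, which is the
shape of result the RI produces for the testcase.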

I've raised a JIRA for this issue at
https://issues.apache.org/jira/browse/HARMONY-6087
I've also attached a patch there that follows the RI's behavior.



[1]
import java.util.Scanner;
import java.util.regex.Pattern;

public class SpecialPattern {

    private static final Pattern pattern =
            Pattern.compile("^\\s*(?:\\*(?=[^/]))");

    public static void main(String[] args) {
        Scanner scn = new Scanner(" *\n");
        String found = scn.findInLine(pattern);
        System.out.print(found);
    }

}

Result of RI:
   *

Result of Harmony:
null

-- 
Best Regards,
Jim, Jun Jie Yu

China Software Development Lab, IBM
