jakarta-regexp-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 3773] - Problem with parsing greedy match modifiers
Date Mon, 06 Oct 2003 05:38:36 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3773>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3773

Problem with parsing greedy match modifiers

------- Additional Comments From son@sparc.spb.su  2003-10-06 05:38 -------
The minimal regexp that reproduces the problem is (a{2}b){2}.
Here is the RETest output for this regexp:
Z:\src\regexp\jakarta-regexp\build>v.jar org.apache.regexp.RETest -i (a{2}b){2}

(a{2}b){2}

0. OP_BRANCH, opdata = 0, next = 40
3. OP_OPEN, opdata = 1, next = 6
6. OP_BRANCH, opdata = 0, next = 21
9. OP_ATOM, opdata = 1, next = 13, "a"
13. OP_ATOM, opdata = 1, next = 17, "a"
17. OP_ATOM, opdata = 1, next = 21, "b"
21. OP_CLOSE, opdata = 1, next = 24
24. OP_OPEN, opdata = 2, next = 27
27. OP_BRANCH, opdata = 0, next = 37
30. OP_NOTHING, opdata = 0, next = 33
33. OP_ATOM, opdata = 1, next = 37, "b"
37. OP_CLOSE, opdata = 2, next = 40
40. OP_END, opdata = 0, next = none
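
Read by hand, the dumped program is group 1 = "aab" followed by group 2 = OP_NOTHING plus "b", which behaves like (aab)(b) rather than (a{2}b){2}. The sketch below checks that reading, using java.util.regex only as a stand-in reference engine (it is not Jakarta Regexp). The class name DumpCheck and the pattern (aab)(b) are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class DumpCheck {
    // The pattern from the report, as written.
    static final Pattern INTENDED = Pattern.compile("(a{2}b){2}");

    // Hand transcription of the dumped program (an assumption, not tool output):
    // group 1 = "a" "a" "b", group 2 = OP_NOTHING followed by "b".
    static final Pattern AS_DUMPED = Pattern.compile("(aab)(b)");

    static boolean intended(String s) {
        return INTENDED.matcher(s).matches();
    }

    static boolean asDumped(String s) {
        return AS_DUMPED.matcher(s).matches();
    }

    public static void main(String[] args) {
        // prints:
        // aabaab: intended=true, asDumped=false
        // aabb: intended=false, asDumped=true
        for (String s : new String[] {"aabaab", "aabb"}) {
            System.out.println(s + ": intended=" + intended(s)
                    + ", asDumped=" + asDumped(s));
        }
    }
}
```

If that reading of the dump is right, the miscompiled program rejects "aabaab" and accepts "aabb", the reverse of what the pattern asks for.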

The cause of the problem is the algorithm that RECompiler uses to handle the
<regexp>{n,m} construct.
It decrements the n stored in the bracketsMin array and restarts parsing from the
beginning of the regexp, but it does not clear the bracketsXXX entries for nested
constructs. Thus, the next time it encounters one of the nested brackets, it treats
it as {0,m} and replaces the corresponding atom with OP_NOTHING.
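
The mechanism described above can be mimicked with a toy model. This is not RECompiler's code: ToyExpander, its remaining map (standing in for the per-bracket minimum counts) and the resetNested flag are all hypothetical, and the model only handles literal characters, groups and exact counts {n}. Instead of opcodes it emits the literal text the compiled program would match.

```java
import java.util.HashMap;
import java.util.Map;

class ToyExpander {
    private final String p;
    private final boolean resetNested;
    // Remaining repeat count per '{' position (hypothetical stand-in for the
    // per-bracket minimum counts mentioned in the report).
    private final Map<Integer, Integer> remaining = new HashMap<>();
    private int idx = 0;

    private ToyExpander(String pattern, boolean resetNested) {
        this.p = pattern;
        this.resetNested = resetNested;
    }

    static String expand(String pattern, boolean resetNested) {
        ToyExpander e = new ToyExpander(pattern, resetNested);
        StringBuilder out = new StringBuilder();
        while (e.idx < pattern.length()) out.append(e.closure());
        return out.toString();
    }

    // Parses one terminal plus an optional {n}, re-parsing the terminal
    // once per repetition, as the report describes.
    private String closure() {
        StringBuilder out = new StringBuilder();
        int start = idx;
        while (true) {
            idx = start;
            String term = terminal();
            if (idx >= p.length() || p.charAt(idx) != '{') return term;
            int brace = idx;
            int close = p.indexOf('}', brace);
            int n = Integer.parseInt(p.substring(brace + 1, close));
            int left = remaining.containsKey(brace) ? remaining.get(brace) : n;
            if (left == 0) {            // looks like {0}: the terminal is dropped
                idx = close + 1;
                return out.toString();
            }
            out.append(term);
            left--;
            remaining.put(brace, left);
            if (left == 0) {            // count exhausted: continue after '}'
                idx = close + 1;
                return out.toString();
            }
            // Candidate fix: forget the state of brackets nested in the
            // terminal before it is parsed again.
            if (resetNested) remaining.keySet().removeIf(k -> k >= start && k < brace);
        }
    }

    private String terminal() {
        char c = p.charAt(idx++);
        if (c != '(') return String.valueOf(c);
        StringBuilder group = new StringBuilder();
        while (p.charAt(idx) != ')') group.append(closure());
        idx++; // skip ')'
        return group.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("(a{2}b){2}", false)); // prints "aabb"
        System.out.println(expand("(a{2}b){2}", true));  // prints "aabaab"
    }
}
```

With the stale inner count left in place, the second pass over the group drops the "a" and yields "aab" + "b", the same shape as the dump. Clearing the nested entries before re-parsing restores "aabaab" in this model; whether the same change is the right fix for RECompiler is not something the toy can establish.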

---------------------------------------------------------------------
To unsubscribe, e-mail: regexp-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: regexp-dev-help@jakarta.apache.org

