jakarta-regexp-dev mailing list archives

From Ian Swett <isw...@ispheres.com>
Subject Bugs I've found
Date Fri, 23 Feb 2001 23:16:55 GMT
I've found two bugs recently in regexp.  I'm new to the list, so I
apologize if these are known issues.

I wanted to notify the list of the problems I found, ensure they're
actually problems, and make sure I'm going about solving them in the
correct manner.

1) RECompiler dies when compiling a regular expression that contains the
character sequence '*?('.  Sometimes the next offset of a node has not
been set to zero, so when next = node + instruction[node + offsetNext]
is computed, next is very large and you get an
ArrayIndexOutOfBoundsException.  I added a check for the out-of-bounds
case and return -1 when it occurs.  It appears to work, but there may be
a more correct way to fix this bug.
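
To illustrate the kind of guard described in 1), here is a minimal,
self-contained sketch.  It is not the RECompiler source: the class name
NodeChain, the method nextNode, the OFFSET_NEXT slot position, and the
char[] layout are all assumptions made for the example.

```java
// Hypothetical sketch; names and layout are illustrative, not RECompiler's.
final class NodeChain {
    // Assumption: each node keeps its relative "next" offset in this slot.
    static final int OFFSET_NEXT = 2;

    /**
     * Returns the index of the node that follows `node`, or -1 when the
     * computed index would fall outside the used part of the array.
     * An offset of 0 leaves the index unchanged (end of chain here).
     */
    static int nextNode(char[] instruction, int length, int node) {
        int slot = node + OFFSET_NEXT;
        if (node < 0 || slot >= length) {
            return -1; // the offset slot itself is out of range
        }
        // char is unsigned, so a stale slot such as 0xFFFF gives a huge index.
        int next = node + instruction[slot];
        if (next >= length) {
            return -1; // stale or garbage offset: refuse to follow it
        }
        return next;
    }

    public static void main(String[] args) {
        char[] program = new char[6];
        program[2] = 3;          // node 0 points 3 slots ahead, to node 3
        program[5] = '\uFFFF';   // node 3 has an offset that was never reset
        System.out.println(nextNode(program, program.length, 0)); // prints 3
        System.out.println(nextNode(program, program.length, 3)); // prints -1
    }
}
```

A guard like this only hides the symptom; if the offset really should
have been zeroed when the node was created, fixing it there may be the
more correct change.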

2) The other problem is with reluctant closures.  Because reluctant
closures are not recursive, cases like the following fail:
b(aaa|aaaaa)*?b does not accept baaaaaaaaaab (10 a's), when it should,
since 10 a's can be read as aaaaa twice.  I have tried to rework
reluctant closures so they are implemented more like greedy ones (with
recursive or's), but I don't have it working yet.
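
As a cross-check of the expected result in 2), the same pattern can be
run through a backtracking engine.  The sketch below uses
java.util.regex as a stand-in reference, not regexp itself; the class
and method names are made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical reference check; uses java.util.regex, not Jakarta Regexp.
class ReluctantClosureCheck {
    static final Pattern PATTERN = Pattern.compile("b(aaa|aaaaa)*?b");

    // Builds "b", then `count` copies of 'a', then "b".
    static String subject(int count) {
        StringBuilder sb = new StringBuilder("b");
        for (int i = 0; i < count; i++) {
            sb.append('a');
        }
        return sb.append('b').toString();
    }

    // True when the whole subject is accepted by the pattern.
    static boolean accepts(int count) {
        return PATTERN.matcher(subject(count)).matches();
    }

    public static void main(String[] args) {
        System.out.println(accepts(10)); // prints true  (aaaaa + aaaaa)
        System.out.println(accepts(7));  // prints false (7 is not a sum of 3s and 5s)
    }
}
```

The 10 a's case only succeeds if the engine can go back and pick aaaaa
after first trying aaaa, which is the backtracking a non-recursive
reluctant closure would be missing.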

Ian Swett
