jakarta-regexp-dev mailing list archives

From Holger Stratmann <Hol...@cheerful.com>
Subject Re: - IndexOutOfBoundsException: clarification
Date Mon, 13 May 2002 15:24:34 GMT
Actually, you can write much simpler REs to reproduce this :-))

I had wanted to file a bug report (along with a few others):

RegExp does not "support" more than 16 parenthesized sub-expressions.
As soon as you have more than 16 '(...)', you get ArrayIndexOOBExceptions :-(
(Actually, I had seen that while taking a look at the sources and then confirmed the problem
by trying it ;)

That's why your two expressions work separately, but not combined.
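
To make the counts concrete, here is a minimal sketch that counts the
capturing groups in each expression. It assumes java.util.regex (JDK 1.4+)
is available and uses it purely as a counting tool; it is not the Jakarta
Regexp parser, and the class and method names are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jakarta Regexp: java.util.regex is used
// here only as a convenient way to count capturing groups.
class GroupCount {
    // The two expressions from the bug report, copied unchanged.
    static final String LAT =
        "-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]";
    static final String LON =
        "-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]";

    // Number of '(...)' capturing groups in the given expression.
    static int groups(String re) {
        return Pattern.compile(re).matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println("lat:      " + groups(LAT));
        System.out.println("lon:      " + groups(LON));
        System.out.println("combined: " + groups("^" + LAT + LON + "$"));
    }
}
```

Each expression on its own stays below 16 groups, while the combined one
goes past it, which fits the behaviour in the report.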

I guess I'll write a fix for that, but considering I didn't have time to file a bug report...

A "workaround" in this case (just as a temporary help for Michael):
Your RE has two clearly defined parts... You can probably use one more general expression
to find potential matches and then check two parts separately. Not nice...
Fixing the problem may actually be faster :-))
I had an estimate of 1-3 hours for fixing the code, but I'd need to find out something about
the process [of submitting code] first and that would probably take longer...
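
The workaround mentioned above could look roughly like the sketch below:
one loose expression with only two groups splits the input, and each half
is then checked with its own expression. To keep it runnable without the
Jakarta jar, java.util.regex stands in for org.apache.regexp.RE; the class
name, method name and the loose GENERAL expression are assumptions made up
for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of the two-step workaround. java.util.regex is a stand-in for
// org.apache.regexp.RE; all names here are hypothetical.
class LatLonSplit {
    // Strict expressions from the bug report, each well under 16 groups.
    static final Pattern LAT = Pattern.compile(
        "-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]");
    static final Pattern LON = Pattern.compile(
        "-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]");

    // Loose expression with only two groups: it finds potential matches
    // and splits them, but does not validate the numeric ranges.
    static final Pattern GENERAL =
        Pattern.compile("^(-?[0-9.]+[nNsS])(-?[0-9.]+[eEwW])$");

    static boolean isLatLon(String s) {
        Matcher m = GENERAL.matcher(s);
        if (!m.matches()) {
            return false; // not even roughly shaped like a lat-lon string
        }
        // Check the two parts separately with the strict expressions.
        return LAT.matcher(m.group(1)).matches()
            && LON.matcher(m.group(2)).matches();
    }

    public static void main(String[] args) {
        System.out.println(isLatLon("55N44.33E"));
        System.out.println(isLatLon("91N44E"));
    }
}
```

With Jakarta Regexp itself, the same idea would presumably use three RE
objects and read the two parts via getParen(1) and getParen(2).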

Cheers,
           Holger


bugzilla@apache.org wrote:

> DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
> RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
> <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9035>.
> ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
> INSERTED IN THE BUG DATABASE.
>
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=9035
>
> big Latitude Longitude RE causes IndexOutOfBoundsException
>
>            Summary: big Latitude Longitude RE causes
>                     IndexOutOfBoundsException
>            Product: Regexp
>            Version: unspecified
>           Platform: All
>         OS/Version: Linux
>             Status: NEW
>           Severity: Major
>           Priority: Other
>          Component: Other
>         AssignedTo: regexp-dev@jakarta.apache.org
>         ReportedBy: mnewcomb@tacintel.com
>
> I have two fairly big REs dealing with Latitude and Longitude.  When I use them
> separately, no problems.  However, when I combine the 2 REs, so I can pass one
> Latitude-Longitude string to it, it bombs out with an exception (detailed
> below).
>
> Here is the test program.  Refer to the example run for usage:
>
> import java.io.*;
> import java.util.*;
> import org.apache.regexp.*;
>
> public class LatLonREBug
> {
>   private static final String LATITUDE_RE_STRING =
>
> "-?(([0-8]?[0-9]((\\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]";
>   private static final String LONGITUDE_RE_STRING =
>
> "-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]";
>
>   public static final String LATITUDE_LONGITUDE_RE_STRING =
>     "^" + LATITUDE_RE_STRING + LONGITUDE_RE_STRING + "$";
>
>   public static void main(String[] args)
>     throws Throwable
>   {
>     RE latlonRE = new RE(LATITUDE_LONGITUDE_RE_STRING);
>     System.out.println("LATITUDE_LONGITUDE_RE_STRING: " +
>                        LATITUDE_LONGITUDE_RE_STRING);
>
>     RE latRE = new RE("^" + LATITUDE_RE_STRING + "$");
>     System.out.println("LATITUDE_RE_STRING: " + LATITUDE_RE_STRING);
>
>     RE lonRE = new RE("^" + LONGITUDE_RE_STRING + "$");
>     System.out.println("LONGITUDE_RE_STRING: " + LONGITUDE_RE_STRING);
>
>     BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
>     String line = br.readLine();
>     while (line != null && !line.equals("quit") && !line.equals("exit"))
>     {
>       StringTokenizer st = new StringTokenizer(line);
>       int tokens = st.countTokens();
>
>       if (tokens > 1)
>       {
>         String command = st.nextToken();
>
>         if (command.equalsIgnoreCase("lat"))
>         {
>           String lat = st.nextToken();
>           if (latRE.match(lat))  // report success only on an actual match
>             System.out.println(lat + " is a properly formatted latitude");
>         }
>         else if (command.equalsIgnoreCase("lon"))
>         {
>           String lon = st.nextToken();
>           if (lonRE.match(lon))  // report success only on an actual match
>             System.out.println(lon + " is a properly formatted longitude");
>         }
>         else if (command.equalsIgnoreCase("latlon"))
>         {
>           String latlon = st.nextToken();
>           if (latlonRE.match(latlon))  // report success only on an actual match
>             System.out.println(latlon + " is a properly formatted lat-lon");
>         }
>         else
>         {
>           System.out.println("unknown command: " + command);
>         }
>       }
>       else
>       {
>         System.out.println("invalid line: " + line);
>       }
>
>       line = br.readLine();
>     }
>   }
> }
>
> Here is an example run of the test-case.  As you will see, when just doing
> latitude or longitude, the REs match as expected.  But, when I do a 'latlon'
> string, it pukes...
>
> [mnewcomb@localhost sandbox]$ java -classpath
> /usr/local/regexp/jakarta-regexp-1.2.jar:. LatLonREBug
> LATITUDE_LONGITUDE_RE_STRING:
> ^-?(([0-8]?[0-9]((\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]-?(((([0-9]?[0-9])|(1[0-7][0-9]))((\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]$
> LATITUDE_RE_STRING:
> -?(([0-8]?[0-9]((\.[0-9]+)|((([0-5][0-9])|60)((([0-5][0-9])|60))?))?)|90)[nNsS]
> LONGITUDE_RE_STRING:
> -?(((([0-9]?[0-9])|(1[0-7][0-9]))((\.[0-9]+)|((([0-5][0-9])|60)(([0-5][0-9])|60)?))?)|180)[eEwW]
> lat 55N
> 55N is a properly formatted latitude
> lat 55.454N
> 55.454N is a properly formatted latitude
> lat 5545N
> 5545N is a properly formatted latitude
> lon 123E
> 123E is a properly formatted longitude
> lon 5E
> 5E is a properly formatted longitude
> lon 123.444E
> 123.444E is a properly formatted longitude
> lon 1784532W
> 1784532W is a properly formatted longitude
> latlon 55N44E
> 55N44E is a properly formatted lat-lon
> latlon 55N44.33E
> Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException
>         at org.apache.regexp.RE.getParenEnd(RE.java:724)
>         at org.apache.regexp.RE.matchNodes(RE.java:942)
>         at org.apache.regexp.RE.matchNodes(RE.java:933)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:933)
>         at org.apache.regexp.RE.matchNodes(RE.java:933)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:933)
>         at org.apache.regexp.RE.matchNodes(RE.java:933)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchNodes(RE.java:910)
>         at org.apache.regexp.RE.matchNodes(RE.java:1376)
>         at org.apache.regexp.RE.matchAt(RE.java:1448)
>         at org.apache.regexp.RE.match(RE.java:1498)
>         at org.apache.regexp.RE.match(RE.java:1468)
>         at org.apache.regexp.RE.match(RE.java:1561)
>         at LatLonREBug.main(LatLonREBug.java:54)
> [mnewcomb@localhost sandbox]$
>
> Any help will be greatly appreciated.
>
> Thanks,
> Michael
>
> --
> To unsubscribe, e-mail:   <mailto:regexp-dev-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail: <mailto:regexp-dev-help@jakarta.apache.org>



