commons-dev mailing list archives

From "Daniel F. Savarese" <...@savarese.org>
Subject Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Date Mon, 21 Jun 2004 16:24:13 GMT

In message <40D3DBBC.4070008@apache.org>, Mario Ivankovits writes:
>It looks like Perl and Java are very (very) similar. So asking ORO to 

The Java regex syntax is almost a superset of Perl, which is why I don't
see the impact of using a Perl engine for JDK 1.3 and java.util.regex
for J2SE 1.4 as being major.  The expression Rami gave was straight
Perl 5.005.  jakarta-oro's Perl5Compiler/Perl5Matcher implements
zero-width look-ahead assertions from Perl 5.003 but does not implement
the zero-width look-behind assertions from 5.005 and future versions (if
you don't ask for it ...).  This can be added.  The other difference is
that in Perl \Q and \E are not part of the regex syntax.  They are part
of Perl string handling, so we didn't implement them in Perl5Compiler
(instead quotemeta() is provided), but support them in the Perl5Util
convenience class.  This can be moved into Perl5Compiler if desired.
These small things only happen when a user asks for them.
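
For anyone unsure which constructs are meant here, this is a minimal sketch of zero-width look-ahead and look-behind using java.util.regex (J2SE 1.4 and later), not jakarta-oro. The class name LookaroundDemo, the helper firstMatch, and the sample patterns are made up for illustration and do not come from the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; uses only java.util.regex, not jakarta-oro.
class LookaroundDemo {
    // Returns the text of the first match of regex in input, or null if none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Look-ahead: match "foo" only when "bar" follows; "bar" is not consumed.
        System.out.println(firstMatch("foo(?=bar)", "foobar"));
        // Look-behind: match "bar" only when "foo" precedes it.
        System.out.println(firstMatch("(?<=foo)bar", "foobar"));
        // Negative look-behind: match "bar" only when "foo" does not precede it.
        System.out.println(firstMatch("(?<!foo)bar", "bazbar"));
    }
}
```

Going by the description above, an expression limited to the first form should behave the same under both engines, while the look-behind forms are the ones Perl5Compiler does not implement at present.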

In general, most regular expressions you see in the wild can be
simplified and don't require unusual constructs.  For example, why
write "\\Q**\\E" when "\\*\\*" will do (you would usually want to use
\Q and \E for longer sequences or for dynamically generated strings you
want to escape; but quotemeta works equally well)?  Why use a negative
look-behind assertion in ((?<!^)|[^/]) when [^/] will suffice (the
negative look-behind assertion is redundant because if there's a character
present that's not a slash, then it's not the start of the input)?  Of
course, you can't always simplify your expressions and I think Rami's point
is that you shouldn't be bothered with the finer points and stuff should
just work.  I think the answer is that as long as you stick to Perl5 syntax
(which most people using java.util.regex are unknowingly doing), you'll
rarely run into differences; but that oro doesn't implement most of the
stuff added after Perl 5.003 for lack of demand (there's not that much stuff).
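
To make the \Q...\E point concrete, here is a small sketch with java.util.regex showing that "\\Q**\\E" and "\\*\\*" accept the same input. It also uses Pattern.quote (added to java.util.regex in Java 5, so not available on J2SE 1.4), which plays a role similar to quotemeta for dynamically generated strings. The class name QuoteDemo and the helper matchesAll are hypothetical, and the code does not touch jakarta-oro.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; uses only java.util.regex, not jakarta-oro.
class QuoteDemo {
    // True if the whole input matches the given regular expression.
    static boolean matchesAll(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Both expressions match exactly two literal asterisks.
        System.out.println(matchesAll("\\Q**\\E", "**"));
        System.out.println(matchesAll("\\*\\*", "**"));

        // For a string built at run time, escape it instead of hand-writing \Q...\E.
        String literal = "a.b*c";
        String quoted = Pattern.quote(literal);
        System.out.println(matchesAll(quoted, "a.b*c")); // literal text matches
        System.out.println(matchesAll(quoted, "aXbbc")); // metacharacters are inert
    }
}
```

For a short fixed sequence the backslash form is the more portable choice, since it does not depend on the engine understanding \Q and \E.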

daniel


