ant-dev mailing list archives

From Stephane Bailliez <sbaill...@imediation.com>
Subject RE: Ant Regexp wrappers [Re: multiline mode and platform issues]
Date Wed, 30 Jan 2002 16:55:50 GMT
> -----Original Message-----
> From: Stefan Bodewig [mailto:bodewig@apache.org]

> On Tue, 29 Jan 2002, Stephane Bailliez <sbailliez@apache.org> wrote:
> 
> > 1) Jakarta Oro works fine and is perfectly consistent.
> 
> Well, it works like Perl and works as advertised.  Whether this is
> "fine" or not, I don't want to judge.

At least the reason makes perfect sense...

> > 2) Jakarta RegExp uses a platform-dependent line separator and the
> > logic is not correct and will always return false on Windows when
> > the \n immediately ends the string.... bug. see code below:
> 
> Time for you to nominate yourself as a committer - there doesn't seem
> to be anybody else caring for it 8-(

I don't want to be a committer for this project. I perfectly understand that
Jon cannot be everywhere to fix everything in every project all the time, but
nothing has been done here for the last 10 months.

I have never used this package because it is too buggy, and only Oro works
fine with Unicode, which is more than necessary in these days of
internationalization. RegExp was probably helpful before Oro became a Jakarta
project, but not anymore. If someone wants a lightweight regexp library,
Xerces also has a regex API that is even more 'lightweight' than RegExp; it
was written by the author of Regex4j (IBM folk).

> > 2) JDK 1.4 does not care about the option UNIX_LINE for $,
> 
> Ouch.

I reported the bug to Sun. They forgot to code a 'UnixDollar' node and
forgot to use '\u0085' in the code. It was also the occasion to see that,
considering the code in there, I will never actually use this part of the
Java API.
I also put a note on the bug report about the coding style, which is anything
but compliant with Sun's own coding guidelines. The code is full of if
statements and loops with no braces around their bodies.
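
For reference, here is a minimal Java sketch of the `$` behaviour under
discussion. It assumes a current java.util.regex implementation, not the
JDK 1.4 build described above, so it shows what the documentation promises
rather than the bug itself. The class name DollarAnchorDemo and its helper
methods are made up for illustration. According to the Pattern documentation,
with UNIX_LINES set only '\n' counts as a line terminator for ., ^ and $, so
the CRLF and NEL inputs should match by default and stop matching once the
flag is on.

```java
import java.util.regex.Pattern;

public class DollarAnchorDemo {

    // True if "abc$" is found in the input when compiled with the given flags.
    public static boolean matchesAtEnd(String input, int flags) {
        return Pattern.compile("abc$", flags).matcher(input).find();
    }

    // Makes CR, LF and NEL visible when printing.
    public static String visible(String s) {
        return s.replace("\r", "\\r")
                .replace("\n", "\\n")
                .replace("\u0085", "\\u0085");
    }

    public static void main(String[] args) {
        // No terminator, Unix terminator, Windows terminator, NEL terminator.
        String[] inputs = { "abc", "abc\n", "abc\r\n", "abc\u0085" };
        for (String input : inputs) {
            System.out.println(visible(input)
                    + "  default=" + matchesAtEnd(input, 0)
                    + "  UNIX_LINES=" + matchesAtEnd(input, Pattern.UNIX_LINES));
        }
    }
}
```

Running the same four inputs through each regexp library Ant wraps is one
way to see where they disagree on a given platform.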

> So where are we here.  Put something like this into the documentation?

> If you use $ to anchor your regular expression on the end of a line
> and your line doesn't follow Unix conventions, you are on your own.
> Everything you do depends on the regular expression library you use,
> you better test your pattern on Unix and Windows platforms before you
> rely on it.  We apologize for the inconvenience.

I'm simply wondering whether we should drop support for RegExp and JDK 1.4:
1) Nobody supports Jakarta RegExp anymore.
2) The JDK 1.4 regexp API is immature.
3) With these APIs it is impossible to be consistent across
platforms.

Having dynamite like this in regexp replace, fileset, or whatever will make
bug reporting and maintenance terrible, and chances are that users will
simply say 'I use Ant and replace does not work.' Seriously, it's impossible
to make the test case we have pass under Windows.

I will see whether the Xerces regex API can work consistently the Perl way.
If it can, it could be a good replacement for RegExp and JDK 1.4.

Thoughts?

Stephane
