commons-dev mailing list archives

From Thomas Neidhart <thomas.neidh...@gmail.com>
Subject Re: Anyone interested in regular expressions, again?
Date Mon, 02 Feb 2015 07:43:33 GMT
On 02/02/2015 03:25 AM, sebb wrote:
> I would not wish to move away from Java RE *unless* the RE syntax was
> the same *and* the implementation was better performing *and* the
> existing code suffered from poor performance.
> 
> It might be OK if the alternate implementation was missing some
> esoteric features, but I would be very wary of using any features that
> were not in the Java implementation.
> 
> The likelihood is that the Java implementation will (eventually)
> become more performant, at which point it would be useful to be able
> to revert to the Java version.
> That requires a high degree of compatibility to reduce the work involved.
> 
> It might be more useful to produce a tool that detects inefficient RE
> usage and suggests improvements.

I only know re2 a little, but it is a trade-off:

 * linear-time evaluation vs. some features (e.g. backreferences)
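
To illustrate the kind of feature on the other side of that trade-off,
here is a minimal sketch of a backreference in java.util.regex. The class
and method names (BackreferenceDemo, firstDoubledWord) are made up for
this example; only Pattern and Matcher are real JDK classes. A
linear-time engine in the re2 style would typically reject a pattern
like this one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of any library discussed in this thread.
final class BackreferenceDemo {
    // \1 must match exactly the text that group 1 captured,
    // so this finds a word immediately followed by itself.
    private static final Pattern DOUBLED_WORD =
            Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    // Returns the first doubled word in the text, or null if there is none.
    static String firstDoubledWord(String text) {
        Matcher m = DOUBLED_WORD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDoubledWord("it was the the best of times"));
        System.out.println(firstDoubledWord("no repeats here"));
    }
}
```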

A comparison between different regular expression implementations can be
found here:

http://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines

I am pretty sure the regexp implementation in Java will not change,
simply for backwards-compatibility reasons, but such a library would be
useful: in many cases you do not need these additional features but do
want to ensure that your regular expression is evaluated in linear time.
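
One rough way to see how much work the Java engine does on a given
pattern is to count how many characters it reads from the input. The
sketch below wraps the input in a CharSequence that counts charAt calls.
BacktrackProbe, CountingSequence and readsFor are names invented for
this example, and the two patterns are only assumed test cases. The
numbers it prints depend on the JDK version, so treat them as a probe
rather than a benchmark.

```java
import java.util.regex.Pattern;

// Hypothetical probe class, written only to illustrate the idea.
final class BacktrackProbe {

    // CharSequence wrapper that counts every character the regex engine reads.
    static final class CountingSequence implements CharSequence {
        private final CharSequence text;
        long reads;

        CountingSequence(CharSequence text) {
            this.text = text;
        }

        @Override public int length() {
            return text.length();
        }

        @Override public char charAt(int index) {
            reads++;
            return text.charAt(index);
        }

        @Override public CharSequence subSequence(int start, int end) {
            return new CountingSequence(text.subSequence(start, end));
        }

        @Override public String toString() {
            return text.toString();
        }
    }

    // Matches the regex against n copies of 'a' followed by '!' (which can
    // never match a pattern ending in 'b') and returns the number of reads.
    static long readsFor(String regex, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        sb.append('!');
        CountingSequence input = new CountingSequence(sb);
        boolean matched = Pattern.compile(regex).matcher(input).matches();
        if (matched) {
            throw new IllegalStateException("input was built so that it cannot match");
        }
        return input.reads;
    }

    public static void main(String[] args) {
        // Input sizes are kept small so the run finishes quickly even if
        // the nested quantifier backtracks heavily.
        for (int n = 4; n <= 16; n += 4) {
            System.out.println("n=" + n
                    + " nested=" + readsFor("(a+)+b", n)
                    + " simple=" + readsFor("a+b", n));
        }
    }
}
```

Comparing how the two columns grow with n gives a quick impression of
whether a pattern is at risk of backtracking badly on non-matching input.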

Thomas

> 
> 
> On 1 February 2015 at 22:35, James Carman <james@carmanconsulting.com> wrote:
>> To be clear, I am not advocating this approach.  I was merely trying to
>> illustrate what a nightmare such an endeavor would be. :)
>>
>> On Sunday, February 1, 2015, James Carman <james@carmanconsulting.com>
>> wrote:
>>
>>> You would basically have to pick a canonical regex language if you want a
>>> facade and be able to swap the regex library out.  Most of them are very
>>> similar but they are not the same.
>>>
>>> On Sunday, February 1, 2015, Gary Gregory <garydgregory@gmail.com> wrote:
>>>
>>>> I think we'll need some clear performance advantages documented as well as
>>>> any compatibility issues.
>>>>
>>>> This begs for a facade API IMO. I would not want to recode my app just to
>>>> test one vs. the other, it should be pluggable.
>>>>
>>>> Gary
>>>>
>>>> On Sat, Jan 31, 2015 at 10:58 AM, Benson Margulies <bimargulies@gmail.com> wrote:
>>>>
>>>>> So, once upon a time, there was a regex library here. It was retired,
>>>>> presumably on the grounds that it was rendered obsolete by the JRE's
>>>>> native support.
>>>>>
>>>>> However, the JRE's regular expressions have a pretty severe problem:
>>>>> they have unbounded (or at least very, very bad) execution time for
>>>>> some combinations of data and regex.
>>>>>
>>>>> To cope with this, we ported the Henry Spencer regular expression
>>>>> library (as found in TCL) from C to Java.
>>>>>
>>>>> Thus: https://github.com/basis-technology-corp/tcl-regex-java
>>>>>
>>>>> Is anyone interested in this? Give or take the possible IP muddle of
>>>>> the original C Code, I could grant it easily.
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> E-Mail: garydgregory@gmail.com | ggregory@apache.org
>>>> Java Persistence with Hibernate, Second Edition
>>>> <http://www.manning.com/bauer3/>
>>>> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
>>>> Spring Batch in Action <http://www.manning.com/templier/>
>>>> Blog: http://garygregory.wordpress.com
>>>> Home: http://garygregory.com/
>>>> Tweet! http://twitter.com/GaryGregory
>>>>
>>>
> 
> 



