commons-dev mailing list archives

From Ben McCann <...@benmccann.com>
Subject Re: Anyone interested in regular expressions, again?
Date Mon, 02 Feb 2015 22:35:40 GMT
That'd be awesome, James. I'd love to see re2j out in the open.

On Mon, Feb 2, 2015 at 2:20 PM, James Ring <sjr@jdns.org> wrote:

> I spoke to one of the authors of re2j, a Google-internal port of the C++
> re2 library. The intention was to open source it but they just haven't got
> around to it.
>
> I may try and get Google to put re2j up on GitHub so you all can take a
> look. AFAIK it is heavily used in Google and it has an API that is largely
> compatible with java.util.regex. I know from personal experience that one
> can often benefit from re2j merely by replacing java.util.regex imports
> with the corresponding re2j imports.
>
> Regards,
> James
> On Feb 1, 2015 11:44 PM, "Thomas Neidhart" <thomas.neidhart@gmail.com>
> wrote:
>
> > On 02/02/2015 03:25 AM, sebb wrote:
> > > I would not wish to move away from Java RE *unless* the RE syntax was
> > > the same *and* the implementation was better performing *and* the
> > > existing code suffered from poor performance.
> > >
> > > It might be OK if the alternate implementation was missing some
> > > esoteric features, but I would be very wary of using any features that
> > > were not in the Java implementation.
> > >
> > > The likelihood is that the Java implementation will (eventually)
> > > become more performant, at which point it would be useful to be able
> > > to revert to the Java version.
> > > > That requires a high degree of compatibility to reduce the work
> > > > involved.
> > >
> > > It might be more useful to produce a tool that detects inefficient RE
> > > usage and suggests improvements.
> >
> > I just know re2 a bit, but it is a trade-off:
> >
> >  * linear-time evaluation vs. some features (e.g. backreferences)
> >
> > A comparison between different regular expression implementations can be
> > found here:
> >
> > http://en.wikipedia.org/wiki/Comparison_of_regular_expression_engines
> >
> > I am pretty sure the regexp implementation in java will not change,
> > simply because of backwards compatibility reasons, but such a library
> > would be useful as in many cases you do not need these additional
> > features but want to ensure that your regular expression will be
> > evaluated in linear time.
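To make the trade-off above concrete, here is a minimal sketch using only java.util.regex. It shows the kind of feature (a backreference) that a backtracking engine supports and that, per the comparison above, a linear-time engine such as re2 gives up. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the name is illustrative only.
class BackreferenceDemo {
    // \1 refers back to whatever group 1 captured, so this finds a doubled word.
    private static final Pattern DOUBLED_WORD =
            Pattern.compile("\\b(\\w+)\\s+\\1\\b");

    /** Returns the first doubled word in the text, or null if there is none. */
    static String firstDoubledWord(String text) {
        Matcher m = DOUBLED_WORD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDoubledWord("this is is a test"));
    }
}
```

Code that relies on a pattern like this could not be moved to an engine without backreferences by swapping imports alone; the check would have to be rewritten by hand.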
> >
> > Thomas
> >
> > >
> > >
> > > > On 1 February 2015 at 22:35, James Carman <james@carmanconsulting.com> wrote:
> > > >> To be clear, I am not advocating this approach.  I was merely trying to
> > > >> illustrate what a nightmare such an endeavor would be. :)
> > >>
> > > >> On Sunday, February 1, 2015, James Carman <james@carmanconsulting.com>
> > > >> wrote:
> > >>
> > > >>> You would basically have to pick a canonical regex language if you want a
> > > >>> facade and be able to swap the regex library out.  Most of them are very
> > > >>> similar but they are not the same.
> > >>>
> > > >>> On Sunday, February 1, 2015, Gary Gregory <garydgregory@gmail.com> wrote:
> > >>>
> > > >>>> I think we'll need some clear performance advantages documented as well as
> > > >>>> any compatibility issues.
> > >>>>
> > > >>>> This begs for a facade API IMO. I would not want to recode my app just to
> > > >>>> test one vs. the other, it should be pluggable.
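As a rough illustration of what such a pluggable facade might look like, here is a minimal sketch. Every name in it (`RegexEngine`, `CompiledRegex`, `JdkRegexEngine`, `FacadeDemo`) is hypothetical and not taken from any existing library; only the java.util.regex backend is shown.

```java
import java.util.regex.Pattern;

// Hypothetical facade: none of these names come from an existing library.
interface CompiledRegex {
    boolean matches(CharSequence input); // whole-input match
    boolean find(CharSequence input);    // match anywhere in the input
}

interface RegexEngine {
    CompiledRegex compile(String pattern);
}

// Backend that delegates to java.util.regex.
class JdkRegexEngine implements RegexEngine {
    @Override
    public CompiledRegex compile(String pattern) {
        final Pattern p = Pattern.compile(pattern);
        return new CompiledRegex() {
            @Override
            public boolean matches(CharSequence input) {
                return p.matcher(input).matches();
            }

            @Override
            public boolean find(CharSequence input) {
                return p.matcher(input).find();
            }
        };
    }
}

class FacadeDemo {
    // Application code depends only on RegexEngine, so the backend can be swapped.
    static boolean looksLikeIssueKey(RegexEngine engine, String s) {
        return engine.compile("[A-Z]+-\\d+").matches(s);
    }

    public static void main(String[] args) {
        System.out.println(looksLikeIssueKey(new JdkRegexEngine(), "LANG-1234"));
    }
}
```

A second backend would implement `RegexEngine` the same way. The facade only hides the API; it does nothing about differences in pattern syntax between engines, which is the harder problem raised elsewhere in this thread.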
> > >>>>
> > >>>> Gary
> > >>>>
> > > >>>> On Sat, Jan 31, 2015 at 10:58 AM, Benson Margulies <bimargulies@gmail.com>
> > > >>>> wrote:
> > >>>>
> > > >>>>> So, once upon a time, there was a regex library here. It was retired,
> > > >>>>> presumably on the grounds that it was rendered obsolete by the JRE's
> > > >>>>> native support.
> > >>>>>
> > > >>>>> However, the JRE's regular expressions have a pretty severe problem;
> > > >>>>> they have unbounded (or at least, very, very bad) execution time for
> > > >>>>> some combinations of data and regex.
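For anyone who has not run into this, the sketch below times one well-known problem case with java.util.regex: a nested quantifier matched against input that almost matches. The pattern, class name and input sizes are illustrative assumptions rather than anything from the port described here, and the timings depend heavily on the JDK version (newer JDKs may handle this particular pattern better), so no expected output is given.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the chosen pattern are illustrative only.
class BacktrackingDemo {
    // Nested quantifiers: a textbook trigger for heavy backtracking on a near-miss input.
    private static final Pattern NESTED = Pattern.compile("(a+)+b");

    /** Builds n copies of 'a' with no trailing 'b', so a full match must fail. */
    static String nearMiss(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        return sb.toString();
    }

    /** Attempts the (failing) match and returns the elapsed time in nanoseconds. */
    static long timeFailedMatchNanos(int n) {
        String input = nearMiss(n);
        long start = System.nanoTime();
        boolean matched = NESTED.matcher(input).matches();
        long elapsed = System.nanoTime() - start;
        if (matched) {
            throw new IllegalStateException("unexpected match");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Watch how the time grows as the input gets only a few characters longer.
        for (int n = 10; n <= 22; n += 4) {
            System.out.printf("n=%d: %d us%n", n, timeFailedMatchNanos(n) / 1000);
        }
    }
}
```

On an engine that backtracks naively, each extra character can roughly double the work for this input, which is the behaviour a linear-time engine is meant to rule out.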
> > >>>>>
> > >>>>> To cope with this, we ported the Henry Spencer regular expression
> > >>>>> library (as found in TCL) from C to Java.
> > >>>>>
> > >>>>> Thus: https://github.com/basis-technology-corp/tcl-regex-java
> > >>>>>
> > > >>>>> Is anyone interested in this? Give or take the possible IP muddle of
> > > >>>>> the original C code, I could grant it easily.
> > >>>>>
> > >>>>>
> > > >>>>> ---------------------------------------------------------------------
> > >>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > >>>>> For additional commands, e-mail: dev-help@commons.apache.org
> > >>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> E-Mail: garydgregory@gmail.com | ggregory@apache.org
> > >>>> Java Persistence with Hibernate, Second Edition
> > >>>> <http://www.manning.com/bauer3/>
> > >>>> JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
> > >>>> Spring Batch in Action <http://www.manning.com/templier/>
> > >>>> Blog: http://garygregory.wordpress.com
> > >>>> Home: http://garygregory.com/
> > >>>> Tweet! http://twitter.com/GaryGregory
> > >>>>
> > >>>
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > > For additional commands, e-mail: dev-help@commons.apache.org
> > >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> > For additional commands, e-mail: dev-help@commons.apache.org
> >
> >
>



-- 
about.me/benmccann
