commons-dev mailing list archives

From Matt Benson <gudnabr...@gmail.com>
Subject Re: svn commit: r1407141 - /commons/proper/lang/trunk/src/main/java/org/apache/commons/lang3/StringUtils.java
Date Fri, 09 Nov 2012 21:46:04 GMT
I think we're mostly good, but just to make sure there's no confusion:

On Fri, Nov 9, 2012 at 3:05 PM, Jörg Schaible <joerg.schaible@gmx.de> wrote:

> Hi Matt,
>
> Matt Benson wrote:
>
> > On Fri, Nov 9, 2012 at 1:53 AM, Jörg Schaible
> > <Joerg.Schaible@scalaris.com> wrote:
> >
> >> Hi Greg,
> >>
> >> the pattern (also) matches a single space, which then gets replaced by a
> >> single space. Most of the replacements actually performed are therefore
> >> completely superfluous, since I expect this to be the common case. The
> >> pattern should be something along the lines of "[\\s&&[^ ]]\\s*".
> >>
> >>
> > That seems to say "a whitespace character that is not a space, optionally
> > followed by any number of whitespace characters."
>
> Right.
>
> > Wouldn't this
> > necessarily preclude any block of whitespace that *does* begin with a
> > space?
>
> Fine in the context where the pattern is actually used, since the matching
> string is trimmed first.
>

I don't agree; the string against which the pattern is applied is trimmed,
but trimming only removes leading and trailing whitespace. The pattern still
won't catch e.g. SPACE TAB embedded between non-whitespace characters.
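
To make the SPACE TAB case concrete, here is a minimal sketch. The class name and the normalize helper are made up for illustration and are not the code in StringUtils; the helper simply assumes "trim, then replace every match with a single space".

```java
import java.util.regex.Pattern;

class SpaceTabCounterexample {
    // Pattern suggested earlier in the thread: one non-space whitespace
    // character, followed by any amount of whitespace.
    static final Pattern SUGGESTED = Pattern.compile("[\\s&&[^ ]]\\s*");

    // Hypothetical helper (an assumption, not the committed method):
    // trim the input, then replace each match with a single space.
    static String normalize(String s) {
        return SUGGESTED.matcher(s.trim()).replaceAll(" ");
    }

    public static void main(String[] args) {
        // SPACE TAB embedded between two non-whitespace characters.
        String input = "a \tb";
        // The leading SPACE cannot start a match, so only the TAB is
        // replaced and two spaces remain.
        System.out.println("[" + normalize(input) + "]"); // prints "[a  b]"
    }
}
```

Trimming doesn't help here because the run of whitespace is in the middle of the string.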


>
> > This does seem to be on the right track, however.  What about:
> >
> > "(?: \\s|[\\s&&[^ ]])\\s*"
> >
> > This seems to do the right thing:  beginning with a non-capturing group
> > that matches { EITHER a space followed by a whitespace character OR a
> > whitespace character that is not a space }, optionally followed by any
> > number of whitespace characters.
>
> IMHO the capturing group is not necessary here.
>
>
*non*-capturing.  I agree, it's not necessary, but AFAIK some form of
grouping is needed to separate our alternatives from the final \\s*, and if
I hadn't included ?: to mark the group as non-capturing, it would by
default have been a capturing group.  Since we'd never use the capture I
thought it less confusing to explicitly denote such.
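
A rough sketch of why the grouping matters, again with made-up names (GroupingDemo, collapse) and assuming each match is replaced by a single space. Without the group, the alternation splits the whole expression, so the trailing \\s* attaches only to the second branch.

```java
import java.util.regex.Pattern;

class GroupingDemo {
    // The pattern proposed in this message, with the non-capturing group.
    static final Pattern GROUPED =
            Pattern.compile("(?: \\s|[\\s&&[^ ]])\\s*");

    // Same pieces without any group: this now reads as
    // (" \\s") OR ("[\\s&&[^ ]]\\s*").
    static final Pattern UNGROUPED =
            Pattern.compile(" \\s|[\\s&&[^ ]]\\s*");

    // Hypothetical helper: replace every match with a single space.
    static String collapse(Pattern p, String s) {
        return p.matcher(s).replaceAll(" ");
    }

    public static void main(String[] args) {
        String run = "a  \t b"; // SPACE SPACE TAB SPACE between a and b

        // Grouped: the whole run is one match.
        System.out.println("[" + collapse(GROUPED, run) + "]");   // prints "[a b]"

        // Ungrouped: SPACE SPACE is one match, TAB SPACE a second one,
        // so two spaces are left behind.
        System.out.println("[" + collapse(UNGROUPED, run) + "]"); // prints "[a  b]"

        // A lone space is not matched at all, so no superfluous replacement.
        System.out.println(GROUPED.matcher("a b").find());        // prints "false"

        // (?: ... ) adds no capture group to the pattern.
        System.out.println(GROUPED.matcher("").groupCount());     // prints "0"
    }
}
```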

br,
Matt


> Cheers,
> Jörg
>
>
