commons-dev mailing list archives

From sebb <>
Subject Re: [text] On the value of idempotent string escape methods?
Date Mon, 20 Feb 2017 15:30:32 GMT
On 20 February 2017 at 14:55, Rob Tompkins <> wrote:
>> On Feb 20, 2017, at 4:31 AM, sebb <> wrote:
>> On 19 February 2017 at 14:29, Raymond DeCampo <> wrote:
>>> I am trying to see how having the proposed unescape() method leads to a
>>> useful escape method.
>>> E.g. clearly unescape("&amp;") would evaluate to "&".  So would
>>> unescape("&amp;amp;").  That means the proposed escape() method would also
>>> have the same output for "&amp;" and "&amp;amp;".
>>> I think a better approach for an idempotent escape would be to just
>>> unescape the string once, and then run the traditional escape.
>> That does not eliminate the problems, as you state below.
>>> You will
>>> still have issues if the user intended to escape the string "&amp;" but you
>>> are never going to crack that without some kind of state saving.
>> That is my exact point.
>> Since it's not possible for the function to work reliably, we should
>> not mislead users by pretending that there is a magic method that
>> works.
>>> Then, given that the functionality is available via two consecutive calls to
>>> existing methods, I would probably be disinclined to include it in the
>>> library.
>> +1
> I’m a (+1) for removal as well.
> Also, I didn’t mean for my example to sound like a proposal. I was merely trying to
> get to a potentially valuable stateless idempotent string escape function. Its contrivance
> is quite clear.
> Any other comments out there?
> We could provide a stateful escaper (one that figures out how many levels of escaping a string has),
> or a method that returns the number of escapes in a string. Again, I’m not all that sure
> of the value of such methods.

I don't think it's possible to work out the number of times a string
has been escaped.

The most one can do is to determine if a string has not been escaped.
That would be the case where a string has one or more unescaped
characters in it.
For example "This & that" has obviously not been escaped.
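
A minimal sketch of that one-way check, assuming only the entities &amp;, &lt;, &gt; and &quot; are in play. The class and method names are hypothetical and are not part of Commons Text:

```java
import java.util.regex.Pattern;

final class EscapeCheck {
    // Matches a raw '<', '>' or '"', or an '&' that does not start one of
    // the four entities this sketch knows about.
    private static final Pattern RAW =
            Pattern.compile("[<>\"]|&(?!(?:amp|lt|gt|quot);)");

    // true  -> the string contains an unescaped character, so it has not been escaped.
    // false -> nothing can be concluded; the string may or may not have been escaped.
    static boolean isDefinitelyUnescaped(String s) {
        return RAW.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isDefinitelyUnescaped("This & that"));     // true
        System.out.println(isDefinitelyUnescaped("This &amp; that")); // false
    }
}
```

Note that a false result is not an answer of "escaped"; it only means the check found no evidence either way.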

However if a string has no un-escaped characters in it, that does not
necessarily mean that it has already been escaped.
For example: "This &amp; that".
This might have been escaped - or it might not.
For example it could be the answer to: "How does one code 'This &
that' in HTML?"
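
To make the ambiguity concrete, here is a rough sketch of the "unescape until nothing changes, then escape once" idea from earlier in the thread. The escaper is a toy that handles only &, < and >, and every name in it is hypothetical rather than anything in Commons Text:

```java
final class EscapeOnceDemo {
    // Toy escaper for illustration only: handles '&', '<' and '>'.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Undoes one level of escape(). "&amp;" is decoded last so that
    // "&amp;lt;" becomes "&lt;" rather than "<".
    static String unescape(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }

    // Unescape repeatedly until the string stops changing, then escape once.
    static String escapeOnce(String s) {
        String prev;
        do {
            prev = s;
            s = unescape(s);
        } while (!s.equals(prev));
        return escape(s);
    }

    public static void main(String[] args) {
        // Two different inputs: raw text, and text that is meant to show a literal "&amp;".
        System.out.println(escapeOnce("This & that"));
        System.out.println(escapeOnce("This &amp; that"));
        // What a plain escape of the second input produces.
        System.out.println(escape("This &amp; that"));
    }
}
```

The method is idempotent, but it gives the same result for both inputs, so the literal "&amp;" in the second one is lost. A plain escape() keeps them apart.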

The application has to keep track of the escape-state of the string.
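
One way an application might do that is to carry the state alongside the string. This is only a sketch: HtmlText and its methods are made-up names, the escaper is a toy that handles &, < and >, and nothing here is proposed for the library:

```java
final class HtmlText {
    private final String value;
    private final boolean escaped; // state supplied by the caller, never guessed from the content

    private HtmlText(String value, boolean escaped) {
        this.value = value;
        this.escaped = escaped;
    }

    // The caller declares that the text is raw and still needs escaping.
    static HtmlText raw(String s) {
        return new HtmlText(s, false);
    }

    // The caller declares that the text has already been escaped.
    static HtmlText alreadyEscaped(String s) {
        return new HtmlText(s, true);
    }

    // Escapes at most once; calling it again is a no-op because the state is known.
    HtmlText escape() {
        if (escaped) {
            return this;
        }
        String out = value.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        return new HtmlText(out, true);
    }

    String value() {
        return value;
    }

    boolean isEscaped() {
        return escaped;
    }
}
```

With this, HtmlText.raw("This &amp; that").escape() keeps the literal "&amp;" by producing "This &amp;amp; that", while a second escape() call leaves the result alone.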

> Cheers,
> -Rob
>>> On Sat, Feb 18, 2017 at 12:04 PM, Rob Tompkins <> wrote:
>>>> In preparation for the 1.0 release, I think we should address Sebb's
>>>> concern in TEXT-40 about the attempt to create "idempotent" string escape
>>>> methods. By idempotent I mean someMethod("some string") =
>>>> someMethod(someMethod(someMethod(...someMethod("some string")))), a
>>>> single application of a method is equal to any number of the applications
>>>> of the method on the same input.
>>>> Below I lay out a mechanism by which it is possible to write such methods,
>>>> but I don’t know the value in writing such methods. I'm merely expressing
>>>> that idempotency is a possibility.
>>>> For string "un-escaping", I believe that we can write a method that,
>>>> indeed, is idempotent by simply running the un-escape method the finite
>>>> number of un-escapings to get to the point at which the string remains
>>>> unchanged between applications of the un-escaping method. (I believe that I
>>>> can write a proof that all un-escape methods have such a point, if that is
>>>> needed for the sake of discussion).
>>>> If indeed we can create an idempotent un-escape method, then we can simply
>>>> take that method, run it, and then run the escaping method one time. If we
>>>> always completely unescape and then escape once then we do have an
>>>> idempotent method.
>>>> Such a method might not be all that valuable to the user though.
>>>> Furthermore, this just explains one way to create such an idempotent
>>>> method. Whether other, more valuable methods exist would take
>>>> some more thought.
>>>> Anyone have any thoughts? My feeling is that it might be more effort than
>>>> it's worth to ensure that any string is only “singly encoded.” Further, we
>>>> should probably take a look at the “escape_once” methods in
>>>> StringEscapeUtils.
>>>> Cheers
>>>> -Rob
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail:
>>>> For additional commands, e-mail: