httpd-dev mailing list archives

From Rainer Jung <rainer.j...@kippdata.de>
Subject Re: Expression Parser: search and replace with s/PATTERN/REPLACEMENT/FLAGS
Date Sun, 04 Oct 2015 10:46:14 GMT
Am 04.10.2015 um 10:23 schrieb Stefan Fritsch:
> On Thursday 01 October 2015 13:55:40, Rainer Jung wrote:
>> Am 01.10.2015 um 12:31 schrieb Graham Leggett:
>>> On 01 Oct 2015, at 12:26 PM, Rainer Jung <rainer.jung@kippdata.de>
>>> wrote:
>>>> Since it gets more common to use the expression parser for string
>>>> operations and not only for boolean checks, I think it would be
>>>> useful (and powerful) to support
>>>>
>>>> s/PATTERN/REPLACEMENT/FLAGS
>>>>
>>>> and allow back references in REPLACEMENT. The operation would not
>>>> try to do the replacement in place but create a new string
>>>> according to the given PATTERN and REPLACEMENT.
>>>>
>>>> I had a quick look at the flex and bison files which generate
>>>> lexer and parser but must admit that it wasn't immediately
>>>> obvious to me how to do it. I can try harder but first wanted to
>>>> ask if there are any volunteers who know that technology better
>>>> than me. Stefan (Fritsch)? Others?
>
> I don't have much time for hacking httpd. But I will take a look.
>
>>>> Otherwise I'll try myself (and learn new stuff on the way).
>>>
>>> We currently support a variation of this like this:
>>>     <LocationMatch /path/(?<PATHNAME>[^/]+)>
>>>
>>>       SomeDirective %{env:MATCH_PATHNAME}
>>>
>>>     </LocationMatch>
>>>
>>> Not sure if that’s what you had in mind, or if you’re trying to
>>> achieve something different?
>> Something different. Example:
>>
>> Header set X-USER "expr=%{REMOTE_USER} =~ s/([^@]*)@.*/$1/"
>>
>> So the string result of a s/PATTERN/REPLACEMENT/ should be the
>> resulting REPLACEMENT string (if REMOTE_USER is "name@domain", the
>> header value would be "name").
>
> This is a bit complicated because you would probably not want this to
> be a general part of the syntax of a string but rather a special case
> in the "whole expression returns string" mode. Currently there is not
> much infrastructure for behaving differently in both cases.
>
> It may be a bit easier to implement if there was something at the
> beginning of the expression to let the lexer recognize that this is
> not your normal string expression and that the " =~ " is a special
> token and not a normal substring.
>
> Maybe like "expr=%! %{REMOTE_USER} =~ s/([^@]*)@.*/$1/"
>
> But it must not be too complicated. We don't want an unreadable mess
> like the sh/bash string manipulation functions.

Yes, I agree. Thinking about it more closely, I noticed that string
mode currently supports only a syntax that is quite different from
boolean mode and much more limited. In string mode everything is a
verbatim string unless it is marked via %{XXX}, in which case XXX is a
variable name, or XXX has the form AAA:BBB, in which case it is the
function call AAA("BBB").
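
To make the string-mode rules above concrete, here is a minimal sketch
in Python (not httpd's actual flex/bison lexer). The function name
parse_string_mode and the tuple layout are hypothetical and chosen only
for illustration; the sketch also assumes that "}" never appears inside
%{...}.

```python
import re

# Hypothetical tokenizer for the string-mode rules described above.
# It is an illustration only, not httpd's real lexer.
_MARKER = re.compile(r"%\{([^}]*)\}")


def parse_string_mode(text):
    """Split text into ('lit', s), ('var', name) and ('func', name, arg)."""
    parts = []
    pos = 0
    for m in _MARKER.finditer(text):
        if m.start() > pos:
            # Everything outside %{...} is a verbatim string.
            parts.append(("lit", text[pos:m.start()]))
        body = m.group(1)
        if ":" in body:
            # %{AAA:BBB} is treated as the function call AAA("BBB").
            name, arg = body.split(":", 1)
            parts.append(("func", name, arg))
        else:
            # %{XXX} is treated as a variable name.
            parts.append(("var", body))
        pos = m.end()
    if pos < len(text):
        parts.append(("lit", text[pos:]))
    return parts


if __name__ == "__main__":
    print(parse_string_mode("user=%{REMOTE_USER} path=%{env:MATCH_PATHNAME}"))
```

Note that nothing in this grammar leaves room for an operator such as
"=~": outside of %{...} it would simply be part of a verbatim string.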

So AFAIK we don't support functions with more than one argument in
string mode, and my naive idea of using "STRING =~
s/PATTERN/REPLACEMENT/FLAGS" runs into the problem that we currently
don't support operators like "=~" in string mode.
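
Independent of how it would be parsed, the intended result of
"STRING =~ s/PATTERN/REPLACEMENT/FLAGS" can be sketched in Python as
below. This is an illustration of the semantics only, under several
assumptions: the helper names are hypothetical, "/" is the only
delimiter and is never escaped, back references are written $0 to $9,
and the flags "i" (ignore case) and "g" (replace all matches) follow
the Perl convention, which the proposal itself does not yet specify.

```python
import re


def parse_subst(spec):
    """Split 's/PATTERN/REPLACEMENT/FLAGS' into its three parts.

    Assumption: '/' is the delimiter and does not occur inside the parts.
    """
    if not spec.startswith("s/"):
        raise ValueError("not a substitution: %r" % spec)
    fields = spec[2:].split("/")
    if len(fields) != 3:
        raise ValueError("expected s/PATTERN/REPLACEMENT/FLAGS: %r" % spec)
    return fields[0], fields[1], fields[2]


def apply_subst(subject, spec):
    """Return a new string; the subject itself is never modified."""
    pattern, replacement, flags = parse_subst(spec)
    re_flags = re.IGNORECASE if "i" in flags else 0  # assumed flag
    count = 0 if "g" in flags else 1                 # assumed flag

    def expand(match):
        # Replace each $N in REPLACEMENT with capture group N.
        return re.sub(
            r"\$(\d)",
            lambda ref: match.group(int(ref.group(1))) or "",
            replacement,
        )

    return re.sub(pattern, expand, subject, count=count, flags=re_flags)


if __name__ == "__main__":
    # The REMOTE_USER example from earlier in the thread.
    print(apply_subst("name@domain", "s/([^@]*)@.*/$1/"))  # prints "name"
```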

So I wonder whether it would be useful to allow a more general mode
in which each operator or function determines whether its arguments
and result are strings or booleans, with automatic conversion between
the two where needed. Of course, in that mode verbatim strings would
need proper quoting (unlike pure string mode, in which everything is a
verbatim string by default). We could then even support

     BOOLEXPR ? STRINGEXPR1 : STRINGEXPR2
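
A rough sketch of what such auto conversion could look like, again in
Python and purely illustrative. The conversion rules used here (an
empty string is false, any other string is true, and booleans render
as "true" or "false") are assumptions made for the example; the actual
rules would still have to be decided. All function names are
hypothetical.

```python
def to_bool(value):
    """Assumed rule: a string is true unless it is empty."""
    if isinstance(value, bool):
        return value
    return value != ""


def to_string(value):
    """Assumed rule: booleans become 'true' or 'false'."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return value


def ternary(cond, if_true, if_false):
    """Evaluate COND ? IF_TRUE : IF_FALSE with a string result."""
    chosen = if_true if to_bool(cond) else if_false
    return to_string(chosen)


if __name__ == "__main__":
    remote_user = ""  # e.g. an unauthenticated request
    print(ternary(remote_user, remote_user, "anonymous"))
```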

For compatibility reasons, that generalized mode would probably need a
mode differentiator syntax in 2.4, but it could be the default mode in
trunk. Something like your "%!" prefix.

Regards,

Rainer
