httpd-users mailing list archives

From Luca Toscano <toscano.l...@gmail.com>
Subject Re: [users@httpd] mod_substitute only replaces first pattern match
Date Tue, 14 Feb 2017 15:36:43 GMT
Hi!

2017-02-06 17:25 GMT+01:00 <Uwe.Poliak@amann.com>:

> Hi,
>
> I am trying a reverse proxy server based on apache httpd v2.4 on the most
> recent release of CentOS:
>
> # httpd -version
> Server version: Apache/2.4.6 (CentOS)
> Server built:   Nov 14 2016 18:04:44
>
> # uname -a
> Linux hostname.domain.tld 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18
> 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> # cat /etc/centos-release
> CentOS Linux release 7.3.1611 (Core)
>
> Within this configuration I have to use mod_substitute to rewrite URLs
> from some applications.
> For this I am using mod_filter with the SUBSTITUTE Filter as follows:
>
>   ProxyRequests Off
>   ProxyPass /my-location https://my-server.domain.tld/
>
>   <Location /my-location/>
>     ProxyPassReverse    /my-location
>
>     FilterDeclare       AGFILTER
>
>     FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#^text/html#"
>     FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/css#"
>     FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/json#"
>     FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/javascript#"
>
>     FilterChain         AGFILTER
>
>     Substitute          "s#/(css|js|images|management|system|help)/(.*)#/my-location/$1/$2#fi"
>   </Location>
>
> It works fine if there is only one occurrence of the search pattern in a
> line in the html code. This occurrence will be replaced properly.
> However, if there are two or more occurrences of the search pattern in one
> html line, only the first one is replaced. It looks like this example:
>
> <tr><th colspan=3 nowrap></th><th colspan=3 nowrap><a
> href="overview.epl?hide=1"><img border=0 src="/my-location/images/hide.gif"
> alt=" Spalte ausblenden"></a> <a href="overview.epl?cl=1&pl=1"><img
> src="/images/right.gif" border=0 alt=" Spalte nach rechts
> schieben"></a></th><th colspan=3 nowrap><a href="overview.epl?cl=2&pl=-1"><img
> src="/images/left.gif" alt=" Spalte nach links schieben" border=0 ></a> <a
> ....
>
> Here you see: The first one is replaced, the second image URL is the same
> as before.
>
> Is this working as designed?
>

I think the issue is the (.*) in your regex. In your example it matches the
first occurrence of the pattern (like "/images/") and then consumes the rest
of the line, because .* is greedy (as far as I can see). The following
matches more than one occurrence in your example string, because it stops
after the .something extension:

(css|js|images|management|system|help)\/(\w+\.\w)

So mod_substitute seems to be working fine; the regex just needs a bit of
tuning, imo. The documentation might need to mention the greedy behavior,
but I need to triple-check that what I just said makes sense :)
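
To make the greedy behavior easier to see, here is a rough sketch using
Python's re module rather than mod_substitute itself, so treat it only as an
illustration of the regex semantics. The sample line is shortened from your
HTML, and the bounded pattern ([^"'\s>]*) is just one possible alternative
that I am assuming would suit your URLs; it is not the pattern shown above.

```python
import re

# One HTML line with two image URLs, shortened from the example in the question.
line = '<img src="/images/hide.gif"> <img src="/images/right.gif">'

# Original pattern: the greedy (.*) runs to the end of the line, so the
# whole remainder of the line ends up in group 2 and only one match is found.
greedy = re.compile(r'/(css|js|images|management|system|help)/(.*)',
                    re.IGNORECASE)

# Hypothetical tuned pattern: the character class stops at a quote,
# whitespace or '>', so each URL is matched on its own.
bounded = re.compile(r'/(css|js|images|management|system|help)/([^"\'\s>]*)',
                     re.IGNORECASE)

# Python writes backreferences as \1 and \2 where mod_substitute uses $1 and $2.
replacement = r'/my-location/\1/\2'

greedy_result = greedy.sub(replacement, line)
bounded_result = bounded.sub(replacement, line)

print(greedy_result)   # only the first URL gets the prefix
print(bounded_result)  # both URLs get the prefix
```

If the same holds for the regex engine httpd uses, bounding the second group
in the Substitute pattern should let every URL on a line be rewritten.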

Hope that helps!

Luca
