httpd-users mailing list archives

From <Uwe.Pol...@amann.com>
Subject RE: [users@httpd] mod_substitute only replaces first pattern match
Date Thu, 09 Mar 2017 01:08:58 GMT
Hi Luca,

sorry for my late reply, but your suggestion worked well!
It was really a problem with the (.*) pattern.

Kind regards
Uwe


From: Luca Toscano [mailto:toscano.luca@gmail.com] 
Sent: Tuesday, February 14, 2017 4:37 PM
To: users@httpd.apache.org
Subject: Re: [users@httpd] mod_substitute only replaces first pattern match

Hi!

2017-02-06 17:25 GMT+01:00 <Uwe.Poliak@amann.com>:
Hi,

I am setting up a reverse proxy server based on Apache httpd 2.4 on the most recent release of
CentOS:

# httpd -version
Server version: Apache/2.4.6 (CentOS)
Server built:   Nov 14 2016 18:04:44

# uname -a
Linux hostname.domain.tld 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

# cat /etc/centos-release
CentOS Linux release 7.3.1611 (Core)

Within this configuration I have to use mod_substitute to rewrite URLs from some applications.
For this I am using mod_filter with the SUBSTITUTE Filter as follows:

  ProxyRequests Off
  ProxyPass /my-location https://my-server.domain.tld/

  <Location /my-location/>
    ProxyPassReverse    /my-location

    FilterDeclare       AGFILTER

    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#^text/html#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/css#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/json#"
    FilterProvider      AGFILTER SUBSTITUTE "%{resp:Content-Type} =~ m#.*/javascript#"

    FilterChain         AGFILTER

    Substitute          "s#/(css|js|images|management|system|help)/(.*)#/my-location/$1/$2#fi"
  </Location>

It works fine if there is only one occurrence of the search pattern in a line of the HTML
code; that occurrence is replaced properly.
However, if there are two or more occurrences of the search pattern in one HTML line, only
the first one is replaced, as in this example:

<tr><th colspan=3 nowrap></th><th colspan=3 nowrap><a href="overview.epl?hide=1"><img
border=0 src="/my-location/images/hide.gif" alt=" Spalte ausblenden"></a> <a href="overview.epl?cl=1&pl=1"><img
src="/images/right.gif" border=0 alt=" Spalte nach rechts schieben"></a></th><th
colspan=3 nowrap><a href="overview.epl?cl=2&pl=-1"><img src="/images/left.gif"
alt=" Spalte nach links schieben" border=0 ></a> <a ....

As you can see, the first image URL is rewritten, while the second one is the same as before.

Is this working as designed?

I think that the issue is in the (.*) of your regex. In your example it will match the first
occurrence of the pattern (like "images/") and will then consume all the remaining characters
on the line (greedy behavior, as far as I can see). The following matches more than one
occurrence in your example string, because it checks for the .something extension:

(css|js|images|management|system|help)\/(\w+\.\w)

So mod_substitute seems to be working fine; the regex needs a bit of tuning, in my opinion. The
documentation might need to mention the greedy behavior, but I need to triple check that what
I just said makes sense :)
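
The difference between the two patterns can be reproduced outside httpd. The sketch below
uses Python's re module, which is an assumption made for illustration: httpd uses its own
PCRE-based regex support, but a greedy (.*) should behave the same way in this case. The
sample line is shortened from the HTML above, and the names LINE, GREEDY, BOUNDED and
rewrite are hypothetical.

```python
import re

# One line of backend HTML, shortened from the example in the thread.
LINE = (
    '<img border=0 src="/images/hide.gif" alt=" Spalte ausblenden"></a> '
    '<img src="/images/right.gif" border=0></a> '
    '<img src="/images/left.gif" border=0 >'
)

PREFIXES = r"(css|js|images|management|system|help)"

# Pattern from the original Substitute directive: (.*) is greedy and
# runs to the end of the line, so the whole tail becomes one match.
GREEDY = re.compile(r"/" + PREFIXES + r"/(.*)", re.IGNORECASE)

# Pattern suggested in the reply: stop after "name.x" instead of
# consuming the rest of the line.
BOUNDED = re.compile(r"/" + PREFIXES + r"/(\w+\.\w)", re.IGNORECASE)

# Python spelling of the /my-location/$1/$2 replacement.
REPLACEMENT = r"/my-location/\1/\2"


def rewrite(pattern, line):
    """Return the rewritten line and the number of substitutions made."""
    return pattern.subn(REPLACEMENT, line)


greedy_line, greedy_count = rewrite(GREEDY, LINE)
bounded_line, bounded_count = rewrite(BOUNDED, LINE)

print(greedy_count)   # 1: only the first URL is rewritten
print(bounded_count)  # 3: every URL on the line is rewritten
print(bounded_line)
```

With the greedy pattern the second and third image URLs end up inside the $2 capture, so
they are copied back unchanged, which matches the behavior reported above.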

Hope that helps!

Luca

 
