httpd-dev mailing list archives

From Yann Ylavic <ylavic....@gmail.com>
Subject Re: [PATCH] Fix settings options with ProxyPassMatch
Date Tue, 29 Apr 2014 22:53:40 GMT
On Tue, Apr 29, 2014 at 3:51 PM, Jim Jagielski <jim@jagunet.com> wrote:
> On Apr 29, 2014, at 8:41 AM, Jan Kaluža <jkaluza@redhat.com> wrote:
>>
>> Because later we have to match the URL of request with some proxy_worker.
>>
>> If you configure ProxyPassMatch like this:
>> ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>>
>> Then the proxy_worker name would be "http://x/$1/foo.jpg".
>>
>> If you receive a request with URL "http://x/something/foo.jpg",
>> ap_proxy_get_worker() will have to find the worker named
>> "http://x/$1/foo.jpg". The question here is how it would do that.
>>
>> The answer used in the patch is "we change the worker name to
>> http://x/*/foo.jpg" and check whether the URL
>> ("http://x/something/foo.jpg" in our case) matches that worker.
>>
>> If we store the original name with $N, we will have to find a
>> different way to match the worker (probably by emulating wildcard
>> pattern matching).
>>
>> It would be possible to store only the original name (with "$N"
>> variables), store the flag that the proxy worker is using regex and
>> change ap_proxy_strcmp_ematch() function to treat "$N" as "*", but I
>> don't see any real advantage here.
>>
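
[Editorial sketch] The "$N -> *" idea described above can be
illustrated roughly as follows. This is only a minimal sketch: the
helper names backrefs_to_wildcards() and wildcard_match() are
hypothetical, single-digit $N is assumed, and this is not the code of
the patch or of ap_proxy_strcmp_ematch().

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `name` into `out`, replacing each "$N"
 * (N = one decimal digit) with "*". Returns 0 on success, -1 if `out`
 * is too small. */
static int backrefs_to_wildcards(const char *name, char *out, size_t outlen)
{
    size_t o = 0;

    assert(name != NULL && out != NULL);
    while (*name) {
        if (o + 1 >= outlen)          /* keep room for the final NUL */
            return -1;
        if (name[0] == '$' && isdigit((unsigned char)name[1])) {
            out[o++] = '*';
            name += 2;
        } else {
            out[o++] = *name++;
        }
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}

/* Hypothetical matcher: returns 1 if `str` matches `pattern`, where
 * '*' in the pattern matches any run of characters (including none). */
static int wildcard_match(const char *pattern, const char *str)
{
    const char *star = NULL;   /* last '*' seen in the pattern */
    const char *retry = NULL;  /* where in str that '*' started matching */

    assert(pattern != NULL && str != NULL);
    while (*str) {
        if (*pattern == '*') {
            star = pattern++;
            retry = str;
        } else if (*pattern == *str) {
            pattern++;
            str++;
        } else if (star != NULL) {
            /* Backtrack: let the last '*' swallow one more character. */
            pattern = star + 1;
            str = ++retry;
        } else {
            return 0;
        }
    }
    while (*pattern == '*')
        pattern++;
    return *pattern == '\0';
}
```

With these, "http://x/$1/foo.jpg" becomes "http://x/*/foo.jpg", which
then matches "http://x/something/foo.jpg" but not
"http://x/something/bar.jpg".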
>
> In Yann's suggested patch we don't store match_name where it
> belongs; so we'd need to put it in shm, which means more
> memory.

Agreed, plus this is not balancer-manager aware.

BTW, what's the difference between alias_match() used by proxy_trans()
and ap_proxy_get_worker()? Longest match?
Can an entry matched by proxy_trans() *not* belong to the worker
returned later by ap_proxy_get_worker()?
If not, another solution would be to backreference the worker in each
of its struct proxy_alias entries.
That way the worker would be already known at proxy_trans() time (when
the entry is matched), and a new ap_proxy_get_worker_for_request(r)
could do the association later.
AFAICT, we don't use ap_proxy_get_worker() at runtime without a
request_rec available.
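
[Editorial sketch] The backreference idea above could look roughly
like this. Everything here is hypothetical: the sketch_* types are
simplified stand-ins rather than httpd's proxy_alias, proxy_worker or
request_rec, and a plain prefix comparison stands in for the regex
match done at proxy_trans() time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins, NOT the real httpd structures. */
struct sketch_worker {
    const char *name;
};

struct sketch_alias {
    const char *prefix;            /* stand-in for the compiled regex */
    struct sketch_worker *worker;  /* the proposed backreference */
};

struct sketch_request {
    const char *uri;
    const struct sketch_alias *matched;  /* remembered at "trans" time */
};

/* Stand-in for proxy_trans(): remember which alias entry matched.
 * Returns 1 if an entry matched, 0 otherwise. */
static int sketch_trans(struct sketch_request *r,
                        const struct sketch_alias *aliases, size_t n)
{
    size_t i;

    assert(r != NULL && r->uri != NULL);
    for (i = 0; i < n; i++) {
        size_t len = strlen(aliases[i].prefix);
        if (strncmp(r->uri, aliases[i].prefix, len) == 0) {
            r->matched = &aliases[i];
            return 1;
        }
    }
    return 0;
}

/* Stand-in for the proposed ap_proxy_get_worker_for_request(r):
 * no second URL match is needed, just follow the backreference. */
static struct sketch_worker *
sketch_get_worker_for_request(const struct sketch_request *r)
{
    assert(r != NULL);
    return r->matched ? r->matched->worker : NULL;
}
```

The point of the design is that the worker lookup becomes a pointer
dereference, at the cost of requiring a request to be available.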

At least that could work for the *Match workers, for which the only
relevant match against the requested URL is the one from
proxy_trans(), imo.

Still another solution for these workers would be to reuse the
ap_regmatch_t vector from proxy_trans() to exactly match the worker's
name (with each of its zero or more $N replaced by the substring at
the offsets in vector[N], like ap_expr_str_exec_re() does).
That would also require a request_rec to be available when
ap_proxy_get_worker() runs, though.
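
[Editorial sketch] The substitution step could be sketched as below.
It uses POSIX regmatch_t as a stand-in for ap_regmatch_t (assumed here
to carry the same rm_so/rm_eo offsets), and expand_backrefs() is a
hypothetical helper handling single-digit $N only, not an existing
httpd function.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expand each "$N" in `tmpl` with capture N from
 * `pmatch`, whose offsets refer to `src` (the string that was matched).
 * Returns 0 on success, -1 if `out` is too small. */
static int expand_backrefs(const char *tmpl, const char *src,
                           const regmatch_t *pmatch, size_t nmatch,
                           char *out, size_t outlen)
{
    size_t o = 0;

    assert(tmpl != NULL && src != NULL && pmatch != NULL && out != NULL);
    while (*tmpl) {
        if (tmpl[0] == '$' && isdigit((unsigned char)tmpl[1])) {
            size_t n = (size_t)(tmpl[1] - '0');
            size_t len;

            tmpl += 2;
            if (n >= nmatch || pmatch[n].rm_so < 0)
                continue;  /* unset group expands to nothing */
            len = (size_t)(pmatch[n].rm_eo - pmatch[n].rm_so);
            if (o + len >= outlen)   /* keep room for the final NUL */
                return -1;
            memcpy(out + o, src + pmatch[n].rm_so, len);
            o += len;
        } else {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = *tmpl++;
        }
    }
    if (o >= outlen)
        return -1;
    out[o] = '\0';
    return 0;
}
```

The expanded string can then be compared with strcmp() against the
URL being looked up. Note that POSIX regexes have no \d, so a test of
this sketch would use [0-9]+ in place of the (\d+) in the examples.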

> Instead, we store as is and add a simple char flag
> which sez if the stored name is a regex. Much savings.
>
> And I have no idea why storing with $1 -> * somehow makes
> things easier or implies a "different way how to match the worker".

Do we need to provide a way to escape a legitimate (application-level)
$N in the worker name, or simply document the limitation?
In the latter case this is indeed much simpler.

>
> Finally, let's think about this deeper...
>
> Assume we do have
>
>         ProxyPassMatch ^/test/(\d+)/foo.jpg http://x/$1/foo.jpg
>         ProxyPassMatch ^/zippy/(\d+)/bar.jpg http://x/$1/omar/propjoe.gif
>
> is the intent/desire to have 2 workers or 1? A worker is, in
> some ways, simply a nickname for the socket related to a host and port.

For which connections can be reused, different parameters apply...

> Maybe, in the interests of efficiency and speed, since regexes
> are slow as it is, a condition could be specified (a limitation,
> as it were), that when using PPM, only everything up to
> the 1st potential substitution is considered a unique worker.

That could be (another) limitation.
But one may want to apply different parameters to these somehow
different URLs, since they may be different backends/applications too.
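
[Editorial sketch] The limitation discussed above (everything up to
the first potential substitution identifies the worker) could be
expressed as below. The helpers worker_key_len() and same_worker() are
hypothetical and assume single-digit $N; they are not part of
mod_proxy.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the part of `name` that precedes the
 * first "$N" (or the whole length if there is no substitution). */
static size_t worker_key_len(const char *name)
{
    const char *p;

    assert(name != NULL);
    for (p = name; *p; p++) {
        if (p[0] == '$' && isdigit((unsigned char)p[1]))
            break;
    }
    return (size_t)(p - name);
}

/* Two names designate the same worker if their keys are identical. */
static int same_worker(const char *a, const char *b)
{
    size_t la = worker_key_len(a);
    size_t lb = worker_key_len(b);

    return la == lb && strncmp(a, b, la) == 0;
}
```

Under this rule the two ProxyPassMatch targets quoted earlier,
"http://x/$1/foo.jpg" and "http://x/$1/omar/propjoe.gif", share the
key "http://x/" and so collapse into one worker, which is exactly why
per-URL parameters could no longer differ between them.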
