httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ruediger Pluem <rpl...@apache.org>
Subject Re: error_time reset in proxy_util.c
Date Fri, 06 Mar 2015 15:25:35 GMT


On 03/05/2015 06:00 PM, Ruediger Pluem wrote:
> 
> 
> On 03/05/2015 05:29 PM, Eric Covener wrote:
>> On Thu, Mar 5, 2015 at 11:14 AM, Ruediger Pluem <rpluem@apache.org> wrote:
>>> I suspect that the worker was already set to error by a parallel thread / process
and hence
>>> PROXY_WORKER_IS_USABLE(worker) is false and causes worker->s->error_time
to be reset which causes the worker to be open
>>> for retry immediately. This has been the case since r104624
>>> (http://svn.apache.org/viewvc?view=revision&revision=r104624) 10,5 years
ago and the commit messages offers no hint at
>>> least to be why we reset these values.
>>> Can anybody think of a good reason why we do this?
>>
>> that sequence sounds plausible
> 
> So do you see any reason why we do this, or should we just stop doing so?
> 
> Another thing:
> 
>     else {
>         if (worker->s->retries) {
>             /*
>              * A worker came back. So here is where we need to
>              * either reset all params to initial conditions or
>              * apply some sort of aging
>              */
>         }
> 
> The comment makes a wrong assumption as we might be here despite the connection failed
because the worker was already in
> error. So we might need to split into two ifs above and separate the conditions where
the connection failed and where we
> need to do something with the worker because it wasn't in error yet. How about:
> 

Index: proxy_util.c
===================================================================
--- proxy_util.c        (revision 1664261)
+++ proxy_util.c        (working copy)
@@ -2887,14 +2887,15 @@

         connected    = 1;
     }
+    if (PROXY_WORKER_IS_USABLE(worker)) {
     /*
      * Put the entire worker to error state if
      * the PROXY_WORKER_IGNORE_ERRORS flag is not set.
      * Altrough some connections may be alive
      * no further connections to the worker could be made
      */
-    if (!connected && PROXY_WORKER_IS_USABLE(worker) &&
-        !(worker->s->status & PROXY_WORKER_IGNORE_ERRORS)) {
+        if (!connected) {
+            if (!(worker->s->status & PROXY_WORKER_IGNORE_ERRORS)) {
         worker->s->error_time = apr_time_now();
         worker->s->status |= PROXY_WORKER_IN_ERROR;
         ap_log_error(APLOG_MARK, APLOG_ERR, 0, s, APLOGNO(00959)
@@ -2902,6 +2903,7 @@
             APR_TIME_T_FMT "s",
             worker->s->hostname, apr_time_sec(worker->s->retry));
     }
+        }
     else {
         if (worker->s->retries) {
             /*
@@ -2915,6 +2917,19 @@
     }
     return connected ? OK : DECLINED;
 }
+    else {
+        /*
+         * The worker is in error likely done by a different thread / process
+         * e.g. for a timeout or bad status. We should respect this and should
+         * not continue with a connection via this worker even if we got one.
+         */
+        if (connected) {
+            apr_socket_close(conn->sock );
+            conn->sock = NULL;
+        }
+        return DECLINED;
+    }
+}

 static apr_status_t connection_shutdown(void *theconn)
 {


The above patch fixes my issue (same one as in previous post but without whitespace changes
for easier reading).
Objections? Otherwise I would commit.

Regards

RĂ¼diger

Mime
View raw message