apr-dev mailing list archives

From Ruediger Pluem <rpl...@apache.org>
Subject Re: showstopper to 1.3.1?
Date Sat, 14 Jun 2008 21:24:43 GMT


On 06/14/2008 10:42 PM, William A. Rowe, Jr. wrote:
> Guys, if anyone is looking at this, I'll hold off from tagging a bit 
> longer,
> as I'd rather have apr-1.3.1 address all the platform quirks we identified
> in preparing 2.2.9 for release.  But if I hear nothing, I'll have to 
> just move ahead :)
> 
> Bill
> 
> Paul Querna wrote:
>>
>> On aurora.apache.org, shortly after installing the new version, we hit 
>> a problem with apr_pollset_poll:
>>
>> [Thu Jun 12 05:36:51 2008] [error] (70007)The timeout specified has 
>> expired: apr_pollset_poll: (listen)
>> [Thu Jun 12 05:36:52 2008] [notice] caught SIGTERM, shutting down
>>
>> If you look in worker.c, around line 687, you can see that we do a 
>> graceful shutdown if we get an unexpected error from apr_pollset_poll.
>>
>> This appears to be a regression caused by r641661:
>> https://svn.apache.org/viewvc?view=rev&revision=641661
>>
>> Which was a fix for PR 42580: 
>> https://issues.apache.org/bugzilla/show_bug.cgi?id=42580
>>
>> This appears to be a relative edge case on Solaris 10 -- it hasn't 
>> happened again, and it is a regression in APR, but relatively small, 
>> so I am still +1 for httpd-2.2.9 shipping.

Is this really a regression in APR or were we just as lucky before as we
were after?

Code from httpd:

                 rv = apr_pollset_poll(pollset, -1, &numdesc, &pdesc);
                 if (rv != APR_SUCCESS) {
                     if (APR_STATUS_IS_EINTR(rv)) {
                         continue;
                     }

                     /* apr_pollset_poll() will only return errors in catastrophic
                      * circumstances. Let's try exiting gracefully, for now. */
                     ap_log_error(APLOG_MARK, APLOG_ERR, rv,
                                  (const server_rec *) ap_server_conf,
                                  "apr_pollset_poll: (listen)");


So we get the error message logged if apr_pollset_poll returns anything other than
APR_SUCCESS or APR_EINTR.

So let's have a look at r641661:

--- apr/apr/trunk/poll/unix/port.c	2008/03/27 00:31:21	641660
+++ apr/apr/trunk/poll/unix/port.c	2008/03/27 00:46:05	641661
@@ -295,12 +295,7 @@

      if (ret == -1) {
          (*num) = 0;
-        if (errno == ETIME || errno == EINTR) {
-            rv = APR_TIMEUP;
-        }
-        else {
-            rv = APR_EGENERAL;
-        }
+        rv = apr_get_netos_error();
      }
      else if (nget == 0) {
          rv = APR_TIMEUP;

So the old code said that if port_getn returns -1 (i.e. fails), we return APR_TIMEUP
if the error is ETIME or EINTR, and APR_EGENERAL otherwise.
So IMHO the error message (in this case IMHO the same one) would have been shown with the old
code as well.
What is stranger to me is that we get a timeout error ((70007)The timeout specified has
expired: apr_pollset_poll:) even though we called apr_pollset_poll with -1 as the timeout, which
means wait indefinitely, i.e. no timeout. The implementation of apr_pollset_poll seems to be
correct, as it ensures that we pass NULL to port_getn in this case. But OTOH the man page
for port_get / port_getn documents the timeout behaviour only for port_get (setting the timeout
parameter to NULL means no timeout), not for port_getn. So couldn't this be a Solaris bug?
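
To make the old-vs-new comparison concrete, here is a minimal standalone sketch of
the two mappings and of the check in httpd. It is not the APR source: the ST_*
names and their values, the three function names, and the ETIME fallback value are
stand-ins invented for illustration. It also assumes that apr_get_netos_error()
yields plain errno on Unix and that APR_STATUS_IS_EINTR(rv) is a plain comparison
with EINTR.

```c
#include <errno.h>

#ifndef ETIME
#define ETIME 62   /* hypothetical fallback if errno.h lacks ETIME */
#endif

/* Illustrative stand-ins for APR status codes; not taken from apr_errno.h. */
enum { ST_SUCCESS = 0, ST_EGENERAL = 20014, ST_TIMEUP = 70007 };

/* Mapping before r641661: ETIME and EINTR both became a timeout status,
 * every other errno became a general error. */
static int status_before(int err)
{
    return (err == ETIME || err == EINTR) ? ST_TIMEUP : ST_EGENERAL;
}

/* Mapping after r641661, assuming apr_get_netos_error() is plain errno. */
static int status_after(int err)
{
    return err;
}

/* Model of the httpd check: retry on EINTR, log any other failure.
 * Returns 1 if the "apr_pollset_poll: (listen)" error would be logged. */
static int caller_logs_error(int rv)
{
    return rv != ST_SUCCESS && rv != EINTR;
}
```

Under these assumptions, caller_logs_error(status_before(ETIME)) and
caller_logs_error(status_after(ETIME)) are both 1, so a failing port_getn with
ETIME ends up in the error branch with either version. In this model the only
errno the caller treats differently is EINTR, which the old mapping turned into a
timeout status and the new one passes through so that the loop can retry.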

Regards

Rüdiger


