Message-ID: <4854371B.4040203@apache.org>
Date: Sat, 14 Jun 2008 23:24:43 +0200
From: Ruediger Pluem
To: APR Developer List
Cc: dev@httpd.apache.org
Subject: Re: showstopper to 1.3.1?
References: <2A740477-F3E3-4C9C-A1EC-5A33AB76FBFD@jaguNET.com> <4850BE97.7030609@force-elite.com> <48542D1B.6040904@rowe-clan.net>
In-Reply-To: <48542D1B.6040904@rowe-clan.net>

On 06/14/2008 10:42 PM, William A. Rowe, Jr. wrote:
> Guys, if anyone is looking at this, I'll hold off from tagging a bit
> longer, as I'd rather have apr-1.3.1 address all the platform quirks we
> identified in preparing 2.2.9 for release. But if I hear nothing, I'll
> have to just move ahead :)
>
> Bill
>
> Paul Querna wrote:
>>
>> On aurora.apache.org, shortly after installing the new version, we hit
>> a problem with apr_pollset_poll:
>>
>> [Thu Jun 12 05:36:51 2008] [error] (70007)The timeout specified has
>> expired: apr_pollset_poll: (listen)
>> [Thu Jun 12 05:36:52 2008] [notice] caught SIGTERM, shutting down
>>
>> If you look in worker.c, around line 687, you can see that if do a
>> graceful shutdown if we get an unexpected error from apr_pollset_poll.
>>
>> This appears to be a regression caused by r641661:
>> https://svn.apache.org/viewvc?view=rev&revision=641661
>>
>> Which was a fix for PR 42580:
>> https://issues.apache.org/bugzilla/show_bug.cgi?id=42580
>>
>> This appears to be an relative edge case on Solaris 10 -- it hasn't
>> happened again, and it is a regression in APR, but relatively small,
>> so I am still +1 for httpd-2.2.9 shipping.

Is this really a regression in APR, or were we just as lucky before as we
were after?

Code from httpd:

            rv = apr_pollset_poll(pollset, -1, &numdesc, &pdesc);
            if (rv != APR_SUCCESS) {
                if (APR_STATUS_IS_EINTR(rv)) {
                    continue;
                }

                /* apr_pollset_poll() will only return errors in catastrophic
                 * circumstances. Let's try exiting gracefully, for now.
                 */
                ap_log_error(APLOG_MARK, APLOG_ERR, rv,
                             (const server_rec *) ap_server_conf,
                             "apr_pollset_poll: (listen)");

So we get the error message logged if apr_pollset_poll returns anything
other than APR_SUCCESS or APR_EINTR.

So let's have a look at r641661:

--- apr/apr/trunk/poll/unix/port.c	2008/03/27 00:31:21	641660
+++ apr/apr/trunk/poll/unix/port.c	2008/03/27 00:46:05	641661
@@ -295,12 +295,7 @@
     if (ret == -1) {
         (*num) = 0;
-        if (errno == ETIME || errno == EINTR) {
-            rv = APR_TIMEUP;
-        }
-        else {
-            rv = APR_EGENERAL;
-        }
+        rv = apr_get_netos_error();
     }
     else if (nget == 0) {
         rv = APR_TIMEUP;

So the code before said that if port_getn returns -1 (== fails), we return
APR_TIMEUP if the error is ETIME or EINTR, and APR_EGENERAL otherwise.
So IMHO the error message (in this case IMHO the same one) would have been
shown with the old code as well.

What is more strange to me is that we get a timeout error
((70007)The timeout specified has expired: apr_pollset_poll:) even though
we called apr_pollset_poll with -1 as the timeout, which means wait
indefinitely, i.e. no timeout.
The implementation of apr_pollset_poll seems to be correct, as it ensures
that we supply NULL to port_getn in this case. But OTOH the man page for
port_get / port_getn documents the timeout behaviour only for port_get
(setting the timeout parameter to NULL means no timeout), not for port_getn.
So couldn't this be a Solaris bug?

Regards

Rüdiger
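To make the old-versus-new comparison above concrete, here is a minimal sketch in plain C. It is not APR or httpd code: old_status(), new_status() and worker_would_log() are hypothetical helpers, and the SK_* constants are stand-ins for the APR status codes so that it builds without the APR headers. It assumes that on Unix apr_get_netos_error() is plain errno and that APR_STATUS_IS_EINTR() is a comparison against EINTR.

```c
#include <errno.h>

#ifndef ETIME
#define ETIME 62 /* fallback where ETIME is missing; the value is illustrative */
#endif

/* Stand-ins for APR status codes (assumption: not taken from apr_errno.h). */
enum {
    SK_SUCCESS  = 0,
    SK_TIMEUP   = 70007, /* matches the (70007) in the logged message */
    SK_EGENERAL = 20014  /* illustrative stand-in for APR_EGENERAL */
};

/* Error branch before r641661, taken when port_getn() returned -1:
 * ETIME and EINTR both become a timeout, anything else a general error. */
static int old_status(int err)
{
    return (err == ETIME || err == EINTR) ? SK_TIMEUP : SK_EGENERAL;
}

/* Error branch after r641661: the OS error is passed through unchanged
 * (assumption: apr_get_netos_error() is plain errno on Unix). */
static int new_status(int err)
{
    return err;
}

/* Models the worker.c check quoted above: success and EINTR are tolerated,
 * every other status is logged and leads to the graceful shutdown. */
static int worker_would_log(int rv)
{
    if (rv == SK_SUCCESS) {
        return 0;
    }
    if (rv == EINTR) { /* models APR_STATUS_IS_EINTR(rv) */
        return 0;
    }
    return 1;
}
```

Under these assumptions, an ETIME failure from port_getn reaches the ap_log_error() branch with either version of port.c. Only the EINTR case differs: the old code turned it into APR_TIMEUP, which httpd would log, while the new code lets httpd see EINTR and continue.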