Date: Tue, 27 Nov 2007 11:45:00 -0800 (PST)
From: Erik Lotspeich
To: Henry Jen
cc: dev@apr.apache.org
Subject: Re: Please read: Fix for APR bug
In-Reply-To:
 <369011d00711262247m32718dc6q5c73cb13f09e8081@mail.gmail.com>
References: <200711242317.31539.erik@lotspeich.org>
 <369011d00711262247m32718dc6q5c73cb13f09e8081@mail.gmail.com>

Hi Henry,

Thank you very much for your response. I believe you are correct that an
error code would fail to generate an APR_CHILD_DONE status even though the
process may no longer be running. My fix would not be valid in that case.

I will continue to debug the problem, with suspicion placed on
apr_proc_wait as the culprit. I'll post another patch once I have
completed this task.

Regards,

Erik.

On Mon, 26 Nov 2007, Henry Jen wrote:

> On Nov 26, 2007 4:25 PM, Erik Lotspeich wrote:
>> Hi all:
>>
>> I've posted a few messages regarding this topic. I believe that there
>> must be someone interested in this topic who originally wrote
>> apr_pools.c or has worked with apr_pool_note_subprocess(). If not, then I
>> suppose that would explain the lack of interest in this problem.
>>
>
> It can be hard sometimes to get the attention you would like to have,
> especially when others are not experiencing the problem immediately.
> I read the thread; however, I have not spent time in this area, so I
> don't think I am qualified to reply.
>
>> To summarize, I believe that there is a bug either in apr_pools.c (in the
>> function free_proc_chain()) or in apr_proc_wait(). I posted a patch to
>> this list with a fix that does work. I didn't begin to dive into
>> apr_proc_wait() (which may be the true source of the problem) until I got
>> some feedback from the developers on this list.
>>
>
> After a little study, I guess the question is that when the return
> value of apr_proc_wait is not APR_CHILD_NOTDONE, does it guarantee
> there is a subprocess still running? My reading of the waitpid man page
> on Solaris does not indicate that.
>
> So, flipping the condition may solve your problem but may cause other issues.
>
> Anyhow, others may know better than I do.
>
> Cheers,
> Henry
>
>> Any response would be greatly appreciated.
>>
>> Regards,
>>
>> Erik.
>>
>> On Sat, 24 Nov 2007, Erik Lotspeich wrote:
>>
>>> Hi:
>>>
>>> I continued my investigation as to the reason why apr_pool_note_subprocess()
>>> is not working for me. It seems that there is a bug in apr_pools.c in the
>>> function free_proc_chain(). Here's my patch:
>>>
>>> --- apr_pools.c 2007-11-24 23:06:12.000000000 -0800
>>> +++ apr_pools.c+ 2007-11-24 23:06:01.000000000 -0800
>>> @@ -2118,7 +2118,7 @@
>>>  #ifndef NEED_WAITPID
>>>      /* Pick up all defunct processes */
>>>      for (pc = procs; pc; pc = pc->next) {
>>> -        if (apr_proc_wait(pc->proc, NULL, NULL, APR_NOWAIT) != APR_CHILD_NOTDONE)
>>> +        if (apr_proc_wait(pc->proc, NULL, NULL, APR_NOWAIT) == APR_CHILD_DONE)
>>>              pc->kill_how = APR_KILL_NEVER;
>>>      }
>>>  #endif /* !defined(NEED_WAITPID) */
>>>
>>> It may be true that apr_proc_wait is the original source of the problem. In
>>> theory, both the previous and patched versions of the code above are
>>> equivalent. In reality, the doxygen documentation is incomplete:
>>> apr_proc_wait can return codes other than APR_CHILD_DONE and
>>> APR_CHILD_NOTDONE. In my case, the process to be checked was actually
>>> running, but an APR_CHILD_NOTDONE code wasn't being returned!
>>>
>>> Any questions or comments would be appreciated.
>>>
>>> Regards,
>>>
>>> Erik.
>>>
>>
>
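
P.S. For anyone following the thread, the difference between the two tests can be sketched with plain POSIX waitpid() instead of APR. This is only an illustration under stated assumptions, not the APR implementation: the child_state enum and the functions poll_child(), skip_kill_original() and skip_kill_patched() are hypothetical names, and treating waitpid() results of 0, pid and -1 as rough analogues of APR_CHILD_NOTDONE, APR_CHILD_DONE and "some other error code" is my assumption.

```c
/* Ask for the POSIX declarations (fork, kill, waitpid) explicitly. */
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical three-way result; these names are not part of APR. */
typedef enum {
    CHILD_RUNNING, /* waitpid() returned 0: child exists and has not terminated */
    CHILD_DONE,    /* waitpid() returned the pid: child terminated and was reaped */
    CHILD_ERROR    /* waitpid() returned -1: for example ECHILD, nothing to wait for */
} child_state;

/* Non-blocking poll of one child using WNOHANG. */
child_state poll_child(pid_t pid)
{
    pid_t r;

    assert(pid > 0); /* only a single, specific child is polled here */
    do {
        r = waitpid(pid, NULL, WNOHANG);
    } while (r == -1 && errno == EINTR); /* retry if interrupted by a signal */

    if (r == 0)
        return CHILD_RUNNING;
    if (r == pid)
        return CHILD_DONE;
    return CHILD_ERROR;
}

/* Shape of the original test: anything other than "running" skips the kill. */
int skip_kill_original(child_state s)
{
    return s != CHILD_RUNNING;
}

/* Shape of the patched test: only a confirmed "done" skips the kill. */
int skip_kill_patched(child_state s)
{
    return s == CHILD_DONE;
}
```

The two predicates agree for CHILD_RUNNING and CHILD_DONE and differ only for CHILD_ERROR, which is the case Henry raised: an error result by itself says nothing about whether the process is still running.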