Mailing-List: contact dev-help@spamassassin.apache.org; run by ezmlm
Precedence: bulk
Received-SPF: pass (hermes.apache.org: domain of
 apache@bugzilla.spamassassin.org designates 64.142.3.173 as permitted sender)
From: bugzilla-daemon@bugzilla.spamassassin.org
To: dev@spamassassin.apache.org
Subject: [Bug 3743] Spamd not cleaning up defunct processes
Message-Id: <20040903011602.CBC28840D6@bugzilla.spamassassin.org>
Date: Thu,  2 Sep 2004 18:16:02 -0700 (PDT)

http://bugzilla.spamassassin.org/show_bug.cgi?id=3743


------- Additional Comments From felicity@kluge.net  2004-09-02 18:16 -------
Subject: Re:  Spamd not cleaning up defunct processes

On Thu, Sep 02, 2004 at 05:53:04PM -0700, bugzilla-daemon@bugzilla.spamassassin.org wrote:
> We suspect the problem has to do with a SIGCHLD being received whilst the spamd
> process is already in the wait state cleaning up another SIGCHLD signal; hence
> the second SIGCHLD is missed. We have made the following changes to use the
> waitpid call instead which will reap all children that need a cleanup. We are
> currently testing the change and will report back if it fixes the problem.

Hrm, ok.  As an FYI, child_handler (and the rest of the pre-fork stuff)
was essentially pulled right out of the perlipc doc.  Looking at the
code, it does seem like we ought to wait until the end to reset the
handler too...

>  sub child_handler {
>    my ($sig) = @_;
>    $SIG{CHLD} = \&child_handler;    # reset as necessary

Yeah, there...  According to the perlipc page, there's a SysV issue
which means the reset has to go after the wait, which we apparently don't
do (why didn't I do that?). :(

> -  logmsg("server hit by SIG$sig");
> -  while((my $pid = waitpid(-1, &WNOHANG)) > 0) {  # reap the child
> -    delete $children{$pid};                       # remove the child out of the pool
> -    logmsg("  cleaned up child pid $pid");
> -  }
> +  my $pid = wait;                  # reap the child
> +  delete $children{$pid};          # remove the child out of the pool
> +  logmsg("server hit by SIG$sig, pid $pid");

Errr...  I liked the waitpid() version because it can do WNOHANG, and because
the first handler call could catch more children if they exited around the
same time.  wait() blocks until a child exits, which makes me think of bad
things happening, depending on how the handler is called.

I'd say the first thing to try is just moving the handler reset to the bottom
of the function, and see if that helps.  If not, progress forward. :)
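
For reference, here's a rough, standalone sketch of what I mean: keep the
non-blocking waitpid() loop and move the handler reset to the bottom.  This
is not the actual spamd code.  %children, @log and logmsg() here are
stand-ins, and the fork/sigprocmask scaffolding is only there so the snippet
runs on its own.

```perl
use strict;
use warnings;
use POSIX qw(:sys_wait_h :signal_h);   # WNOHANG, SIGCHLD, SIG_BLOCK, sigprocmask

my %children;   # stand-in for the pre-fork pool: pid => 1
my @log;        # stand-in log sink

sub logmsg { push @log, join('', @_) }   # hypothetical replacement for spamd's logmsg

sub child_handler {
    my ($sig) = @_;
    logmsg("server hit by SIG$sig");
    # Reap every child that has already exited.  WNOHANG keeps the handler
    # from blocking when nothing is left to reap.
    while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
        delete $children{$pid};          # remove the child from the pool
        logmsg("  cleaned up child pid $pid");
    }
    $SIG{CHLD} = \&child_handler;        # reset last, after the reaping
}

$SIG{CHLD} = \&child_handler;

# Demo scaffolding (assumption, not from spamd): hold SIGCHLD back while
# forking so each pid is recorded before the handler can run.
my $sigset = POSIX::SigSet->new(SIGCHLD);
sigprocmask(SIG_BLOCK, $sigset) or die "cannot block SIGCHLD: $!";
for (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    POSIX::_exit(0) if $pid == 0;        # child: exit right away
    $children{$pid} = 1;
}
sigprocmask(SIG_UNBLOCK, $sigset) or die "cannot unblock SIGCHLD: $!";

# Give the handler a bounded amount of time to empty the pool.
my $tries = 0;
while (%children && $tries++ < 100) {
    select(undef, undef, undef, 0.1);
}
```

Since several children can exit around the same time and the SIGCHLDs can get
merged into one delivery, the loop is what makes sure a single handler call
cleans up all of them.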

