Date: Mon, 6 Jul 2009 22:20:51 -0700
Message-ID: <4239a4320907062220k21987b38uebbd3e7b1c7f0f37@mail.gmail.com>
Subject: Events, Destruction and Locking
From: Paul Querna
To: dev@httpd.apache.org

Can't sleep, so finally writing this email I've been meaning to write for about 7 months now :D

One of the
challenges in the Simple MPM, and to a smaller degree in the Event MPM, is how to manage memory allocation, destruction, and thread safety.

A 'simple' example:
 - 1) Thread A: Client connection created.
 - 2) Thread A: Timer event added for 10 seconds in the future to detect IO timeout.
 - 3) Thread B: Client socket closes after 9.99 seconds.
 - 4) Thread C: Timer event for IO timeout is triggered after 10 seconds.

The simple answer is placing a mutex around the connection object. Any operation in which two threads might work on the connection locks this mutex. This has many problems, the first of which is destruction. In this case, Thread B would start destructing the connection, since the socket was closed, but Thread C would already be waiting for this mutex... and then the object underneath it was just freed.

To solve this, Thread B would unregister all existing (and unfired) triggers/timeouts first. Events would increment a reference count on the connection object, and Thread B would schedule a future event to check this reference count. If the reference count is zero, this timer would free the connection object; if there was still an outstanding reference in a running event, it would schedule itself for a future cleanup attempt.

All of this is insanely error prone, difficult to debug, and painful to explain. Pools don't help, but don't really make it worse, and are good enough for the actual cleanup part -- the difficulty lies in knowing *when* you can clean up an object.

A related problem of using mutex guards on a connection object is that if a single connection 'locks up' a thread, it's feasible for other worker threads to get stuck waiting for this connection, and we would have no way to 'recover' these lost threads.

I think it is possible to write a complete server that deals with all these intricacies and gets everything just 'right', but as soon as you introduce 3rd party module writers, no matter how 'smart' we are, our castle of event goodness will crumble.
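
To make the moving parts of the reference-counting scheme above concrete, here is a rough sketch. It is Python rather than C for brevity, and it uses a simulated clock instead of real threads so the ordering is easy to follow. Everything in it (Scheduler, Connection, close_connection, the 0.5 second retry interval) is made up for illustration; none of it is httpd or APR API.

```python
import heapq
import itertools


class Connection:
    """Hypothetical connection; refs counts events currently running on it."""

    def __init__(self, name):
        self.name = name
        self.refs = 0
        self.freed = False


class Scheduler:
    """Toy timer queue driven by a simulated clock (assumption: no real threads)."""

    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._ids = itertools.count()
        self._cancelled = set()

    def add_timer(self, delay, callback):
        timer_id = next(self._ids)
        heapq.heappush(self._heap, (self.now + delay, timer_id, callback))
        return timer_id

    def cancel(self, timer_id):
        self._cancelled.add(timer_id)

    def run_until(self, deadline):
        # Fire every timer due at or before the deadline, in time order.
        while self._heap and self._heap[0][0] <= deadline:
            when, timer_id, callback = heapq.heappop(self._heap)
            self.now = when
            if timer_id in self._cancelled:
                self._cancelled.discard(timer_id)
                continue
            callback()
        self.now = deadline


def close_connection(sched, conn, pending_timers, retry=0.5):
    """What 'Thread B' would do once the client socket closes."""
    for timer_id in pending_timers:  # unregister unfired triggers/timeouts
        sched.cancel(timer_id)

    def try_free():
        if conn.refs == 0:
            conn.freed = True  # stand-in for destroying the connection's pool
        else:
            sched.add_timer(retry, try_free)  # still referenced: try again later

    sched.add_timer(retry, try_free)


def demo():
    sched = Scheduler()
    conn = Connection("client-1")
    fired = []
    timeout_id = sched.add_timer(10.0, lambda: fired.append("io-timeout"))
    conn.refs += 1  # an event already running elsewhere holds a reference
    sched.add_timer(9.99, lambda: close_connection(sched, conn, [timeout_id]))

    def release():
        conn.refs -= 1

    sched.add_timer(10.6, release)  # the running event finishes later
    sched.run_until(10.5)
    freed_early = conn.freed
    sched.run_until(11.0)
    return freed_early, conn.freed, fired
```

In demo(), the first cleanup attempt finds an outstanding reference and reschedules itself, the second one frees the connection, and the cancelled IO timeout never fires. Even in this toy form the bookkeeping is spread over three places, which is the part that gets painful with real threads.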

I am looking for an alternative that doesn't expose all this craziness of when to free, destruct, or lock things.

The best idea I can come up with is for each connection to become 'semi-sticky' to a single thread. Meaning each worker thread would have its own queue of upcoming events to process, and all events for connection X would sit on the same queue. This would prevent two threads waiting for destruction, and other cases of a single connection's mutex locking up all your workers, essentially providing basic fault isolation.

These queues could be mutable, and you could 'move' a connection between queues, but you would always take all of its events and triggers, and move them together to a different queue.

Does the 'connection event queue' idea make sense? I'm not sure I'm expressing the idea fully over email... but I'll be at OSCON in a few weeks if anyone wants beer :)

-Paul
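
P.S. A minimal model of the 'connection event queue' idea, in case it reads better as code. It is Python for brevity and only models the routing: each deque stands for the queue that exactly one worker thread would drain. The names (EventQueues, post, move, drain), the integer connection ids, and the modulo placement are assumptions for illustration, not a proposed API. A real version would also need a thread-safe way to post into another worker's queue.

```python
from collections import deque


class EventQueues:
    """Hypothetical model: one event queue per worker, one owning queue per connection."""

    def __init__(self, n_workers):
        self.queues = [deque() for _ in range(n_workers)]
        self.owner = {}  # connection id -> index of the queue that owns it

    def post(self, conn_id, event):
        # The first event pins the connection to a queue; later events follow it.
        idx = self.owner.setdefault(conn_id, conn_id % len(self.queues))
        self.queues[idx].append((conn_id, event))

    def move(self, conn_id, new_idx):
        # Take *all* pending events for the connection and move them together.
        old_idx = self.owner[conn_id]
        if old_idx == new_idx:
            return
        old = self.queues[old_idx]
        moved = [item for item in old if item[0] == conn_id]
        kept = [item for item in old if item[0] != conn_id]
        old.clear()
        old.extend(kept)
        self.queues[new_idx].extend(moved)
        self.owner[conn_id] = new_idx

    def drain(self, idx):
        """What worker idx would do: run its own events in order, no mutex."""
        handled = []
        q = self.queues[idx]
        while q:
            conn_id, event = q.popleft()
            if conn_id not in self.owner:
                continue  # connection already destroyed; drop the stale event
            handled.append((conn_id, event))
            if event == "close":
                # Safe to destroy here: no other queue can hold events for it.
                del self.owner[conn_id]
        return handled


def demo():
    eq = EventQueues(2)
    eq.post(7, "read")
    eq.post(4, "read")
    eq.post(7, "close")
    eq.post(7, "io-timeout")  # the timer that loses the race with close
    eq.move(7, 0)             # connection 7 and all its events change queues
    return eq.drain(0), eq.drain(1)
```

In demo(), worker 0 ends up handling the read for connection 4 and then the read and close for connection 7, the late io-timeout is dropped because the connection is already gone, and worker 1 has nothing left to do. The close and the timeout are ordered by the queue itself, so nobody needs a reference count to decide when freeing is safe.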