From: "Bill Stoddard"
To: new-httpd@apache.org
Subject: Re: plz vote on tagging current CVS as APACHE_2_0_19
Date: Thu, 21 Jun 2001 23:48:41 -0400

> > I don't completely understand your comment, but you are quite right
> > that anytime the scoreboard needs to be walked, locking will be
> > required. The scoreboard will -not- need to be walked except in these
> > relatively rare cases:
> >
> > 1. During a graceful restart (manipulates elements in the scoreboard)
> > 2. During MaxRequestsPerChild triggered child shutdown (manipulates
> >    elements in the scoreboard)
> > 3. Processing a mod_status request (walks the scoreboard)
> > 4. Processing an SNMP request (walks the scoreboard)
> >
> > Requests that do not require a scoreboard walk or adding/deleting
> > entries from the scoreboard (which is 99.999% or more of all requests;
> > all requests except the ones above) will not require locking at all.
> > Let me restate this... mainline requests will NOT be required to make
> > any lock calls at all. Ever. Mainline requests can be served and their
> > status tracked even when the scoreboard is locked, so this design does
> > not suffer -any- performance penalty in the mainline case caused by
> > locking.
>
> You are looking at this from a Web server point of view. Don't. Look at
> this from an ops point of view. How often is a large site going to query
> the current connections or child status (for example) using SNMP? Both
> of those require a walk of the scoreboard. If you ask for them once a
> minute, you are going to seriously hinder the performance of your
> server.

No way! Where are the cycles going? Waiting for locks? Probably not. CPU
overhead of calling a lock? If there is no contention (and there most
likely will not be contention), the overhead of calling a lock is minimal.
Running the linked list? Walking a list of even 10,000 entries (which
would be a -very- large site) is not a big deal if you do it even once a
minute. I have benchmarked stuff like this in the past, and it is just not
that big a deal at the frequency you will need to walk the scoreboard. The
only real way to resolve this unambiguously is to benchmark
implementations of the two designs. Yeah, your design will be faster, but
not significantly so, I'll wager.
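To make the point concrete, here is roughly the shape of the design I am
describing. This is only a sketch, not code from the tree: the names are
made up, and it ignores the detail that the real scoreboard lives in
shared memory (the mutex there would have to be a cross-process lock).

    #include <pthread.h>
    #include <string.h>

    /* One entry per worker thread; 'next' only changes while one of
     * the four rare operations holds the scoreboard mutex. */
    typedef struct worker_score {
        struct worker_score *next;
        volatile int         status;       /* ready / busy reading / ... */
        volatile long        access_count;
        char                 client[32];
    } worker_score;

    static worker_score    *scoreboard_head;
    static pthread_mutex_t  scoreboard_mutex = PTHREAD_MUTEX_INITIALIZER;

    /* Mainline path: a worker touches only its own entry, so it takes
     * no lock at all, even while a walker holds the mutex. */
    void update_my_score(worker_score *me, int status, const char *client)
    {
        me->status = status;
        me->access_count++;
        strncpy(me->client, client, sizeof(me->client) - 1);
        me->client[sizeof(me->client) - 1] = '\0';
    }

    /* Rare path (mod_status, SNMP, graceful restart): lock, then walk.
     * Even 10,000 entries is one pointer chase and compare apiece. */
    long count_busy_workers(void)
    {
        worker_score *ws;
        long busy = 0;

        pthread_mutex_lock(&scoreboard_mutex);
        for (ws = scoreboard_head; ws != NULL; ws = ws->next) {
            if (ws->status != 0)    /* 0 == ready, by convention here */
                busy++;
        }
        pthread_mutex_unlock(&scoreboard_mutex);
        return busy;
    }

A walk like that once a minute is a vanishingly small amount of work next
to serving requests, which is why I say the lock only matters to the four
rare cases above.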
> > The only reason we have locking at all is to prevent the 4 cases
> > listed above from colliding with each other. Even in the 4 cases
> > above, the lock contention will be minimal and the performance
> > degradation minimal, and perhaps not even measurable.
> >
> > A few other benefits to Paul's design:
> >
> > 1. Eliminates the requirement for a compiled-in HARD_SERVER_LIMIT or
> >    HARD_THREAD_LIMIT.
>
> You still need to request a certain amount of shared memory when you
> start the parent process, and you will require HARD_SERVER_LIMIT and
> HARD_THREAD_LIMIT to know how much to request. Either that, or whenever
> you add a new child, you will need to allocate more shared memory and
> somehow tell your child processes about it.

> > 2. You don't need to allocate child score if you don't care about
> >    mod_status (but it can be added during a restart).
>
> You still need to request a certain amount of shared memory when you
> start the parent process, and you will require HARD_SERVER_LIMIT and
> HARD_THREAD_LIMIT to know how much to request. Either that, or whenever
> you restart, you will forget all information about your child processes.
> Yadda yadda.

Like I said in my first post, communicating the design via e-mail is not
easy. You can eliminate HARD_SERVER_LIMIT and HARD_THREAD_LIMIT with
Paul's design. Sorry, I don't care to explain it via e-mail :-) Maybe Paul
or Jeff will step up :-)

> > 3. If you choose to not enable mod_status, you will likely see a nice
> >    performance boost on multi-CPU machines, because we are not
> >    invalidating a CPU's cache each time we touch a worker score entry
> >    (which is on every request).
>
> This can be implemented just as well with the table implementation. The
> only thing you have to do is pad the scoreboard entry size to make it
> equal to one cache line.

And waste more storage? I am not a CPU designer... how big are cache
lines on the different CPUs? Wouldn't padding the storage just cause you
to load your cache with memory that is NOT USED at all? I don't see the
goodness in that. (A sketch of the padding idea is in the P.S. below.)

This is going way far out, but with Paul's design you could bind a set of
threads to a particular processor, then allocate the scoreboard storage
for those threads contiguously. That way, all the threads on one
processor are accessing storage with a high degree of locality. Good for
the cache.

> > 4. Does not require any changes to the MPM. Each MPM can start threads
> >    according to its ThreadsPerChild setting w/o needing to pay
> >    attention to the scoreboard. (I believe your design required child
> >    processes to wait for worker score slots to become available before
> >    they can start threads. That imposes too much unnecessary work on
> >    the MPM.)
>
> You are ignoring one problem with the design. I have now posted about it
> three times, and nobody has told me how it will be fixed. You need to
> honor MaxClients. If I set MaxClients to 250, then I expect MaxClients
> to be 250, regardless of whether those clients are long-lived or not. If
> every child can always start threads according to ThreadsPerChild, you
> will be violating MaxClients on a heavily loaded server. This means that
> you are actually requiring MORE changes to the starting logic than the
> table implementation, because each child will need to determine how many
> threads to start at any given time.

We can impose whatever discipline we like on when new child processes are
allowed to start. Paul's design doesn't directly specify, or more
importantly hinder, implementing such a discipline.

Bill
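P.S. Since the cache line question came up: here is a sketch of the
padding Ryan describes, so we are at least arguing about the same thing.
The 64-byte line size and the entry fields are assumptions on my part;
line sizes on the CPUs we care about run from roughly 32 to 128 bytes,
which is exactly my portability question.

    #include <stddef.h>

    #define CACHE_LINE 64   /* an assumption; varies by CPU */

    typedef struct {
        int  status;
        long access_count;
        long bytes_served;
    } worker_data;

    /* Round each entry up to a whole number of cache lines so two
     * workers never write to the same line (no false sharing). The
     * pad bytes are the wasted storage I am complaining about. */
    typedef union {
        worker_data d;
        char        pad[((sizeof(worker_data) + CACHE_LINE - 1)
                         / CACHE_LINE) * CACHE_LINE];
    } padded_worker_score;

With a small entry, most of every padded line is unused bytes, and the
table still has to be sized to HARD_SERVER_LIMIT x HARD_THREAD_LIMIT
entries up front; that storage cost is the trade-off against the cache
invalidation.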