httpd-dev mailing list archives

From "Bill Stoddard" <b...@wstoddard.com>
Subject Re: plz vote on tagging current CVS as APACHE_2_0_19
Date Fri, 22 Jun 2001 03:48:41 GMT

>
> > I don't completely understand your comment, but you are quite right that anytime the
> > scoreboard needs to be walked, locking will be required. The scoreboard will -not- need to
> > be walked except in these relatively rare cases:
> >
> > 1. During a graceful restart (manipulates elements in the scoreboard)
> > 2. During MaxRequestsPerChild triggered child shutdown (manipulates elements in the
> > scoreboard)
> > 3. Processing a mod_status request (walks the scoreboard)
> > 4. Processing an SNMP request (walks the scoreboard)
> >
> > Requests that do not require a scoreboard walk or adding/deleting entries from the
> > scoreboard (which is 99.999% or more of all requests; all requests except the ones above)
> > will not require locking at all.  Let me restate this... mainline requests will NOT be
> > required to make any lock calls at all. Ever.  Mainline requests can be served and their
> > status tracked even when the scoreboard is locked, so this design does not suffer -any-
> > performance penalties in the mainline case caused by locking.
>
> You are looking at this from a Web server point of view.  Don't.  Look at
> this from an ops point of view.  How often is a large site going to query
> the current connections or child status (for example) using SNMP?  Both of
> those require a walk of the scoreboard.  If you ask for them once a
> minute, you are going to seriously hinder the performance of your server.

No way! Where are the cycles going? Waiting for locks? Probably not.  CPU overhead of calling
a lock? If there is no contention (and there most likely will not be contention), the overhead
of calling a lock is minimal.  Walking the linked list? Walking a linked list of even 10,000
entries (which would be a -very- large site) would not be a big deal if you do it even once a
minute. I have benchmarked stuff like this in the past, and it is just not that big a deal as
infrequently as you will need to walk the scoreboard.  The only real way to resolve this
unambiguously is to do benchmarks on implementations of the two designs. Yeah, your design
will be faster, but not significantly so, I'll wager.

> > The only reason we have locking at all is to prevent the 4 cases listed above from
> > colliding with each other.  Even in the 4 cases above, the lock contention will be minimal
> > and the performance degradation minimal and perhaps not even measurable.
> >
> > A few other benefits to Paul's design:
> > 1. Eliminates the requirement for compiled-in HARD_SERVER_LIMIT or HARD_THREAD_LIMIT.
>
> You still need to request a certain amount of shared memory when you
> start the parent process, and you will require HARD_SERVER_LIMIT and
> HARD_THREAD_LIMIT to know how much to request.  Either that, or whenever
> you add a new child, you will need to allocate more shared memory, and
> somehow tell your child processes about it.
>
> > 2. You don't need to allocate child score if you don't care about mod_status (but it can
> > be added during a restart)
>
> You still need to request a certain amount of shared memory when you
> start the parent process, and you will require HARD_SERVER_LIMIT and
> HARD_THREAD_LIMIT to know how much to request.  Either that, or whenever
> you restart, you will forget all information about your child processes.
>

Yadda yadda.  Like I said in my first post, communicating the design via email is not easy.
You can eliminate HARD_SERVER_LIMIT and HARD_THREAD_LIMIT with Paul's design. Sorry, I don't
care to explain it via e-mail :-)  Maybe Paul or Jeff will step up :-)

> > 3. If you choose to not enable mod_status, you will likely see a nice performance boost on
> > multi-CPU machines, because we are not invalidating a CPU's cache each time we touch a
> > worker score entry (which is on every request).
>
> This can be implemented just as well with the table implementation.  The
> only thing you have to do is pad the scoreboard entry size to make it
> equal to one cache line.
>
And waste more storage? I am not a CPU designer... how big are cache lines on the different
CPUs? Wouldn't padding the storage just cause you to load your cache with memory that is NOT
USED at all? I don't see the goodness in that.  This is going way far out, but with Paul's
design you could bind a set of threads to a particular processor, then allocate your
scoreboard storage for those threads in contiguous storage. That way, all your threads on one
processor are accessing storage that has a high degree of locality. Good for the cache.

> > 4. Does not require any changes to the MPM.  Each MPM can start threads according to its
> > ThreadsPerChild setting w/o needing to pay attention to the scoreboard (I believe your
> > design required child processes to wait for worker score slots to become available before
> > it can start threads. This is imposing too much unnecessary work on the MPM.).
>
> You are ignoring one problem with the design.  I have now posted about it
> three times, and nobody has told me how it will be fixed.  You need to
> honor MaxClients.  If I set MaxClients to 250, then I expect MaxClients to
> be 250, regardless of whether those clients are long-lived or not.  If
> every child can always start threads according to ThreadsPerChild, you
> will be violating MaxClients on a heavily loaded server.  This means that
> you are actually requiring MORE changes to the starting logic than the
> table implementation, because each child will need to determine how many
> threads to start at any given time.
>
We can impose whatever discipline we like on when new child processes are allowed to start.
Paul's design doesn't directly specify, or more importantly hinder, implementing such a
discipline.

Bill



