httpd-dev mailing list archives

From Darryl Miles <darryl-mailingli...@netbauds.net>
Subject Re: one word syncronize
Date Wed, 20 Jun 2007 23:31:05 GMT
sebb wrote:
> On 14/06/07, Dmytro Fedonin <Dmytro.Fedonin@sun.com> wrote:
>> Looking through 'server/mpm/worker/worker.c' I have found such a
>> combination of TODO/FIXME comments:
>> 1)
>> /* TODO: requests_this_child should be synchronized - aaron */
>> if (requests_this_child <= 0) {
>> 2)
>> requests_this_child--; /* FIXME: should be synchronized - aaron */
>>
>> And I can not see any point here. These are one word CPU operations,
>> thus there is no way to preempt inside this kind of operation. So, one
>> CPU is safe by nature of basic operation. If we have several CPUs they
>> will synchronize caches any way, thus we will never get inconsistent
>> state here. We can only lose time trying to synchronize it in code. Am I
>> not right?

The decrement operation is a read-modify-write cycle. It is possible for
2 CPUs to overlap their operations: if both read the same initial value
before either writes its result back, one decrement is observably lost.
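
To make the overlap concrete, here is a minimal single-threaded sketch that replays the bad interleaving by hand. The function name and the "register" variables are hypothetical, chosen only for illustration; nothing here is taken from worker.c.

```c
/* Hypothetical helper: replays the interleaving in which both CPUs
 * load the shared counter before either one stores its result. */
static int lost_decrement_demo(int initial)
{
    int counter = initial;   /* the shared word in memory */

    int cpu1_reg = counter;  /* CPU1: read   */
    int cpu2_reg = counter;  /* CPU2: read (sees the same value) */

    cpu1_reg--;              /* CPU1: modify */
    cpu2_reg--;              /* CPU2: modify */

    counter = cpu1_reg;      /* CPU1: write  */
    counter = cpu2_reg;      /* CPU2: write, overwriting CPU1's result */

    return counter;          /* two decrements ran, but only one shows */
}
```

Starting from 10, two decrements ran but the function returns 9 rather than 8.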

On IA32/x86 the "DEC" assembly instruction can be given the "LOCK"
prefix. This makes the CPU assert memory bus locking for the duration
of the instruction, so there is no way for CPU2 to perform a read access
until CPU1 releases control of the memory bus when it completes the
instruction. This is effectively what atomic_dec() enforces.
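
As a rough sketch of the same idea in portable code, the following uses C11 <stdatomic.h> as a stand-in for atomic_dec(); it is not the APR API and not what worker.c does. The helper names set_request_budget() and consume_request() are hypothetical. On x86 a compiler will typically lower atomic_fetch_sub() to a LOCK-prefixed instruction.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter, named after the variable discussed above. */
static atomic_int requests_this_child;

/* Hypothetical helper: set how many requests this child may serve. */
static void set_request_budget(int n)
{
    atomic_store(&requests_this_child, n);
}

/* Atomically decrement the counter as one indivisible
 * read-modify-write. Returns true while budget remains afterwards. */
static bool consume_request(void)
{
    /* atomic_fetch_sub returns the value held BEFORE the subtraction */
    int before = atomic_fetch_sub(&requests_this_child, 1);
    return (before - 1) > 0;
}
```

Because the read, the subtraction and the write happen as one atomic step, two CPUs calling consume_request() at the same time cannot both observe the same "before" value.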

The performance lost by using atomic_xxx() really is minimal; with any
luck only that cache line remains locked, not the entire memory bus.


> The decrement operation may be handled as load, decrement, store on
> some architectures, so can be pre-empted by a different CPU.

There is no other way to handle it :)  Memory itself can't perform 
arithmetic operations, so the decrement always happens inside the ALU 
inside the CPU.

It is true that non-SMP aware CPUs might maintain memory bus acquisition
during the 'decrement' (aka modify) phase of the operation, since there
is no reason to give it up as they are the only user of memory.

This becomes a performance bottleneck for any SMP capable CPU which has
a cache that can operate at full CPU clock speeds. The 'decrement'
(aka modify) phase is going to require at least 1 clock cycle to
perform, so why not let another CPU make use of the memory bus in the
meantime?


> Also some hardware architectures (e.g. HP Alpha) have an unusual
> memory model. One CPU may see memory updates in a different order from
> another CPU. Software that relies on the updates being seen across all
> CPUs must use the appropriate memory synchronisation instructions.
> 
> I don't know if these considerations apply to this code.

Memory update ordering applies when considering how 2 or more distinct
machine words are updated with respect to each other when those updates
are observed from another CPU.

The example here concerns a single machine word being updated on SMP
systems.
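
For contrast, a minimal sketch of the two-word case where ordering does matter, again using C11 <stdatomic.h> as an assumption rather than anything from httpd. The names payload, ready, publish() and try_consume() are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* word 1: plain data */
static atomic_bool ready;  /* word 2: flag telling readers data is valid */

/* Writer: store the data first, then set the flag. The release store
 * keeps the payload write from being observed after the flag write. */
static void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the acquire load pairs with the release store above, so a
 * reader that sees ready == true also sees the published payload. */
static bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

With a single counter such as requests_this_child there is no second word to order against, which is why only atomicity of the decrement is at issue.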


Darryl

