cocoon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Berin Loritsch <blorit...@apache.org>
Subject Pool Rework (was: Re: [Avalon][C2] ComponentPool problems under Apache JMeter)
Date Thu, 01 Mar 2001 14:44:44 GMT
I am going to write out my plan for upgrading the Avalon pool package.
Concidering the fact that Avalon is a Server Framework, and most servers
live in a multithreaded environment, we need some design rework on the
pools.

The AbstractPool was not meant to be in a Multithreaded environment, so
it is suitable for a single threaded application.  However its name should
be changed to reflect that fact and not be confused with the base class
that all pools should inherit from.  The design of the abstract pool is
as optimized for a single threaded environment as you can get, but will
_*never*_ be reliable in any shape or form in a multithreaded environment.
Why?  Because it is subject to a type of race condition that simply
synchronizing the get() and set() method won't solve:

It is built on an array of objects, and has an index pointing to the
current pool member:

[0][1][2][3]
       ^

When a get is called, the pointer is decremented

[0][1][2][3]
    ^

Obviously, when there are no more objects, the pool has to grow.

When a put is called, the pointer is incremented

[0][1][2][3]
       ^
For the problem domain and the efficiency that was trying to be attained,
this is a truly ingenious design.  My kudos to Peter for coming up with it.
However, here is the type of race condition that it *cannot* prevent.

Two threads request a Poolable at the same time, but return them out of
order, also a third thread requests a Poolable after on of the objects
are returned:

Thread 1 get(object indexed 3):
[0][1][2][3]
       ^ <-

Thread 2 get(object indexed 2):
[0][1][2][3]
    ^ <-

Thread 1 put(object indexed 3):
[0][1][2][3]
    -> ^

Thread 3 get(object indexed 2):
[0][1][2][3]
    ^ <-

Thread 2 put(object indexed 2):
[0][1][2][3]
    -> ^

Thread 3 put(object indexed 2):
[0][1][2][3]
       -> ^

So what has happened?  Threads two and three are sharing an object!  Since
the reason for a Pool is to pool high overhead objects that cannot be used
by more than one thread simultaneously, we cannot use any Pool based on this
algorithm in a multithreaded environment--as ingenious as it is.

The ThreadsafePool implementation extends this algorithm and makes the get()
and set() methods synchronized.  This is not _truly_ a ThreadSafe object
because it cannot protect itself against the afformentioned Race Condition.

So how do we protect ourselves in a multithreaded environment?  We have
two methods: Use a Stack, or use two Lists.  Why? Because we need to have
a distinction between *active* objects and *inactive* objects.  Under no
circumstances should a pool return an object that is already being used.
The Stack implementation is easy to implement, but does not keep track of
the active objects.  The two list approach maintains a reference to the
objects in question so that resource limiting is much easier to implement.
NOTE: you can mix a Stack and a List.

It comes down to what is the most efficient object to use, but the gist
of the problem domain that we need to solve is that we cannot have a
Race Condition which has two threads sharing a reference to the same
object.

The initial state is this:

Inactive List          Active List
-------------          -----------
     P1
     P2
     P3
     P4

When multiple threads request an object, it _must_ be taken out of the
Inactive roster and placed in the Active roster (using the scenario above):

Thread 1 and 2 request an object

Inactive List          Active List
-------------          -----------
     P3                     P1 (Thread 1)
     P4                     P2 (Thread 2)

Thread 1 returns an object

Inactive List          Active List
-------------          -----------
     P1                     P2 (Thread 2)
     P3
     P4

Thread 3 requests an object

Inactive List          Active List
-------------          -----------
     P3                     P2 (Thread 2)
     P4                     P1 (Thread 3)

Threads 2 and 3 return their objects

Inactive List          Active List
-------------          -----------
     P1
     P2
     P3
     P4

The important thing to catch here is the sequence of events:

get()
-----

Inactive    TemporalRef      Active
   |           |               |
   | Remove(0) |               |
   |<----------+               |
   |           |    Add(ref)   |
   |           +-------------->|
   |           |               |

put()
-----

Inactive    TemporalRef      Active
   |           |               |
   |           |  Remove(ref)  |
   |           +-------------->|
   |  Add(ref) |               |
   |<----------+               |
   |           |               |

The TemporalRef is the actual object that is being transfered
to and from the requesting thread.  On a get(), we remove the
first item from the list (or stack.pop()), put it in our active
list and return that reference.

Now for the all important reason for using a List in the Active
references.  On a put(), we remove the reference of that Object
from the list (Very important for it to be that particular object).
and add the object to the Inactive list (or stack.push()).

The pure stack approach will also work, but it is blind as to
whether the Pool actually originated the object or not.

Mime
View raw message