commons-dev mailing list archives

From ku...@gmx.de
Subject Re: [POOL] Offer of help for a 1.4 release
Date Fri, 04 Jan 2008 11:09:53 GMT
Mark, Thomas, thanks for your replies,

Phil Steitz wrote:
> On Jan 3, 2008 12:40 PM, Mark Thomas <markt@apache.org> wrote:
>> Christoph Kutzinski wrote:
>>> - creating a new object means the pool is exhausted, which in turn
>>>   usually means that we have a high-load situation.
>>> - creation of new objects is expensive (probably even more so in
>>>   high-load situations). This is why we originally used the pool.
>>> - so in conclusion it is probably a bad idea to create multiple
>>>   objects in parallel.
>> I don't see how serializing object creation can help performance. If you
>> have a test case and some numbers that show otherwise, I would be very
>> interested in taking a look.

I have no test case for this, only my reasoning and my observation on
our live system that connection creation (let's call the 'objects'
database connections, as that is probably 95% of the use cases of
commons-pool :-) ) takes much longer under high load:
While connection creation in 'idle' mode takes between 100 and 200 ms,
it takes several seconds (the longest I've seen was 27 seconds!) under
peak load.


>>
>> If you are really worried about the cost of object creation then you can
>> configure the pool to create all the objects at start-up and block until a
>> free object is available.


That is unfortunately not possible with our current configuration, as
we have set up our application servers so that, once all their pools
have reached maximum size, they use every connection our database
server can handle.
For example: we have 40 application servers with a pool max-size of 40.
Our database server can handle only 1600 connections (because of its
memory configuration).
If we configured the pools to fetch all connections at startup, we
could no longer update our application software without major hassle.
We use a 2-stage approach for updates: we start up the 2nd stage with
the new software, then configure the load balancer to use the 2nd
stage, and only afterwards stop the 1st stage.
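
As a rough illustration of the arithmetic above: the sketch below uses the figures from this mail (40 servers, max-size 40, limit 1600). The ConnectionBudget class and the choice of a single 2nd-stage server are assumptions made only for the example.

```java
/** Hypothetical helper for the connection budget described above. */
final class ConnectionBudget {
    /** Connections held when every pool on every server is filled to its max-size. */
    static int fullDemand(int servers, int poolMaxSize) {
        return servers * poolMaxSize;
    }

    /** True if the database server can serve the given number of connections. */
    static boolean fits(int demand, int dbLimit) {
        return demand <= dbLimit;
    }

    public static void main(String[] args) {
        int dbLimit = 1600;               // figure from the mail
        int stage1 = fullDemand(40, 40);  // 40 servers * 40 connections = 1600
        int stage2 = fullDemand(1, 40);   // assumption: one 2nd-stage server, prefilled

        System.out.println(fits(stage1, dbLimit));          // true
        System.out.println(fits(stage1 + stage2, dbLimit)); // false
    }
}
```

With prefilled pools the 1st stage alone already sits at the limit, so even one prefilled 2nd-stage server would not fit during the switch-over.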




> Thanks for the feedback, Christoph; but I agree with Mark.  I suspect
> most pool users keep the default whenExhaustedAction, which is to
> block.  That means that objects get created a) on demand, when there
> are no idle instances, but maxActive has not been reached b) when
> addObject is invoked to prefill or augment the idle object pool
> explicitly or c) when minIdle is set and the idle object evictor runs.
>  Even when a) happens during a load spike, it is better to do the
> makes in parallel, especially if there is some latency involved and
> there are resources available to process the makes in parallel (which
> will be the case in, e.g. a database connection pool).

Phil, I cannot follow your reasoning here. What makes you think that
there are "resources available to process the makes in parallel"? Which
resources do you have in mind? I am thinking of processor cycles on the
database server, and these are usually not available during peak load
times.

I still think that serial connection creation is a good thing, as it
helps to keep unnecessary load away from the database server:
As connections borrowed from the pool are held only for a comparatively
short time (at least in our case), the probability that a connection
will be returned to the pool by a different thread in the near future
is quite high.
So, by serializing connection creation and rechecking whether a
connection has become available before starting to create a new one,
you avoid burdening the db-server with unnecessary load.
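
The "serialize and recheck" idea could look roughly like the sketch below. It is not the commons-pool API or implementation: SerializedCreationPool, borrow, giveBack and createdCount are hypothetical names, and maxActive handling, blocking and validation are left out on purpose.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical minimal pool that creates at most one object at a time. */
final class SerializedCreationPool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final ReentrantLock createLock = new ReentrantLock();
    private final AtomicInteger created = new AtomicInteger();
    private final Supplier<T> factory;

    SerializedCreationPool(Supplier<T> factory) {
        this.factory = factory;
    }

    T borrow() {
        T obj = pollIdle();
        if (obj != null) {
            return obj;                 // fast path: an idle object exists
        }
        createLock.lock();              // only one thread may create at a time
        try {
            obj = pollIdle();           // recheck: another thread may have
            if (obj != null) {          // returned an object while we waited
                return obj;
            }
            created.incrementAndGet();
            return factory.get();       // the expensive call, e.g. opening a connection
        } finally {
            createLock.unlock();
        }
    }

    void giveBack(T obj) {
        synchronized (idle) {
            idle.addFirst(obj);
        }
    }

    int createdCount() {
        return created.get();
    }

    private T pollIdle() {
        synchronized (idle) {
            return idle.pollFirst();
        }
    }
}
```

A thread that waits on createLock while another thread returns an object picks up that object instead of triggering a second creation. Returns are not blocked by a creation in progress, because giveBack only takes the short lock on the idle queue.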

Another thing to consider: if the db-server is under high load, creating
connections in parallel probably won't save any time. In idle mode it
may well be true that:
When I get 1 connection in 100 ms, I also get n (say 4) connections in
~100 ms
Under high load it is very different, as all processors on the
db-server are busy with other jobs. So it will probably look much more
like this:
1 connection in 2 seconds
2 connections in 4 seconds
...

So in this case serial creation might even be better from the
application's point of view, as you already have the 1st connection
after 2 seconds (the 2nd after 4, and so on), while with parallel
creation you would have to wait 4 seconds before getting either of the
two connections.
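
The timing argument above can be written down as a small model. This is only a sketch under a strong assumption: the db-server is fully saturated, so n parallel creations share its capacity equally and all finish after n times the single-creation time. CreationTimingModel and its methods are hypothetical names; the 2-second figure is the one used in this mail.

```java
import java.util.Arrays;

/** Hypothetical timing model for connection creation on a saturated db-server. */
final class CreationTimingModel {
    /** Serial creation: the k-th connection is ready after k * secondsPerConnection. */
    static double[] serialReadyTimes(int n, double secondsPerConnection) {
        double[] ready = new double[n];
        for (int k = 0; k < n; k++) {
            ready[k] = (k + 1) * secondsPerConnection;
        }
        return ready;
    }

    /** Parallel creation, assuming equal sharing: all n are ready after n * secondsPerConnection. */
    static double[] parallelReadyTimes(int n, double secondsPerConnection) {
        double[] ready = new double[n];
        Arrays.fill(ready, n * secondsPerConnection);
        return ready;
    }

    /** Average time at which a waiting thread receives its connection. */
    static double mean(double[] values) {
        return Arrays.stream(values).average().orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(serialReadyTimes(2, 2.0)));   // [2.0, 4.0]
        System.out.println(Arrays.toString(parallelReadyTimes(2, 2.0))); // [4.0, 4.0]
    }
}
```

In this model the last connection arrives at the same time either way, but the average wait is shorter with serial creation (3 seconds instead of 4 for two connections). If the server has spare capacity, the equal-sharing assumption no longer holds and parallel creation can win.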


After all, these are only my own considerations, without any hard
evidence to support them besides some observations I made on our live
system during peak loads. So you can reject them if you don't think
they would hold for the majority of pool clients out there. But I still
think that for our use case this would be the best way to proceed.


Christoph
