commons-dev mailing list archives

From "Phil Steitz" <phil.ste...@gmail.com>
Subject [pool] Serializing makeObject WAS Re: [POOL] Offer of help for a 1.4 release
Date Sat, 05 Jan 2008 23:01:23 GMT
<snip/>
> >
> > The 1.2 / 1.4-RC1 code does "recheck" before initiating additional
> > makes - i.e., it will not initiate a makeObject if an idle object has
> > been returned to the pool or if maxActive has been reached.  I think I
> > understand your point though, but again it doesn't seem natural to use
> > client thread synchronization in the connection pool as a load
> > dampening mechanism for the database.
>
>
> Where is this recheck? I can't see it - all I see is an initial check,

I guess in your terms, I was referring to the initial check.  The
point I was making is that if one or more makes are in progress and an
object is returned (with 1.2 or 1.4-RC1 code), that object will be
available in the pool for subsequent borrowObjects, which will grab it
instead of kicking off more makes.  Also (modulo the off-by-one
problem from addObject contention), the total number of makes kicked
off is bounded by maxActive minus the number of objects already in
circulation.  This does not answer your point below, however.
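To make the check above concrete, here is a rough sketch of the borrow-side logic being described -- a make is only kicked off when no idle object is available and the objects in circulation plus makes already in flight stay below maxActive.  All names (SimplePool, mayStartMake, etc.) are illustrative only, not the actual GenericObjectPool code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final int maxActive;
    private int numActive = 0;      // objects currently checked out
    private int makesInFlight = 0;  // makeObject calls not yet finished

    SimplePool(int maxActive) { this.maxActive = maxActive; }

    /** Returns an idle object if one is available, else null. */
    synchronized T borrowOrNull() {
        T obj = idle.pollFirst();
        if (obj != null) {
            numActive++;
        }
        return obj;
    }

    /** The "initial check": may this borrower start another makeObject? */
    synchronized boolean mayStartMake() {
        if (!idle.isEmpty()) {
            return false; // an idle object came back; grab it instead
        }
        if (numActive + makesInFlight >= maxActive) {
            return false; // bounded by maxActive minus objects in circulation
        }
        makesInFlight++;
        return true;
    }

    /** A makeObject finished; hand the new object straight to its borrower. */
    synchronized void finishMake(T obj) {
        makesInFlight--;
        numActive++;
    }

    synchronized void returnObject(T obj) {
        numActive--;
        idle.addFirst(obj);
    }
}
```

A returned object shows up in `idle` and is grabbed by the next borrower before any further make is initiated, which is the "recheck" behavior in question.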

> but no re-check. I doubt that there can be one in 1.4 RC1 as (as far as
> I see), makeObjects are fully parallel.
> I see your point, however, that I'm trying to use the conn-pool as a load
> throttling mechanism for the db, which I cannot deny. I agree that it is
> questionable whether this is a job a connection pool should do.
>
>
> >
> >> Another thing to consider: If the db-server is under high load, creating
> >> connections in parallel probably won't give you any time benefits. While
> >> in idle mode it may be true that:
> >> When I get 1 connection in 100 ms, I also get n (say 4) connections in
> >> ~100ms
> >> under high load situations it is much different as all processors on the
> >> db-server are busy with other jobs. So it will probably look much more
> >> like this:
> >> 1 connection in 2 seconds
> >> 2 connections in 4 seconds
> >> ...
> >
> > I would be interested to see real data on this and also impacts of
> > connection request load spikes on various engines (i.e., how well they
> > handle bursts of "simultaneous" connection requests).
>
> I'll see if I can generate a test scenario with our JMeter load tests to
> get at least some data for our environment.
>
> > The engines are
> > going to end up queueing them anyway and it may be better to leave
> > that concern to them.
>
> That's a very good point (though I think it is just speculation that
> the engines queue them anyway). However I would still be
> concerned in that case that connections, which are not needed anymore,
> are created. Consider this:
>
> Threads A and B are requesting new connection creations. The requests
> are queued up (by the db engine). When the connection for Thread A is
> created, Thread C has already returned its connection to the pool, so B
> could use that one. But since the request for a new connection was
> already issued to the db engine, it will be created now anyway.
>
This is a good point and I have observed this in load tests with
1.2/1.4-RC1 vs 1.3 code.   The 1.2/1.4-RC1 code creates more
connections, more quickly than 1.3 (as expected) at startup and during
load increases.  Under some scenarios (quick ramp or startup burst),
this may result in more connections being created than are needed to
sustain the load (leaving numIdle >> 0 unless and until the evictor
runs).  I guess that's what you want to avoid in your environment.
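The overshoot can be reduced to simple arithmetic: once a make has been issued to the factory, a return arriving in the meantime cannot cancel it, so makes that finish with no borrower left waiting land in the idle pool.  A hedged sketch (all names illustrative, not pool code):

```java
class OvershootDemo {
    /**
     * How many surplus objects end up idle when 'makesIssued' creations
     * were already sent to the factory and 'returnsDuringMakes' objects
     * came back to the pool while they were in progress.
     */
    static int surplus(int waitingBorrowers, int makesIssued,
                       int returnsDuringMakes) {
        // Borrowers satisfied by recycled objects...
        int servedByReturns = Math.min(waitingBorrowers, returnsDuringMakes);
        // ...leave this many borrowers still waiting on the issued makes:
        int stillNeeded = waitingBorrowers - servedByReturns;
        // Makes that complete with nobody waiting become idle surplus.
        return Math.max(0, makesIssued - stillNeeded);
    }
}
```

In the scenario quoted above -- threads A and B each trigger a make, and thread C returns one connection mid-flight -- one connection is created that nobody needed, and it sits idle until the evictor runs.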
>
>
> Another thing we should consider, is the question what you are trying to
> achieve by using a pool. I can see 2 main points:
>
> I) providing 'objects' fast when they are requested
> II) avoiding (unnecessary) load on the engine providing the objects
> (e.g. the DB server)
>
> To some degree both points are important, but different use cases might
> have different priorities:
> a) if load on the db server is not the main problem (it has enough power
> to handle even the highest peak with ease), providing connections as
> fast as possible might be the priority. So the pool should create as
> many connections in parallel as are requested.
> b) if load on the db matters a lot and the speed at which connections are
> returned from the pool is not the main problem, you probably want to
> serialize connection creation
>
Yes.  Another consideration is where you want the backup to happen when
a load spike occurs.  Serializing backs up the client threads; leaving
things parallel pushes the load to the factory.
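The configurable behavior suggested below could look roughly like this -- the same factory runs its creations either fully in parallel (priority I/a) or one at a time (priority II/b) depending on a flag.  This is only an illustration of the idea, not the actual patch or pool API:

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

class ThrottledFactory<T> {
    private final Supplier<T> factory;
    private final boolean serializeCreation;
    private final ReentrantLock createLock = new ReentrantLock();

    ThrottledFactory(Supplier<T> factory, boolean serializeCreation) {
        this.factory = factory;
        this.serializeCreation = serializeCreation;
    }

    T makeObject() {
        if (!serializeCreation) {
            return factory.get(); // parallel: push the load to the server
        }
        createLock.lock();        // serialized: back up the client threads
        try {
            return factory.get();
        } finally {
            createLock.unlock();
        }
    }
}
```

With the flag off you get the 1.2/1.4-RC1 behavior (creations fully parallel); with it on, concurrent borrowers queue on the lock instead of hitting the db server simultaneously.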

> In our use case, we are focussed on II and b, since we are trying to
> keep our db server responsive for as long as possible under peak loads.
>
By using your front end (web/app) as a buffer ;)

> Different applications may be more focussed on I and a, so my proposed
> changes might be counterproductive for them.
>
Yes. It may be ultimately best to make this configurable.  If you are
OK with this approach, what I suggest is that you open a JIRA ticket,
attaching a version of your patch that supports configurability.  Open
the ticket against 1.3 for now but keep the patch as is (i.e. against
1.4 release branch).

While I understand your use case and can see both sides of the issue,
I think that the 1.2/1.4-RC1 behavior is the better, more natural
default for the pool, so would like to keep things as they are now in
1.4-RC1 for 1.4.

Any others with opinions on this, please weigh in.

Phil

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

