commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: RESULT: Failed [VOTE] Release DBCP 1.3/1.4 - take three
Date Sun, 03 Jan 2010 21:09:25 GMT
sebb wrote:
> On 03/01/2010, Phil Steitz <phil.steitz@gmail.com> wrote:
>> sebb wrote:
>>  > On 03/01/2010, sebb <sebbaz@gmail.com> wrote:
>>  >> On 02/01/2010, Phil Steitz <phil.steitz@gmail.com> wrote:
>>  >>  > sebb wrote:
>>  >>  >  > On 01/01/2010, Phil Steitz <phil.steitz@gmail.com> wrote:
>>  >>  >  >> Phil Steitz wrote:
>>  >>  >  >>  > sebb wrote:
>>  >>  >  >>  >> On 31/12/2009, Phil Steitz <phil.steitz@gmail.com> wrote:
>>  >>  >  >>  >>> Comments have not changed sebb's -1, so I am going to consider this
>>  >>  >  >>  >>>  a failed VOTE and roll another RC with documentation fixes already
>>  >>  >  >>  >>>  made included and attempt at clearer release notes and README.
>>  >>  >  >>  >>>
>>  >>  >  >>  >>>  Thanks, all for review and sorry to take so long to get this right.
>>  >>  >  >>  >> Please note that I am still seeing the occasional test failures (even
>>  >>  >  >>  >> after the test bug fix).
>>  >>  >  >>  >> As a result, I did not revisit the -1 for the compilation problems -
>>  >>  >  >>  >> the test failure seems like a -1 to me as well.
>>  >>  >  >>  >
>>  >>  >  >>  > In that case, I am honestly inclined to just remove / disable the
>>  >>  >  >>  > tests.  As I said before, they are fragile and frankly half-baked.
>>  >>  >  >>  > Unfortunately, they did uncover a real bug recently, so I am of two
>>  >>  >  >>  > minds on this.
>>  >>  >  >>  >
>>  >>  >  >>  > What is going on in the most recent failure you reported (line 376
>>  >>  >  >>  > of TestPerUserPoolDataSource) is that a ThreadGroup is started,
>>  >>  >  >>  > launching 2 * maxActive threads, all of which try to get connections,
>>  >>  >  >>  > hold them for (sic) 1 ms and then release them.  MaxWait is 100 ms and
>>  >>  >  >>  > maxActive is 10.   This "should" work as the effective throughput
>>  >>  >  >>  > should be ~10 requests / ms (that assumes perfect efficiency and no
>>  >>  >  >>  > execution time, which is not quite right), so 20 requests should
>>  >>  >  >>  > complete in ~20 ms.
>>  >>  >  >>
>>  >>  >  >>
>>  >>  >  >> Sorry - that should be 2 ms.
>>  >>  >  >
>>  >>  >  > If maxWait is 100ms, and each thread waits 1ms, surely this should
>>  >>  >  > always work?
>>  >>  >  > Even if each wait actually takes 50ms, the first 10 requests will take
>>  >>  >  > approx 50ms, and the remaining 10 requests will then get their
>>  >>  >  > connections.
>>  >>  >  >
>>  >>  >  > In the tests I ran last year (!), at least some of the failed tests
>>  >>  >  > showed that 10 of the threads timed out, i.e. none of the original 10
>>  >>  >  > threads had finished. It seems a bit unlikely that this is really an
>>  >>  >  > issue with the processing times.
>>  >>  >  >
>>  >>  >  > I think this needs closer investigation - I'll try and add some more
>>  >>  >  > debug for the failed cases.
>>  >>  >
>>  >>  >
>>  >>  > Thanks.  I just completed 1000 runs each using Apple 1.5, 1.6, Sun
>>  >>  >  1.6 and JRockit 1.4 (last two on Ubuntu 9.10) with no failures.
>>  >>
>>  >>
>>  >> Any tests using multiple core systems?
>>  >>
>>  >>
>>  >>  >  You are correct that with maxActive = 10, throughput should be
>>  >>  >  nearly 10/ms, so 20 should finish in 2ms.  There are three things
>>  >>  >  that can dampen the throughput:
>>  >>  >
>>  >>  >  1) Elapsed time between when a thread invokes sleep(1) and performs
>>  >>  >  its next action (which is to return the connection it is holding)
>>  >>  >  2) Elapsed time waiting for a waiting thread to respond to notify
>>  >>  >  3) There is a trivial amount of code executed by the threads holding
>>  >>  >  the connections and of course the pool itself executes some code.
>>  >>  >
>>  >>  >  What JDK are you using when you see these failures?
>>  >>
>>  >>
>>  >> java version "1.6.0_17"
>>  >>  Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
>>  >>  Java HotSpot(TM) Client VM (build 14.3-b01, mixed mode, sharing)
>>  >>
>>  >>  This is on Windows XP, dual-processor (Centrino).
>>  >>
>>  >>  There is another bug in the test - it does not wait for all the
>>  >>  threads to finish.
>>  >>  However, I don't think this affects the result, as the first test is
>>  >>  the one that fails, so there can't be any threads at that point.
>>  >>  However it could affect the second test, as the same driver and pool
>>  >>  is used. The two tests should probably be separate test cases.
>>  >>
>>  >
>>  > When a test fails for me, 10 threads get timeouts.
>>  > All the first 10 threads take longer than 100ms to complete and all
>>  > take about the same amount of time (within 5ms or so).
>>
>>
>> There should be 20 threads launched by the test that does not expect
>>  timeouts.  So 10 are completing in time and 10 are timing out?
> 
> 10 complete without any failures, however they all take over 100ms to
> complete - e.g. 160ms or 200ms - and so the other 10 threads suffer
> timeouts.
> 
> 
>>  > This does not seem to be due to cpu starvation, because the timeouts
>>  > occur some while before the first 10 threads complete. This suggests
>>  > to me that the JVM is not being stalled by garbage collection or
>>  > external activities.
>>
>>
>> I doubt it is either CPU starvation or garbage collection, but it
>>  could be clock resolution or thread scheduling.
> 
> Looks like it might be thread scheduling.
> 
> I've added some System.nanoTime() calls around all the method calls in
> the run() method, and so far all the failures occur when
> Thread.sleep(1) takes much longer than 1ms.

Yes, I read somewhere that this is not guaranteed to complete in <
10 ms on any platform and can take longer on Windows.
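
For anyone who wants to check this on their own box, here is a minimal
sketch of that kind of measurement. It is not the instrumentation added to
the test; the class and method names (SleepTimer, maxSleepNanos) are made up
for illustration. It only times Thread.sleep() with System.nanoTime() and
reports the worst case seen.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the DBCP test suite.
class SleepTimer {

    /**
     * Calls Thread.sleep(millis) the given number of times and returns the
     * longest observed duration in nanoseconds. Stops early if interrupted.
     */
    static long maxSleepNanos(long millis, int runs) {
        long max = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
            long elapsed = System.nanoTime() - start;
            if (elapsed > max) {
                max = elapsed;
            }
        }
        return max;
    }

    public static void main(String[] args) {
        long worst = maxSleepNanos(1, 1000);
        System.out.println("Worst Thread.sleep(1) over 1000 runs: "
                + TimeUnit.NANOSECONDS.toMillis(worst) + " ms");
    }
}
```

If the worst case printed here approaches or exceeds maxWait (100 ms in the
test), that would be consistent with the scheduling explanation above.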
> 
> Normally, this only takes 1-30ms, but every so often the sleep lasts for 100+ms.
> 
> Not quite sure how to fix this.
> Perhaps increase maxWait() for this particular test? It will need to
> be at least 350ms, judging by some of the recent test runs.

I thought about doing that, but that sort of defeats the purpose of
the test.  This is one reason that I was thinking about disabling
it, but as I said before, these tests did point to a real bug
before, so I would actually like to rectify if possible.  Maybe just
increasing maxWait (for this case only) is a good idea.
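
To make the trade-off concrete, here is a rough back-of-the-envelope model,
a sketch only. It assumes the threads run in strict waves of maxActive and
that each wave holds its connections for the same time; it ignores pool
overhead and notify latency. MaxWaitEstimate and its methods are
hypothetical names, not anything in DBCP.

```java
// Simplified wave model for illustration; names are hypothetical.
class MaxWaitEstimate {

    /**
     * Worst-case wait (ms) for a thread in the last wave, assuming threads
     * are served in waves of maxActive and each wave holds for holdMillis.
     */
    static long worstCaseWaitMillis(int threads, int maxActive, long holdMillis) {
        int waves = (threads + maxActive - 1) / maxActive; // ceiling division
        return (waves - 1) * holdMillis;
    }

    /** True if, under this model, the last wave waits longer than maxWait. */
    static boolean timesOut(int threads, int maxActive, long holdMillis, long maxWaitMillis) {
        return worstCaseWaitMillis(threads, maxActive, holdMillis) > maxWaitMillis;
    }

    public static void main(String[] args) {
        // 20 threads, maxActive = 10, maxWait = 100 ms, as in the test.
        System.out.println("hold 1 ms times out:   " + timesOut(20, 10, 1, 100));
        System.out.println("hold 160 ms times out: " + timesOut(20, 10, 160, 100));
    }
}
```

Under this model the second wave times out as soon as the first wave's
actual hold time (sleep included) exceeds maxWait, which matches the
160-200 ms completions reported above.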

> 
> The debugging also shows clearly that the threads started by the test
> case do not finish before the method completes. In fact in one test,
> the test method multipleThreads() finished (and returned the value of
> success[0]) before the first thread completed. As it happened, the
> first thread failed, but of course the failure was not caught because
> the success[0] variable had already been read.
> 
> I can fix this by using Thread.join(). This will also guarantee that
> the success[0] variable is made visible to the test thread, which at
> present is not necessarily the case.

That would probably be an improvement.
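
A minimal sketch of the join() approach, under stated assumptions: the
class and method names are hypothetical, and Thread.sleep() stands in for
borrowing, holding and returning a connection. The point is only that the
test thread reads the per-thread result after join() returns, so every
worker has finished and its writes are visible.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the actual DBCP test code.
class JoinAllWorkers {

    /** Starts the workers, waits for all of them, and returns how many failed. */
    static int runAndCountFailures(int workers, final long holdMillis) {
        final boolean[] failed = new boolean[workers];
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < workers; i++) {
            final int id = i;
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        // Stand-in for getConnection(), hold, close().
                        Thread.sleep(holdMillis);
                    } catch (InterruptedException e) {
                        failed[id] = true;
                    }
                }
            });
            threads.add(t);
            t.start();
        }
        int failures = 0;
        for (int i = 0; i < threads.size(); i++) {
            try {
                // Blocks until the worker has terminated; its writes to
                // failed[] are visible to this thread once join() returns.
                threads.get(i).join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failures++; // could not confirm this worker's result
                continue;
            }
            if (failed[i]) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failures: " + runAndCountFailures(20, 1));
    }
}
```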

I am about to jump on a plane, so will be dark until tomorrow (US
EST).  Thanks for chasing this down.  Pls go ahead and commit
improvements if you can get it working consistently and you are
satisfied that there are no real bugs lurking.

Phil
> 
>>  >
>>  > I don't know yet which part of the thread is taking the most time.
>>  > I'll add more detailed timers tomorrow; hopefully this will give a
>>  > better clue as to what is happening.
>>  >
>>  >>  >  One thing to look at to rule out a [pool] bug is to see if you get
>>  >>  >  failures using pool 1.4.
>>  >>  >
>>  >>
>>  >>
>>  >>  >> Not sure I follow - the pom specifies pool 1.5.4, so why would
>>  >>  using pool 1.4 help?
>>  >>
>>  >>
>>  >>  >
>>  >>  >  >
>>  >>  >  >>  > The test waits 100 ms.  Given the fact that
>>  >>  >  >>  > perfect efficiency is obviously unrealistic, you can see that
>>  >>  >  >>  > especially with bad clock resolution and poor thread management
>>  >>  >  >>  > performance (Windoz is known for both), this is going to fail now
>>  >>  >  >>  > and then. FWIW, I have not seen a failure on OS X or Ubuntu (as OS X
>>  >>  >  >>  > guest) since sebb's last patch.
>>  >>  >  >>  >
>>  >>  >  >>  > Barring objections, I am leaning toward removing the tests.
>>  >>  >  >>  >
>>  >>  >  >>  > Phil
>>  >>  >  >>  >> I hope to try and look at the failures again tomorrow.
>>  >>  >  >>  >>
>>  >>  >  >>  >> It would be helpful if others could try running the failing test as
>>  >>  >  >>  >> well (you'll need a script to do this as it only fails about 1% of the
>>  >>  >  >>  >> time or less)
>>  >>  >  >>  >>
>>  >>  >  >>  >>>  Phil
>>  >>  >  >>  >>>
>>  >>  >  >>  >>>  Phil Steitz wrote:
>>  >>  >  >>  >>>  > Hopefully all problems with JDK versions and the site build have now
>>  >>  >  >>  >>>  > been resolved.  As previously discussed, the only difference between
>>  >>  >  >>  >>>  > 1.3 and 1.4 is that the 1.3 sources have been filtered to exclude
>>  >>  >  >>  >>>  > JDBC4 methods.  Version 1.3 is for JDK 1.4-1.5 and only builds under
>>  >>  >  >>  >>>  > one of these JDKs.  Note that to execute the 1.3 maven build under
>>  >>  >  >>  >>>  > JDK 1.4 you need a 2.0.x version of maven.
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > Here are the artifacts:
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > 1.3 (JDBC 3) version:
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.3-rc6
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.3-rc6/site
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.3-rc6/maven
>>  >>  >  >>  >>>  > http://svn.apache.org/repos/asf/commons/proper/dbcp/tags/DBCP_1_3_RC6/
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > 1.4 (JDBC 4) version:
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.4-rc6
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.4-rc6/site
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/dbcp-1.4-rc6/maven
>>  >>  >  >>  >>>  > http://svn.apache.org/repos/asf/commons/proper/dbcp/tags/DBCP_1_4_RC6/
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > Release notes (common version, ships with both)
>>  >>  >  >>  >>>  > http://people.apache.org/~psteitz/RELEASE-NOTES.txt
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > Votes, please. This VOTE will close 01-January-2010 03:30 GMT.
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > [ ] +1 Proceed with release
>>  >>  >  >>  >>>  > [ ] +0 OK
>>  >>  >  >>  >>>  > [ ] -0 OK, but I would prefer...
>>  >>  >  >>  >>>  > [ ] -1 No, showstopper = ...
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > Thanks!
>>  >>  >  >>  >>>  >
>>  >>  >  >>  >>>  > Phil
>>  >>  >  >>  >>>
>>  >>  >  >>  >>>
>>  >>  >  >>  >>>  ---------------------------------------------------------------------
>>  >>  >  >>  >>>  To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>  >>  >  >>  >>>  For additional commands, e-mail: dev-help@commons.apache.org
>>  >>  >  >>  >>>
>>  >>  >  >>  >>>