jmeter-user mailing list archives

From sebb <seb...@gmail.com>
Subject Re: Coordinated Omission
Date Tue, 06 Aug 2013 00:54:03 GMT
On 3 August 2013 21:45, Kirk Pepperdine <kirk.pepperdine@gmail.com> wrote:
> Hi all,
>
> To all who may be interested: Gil Tene (from Azul Systems) has started a thread on Coordinated
> Omission (CO). It's his name for a problem that I noted in JMeter a while back, and for which the
> current fix is the start-threads-on-demand check box. The problems Gil describes go much deeper
> than the problems that are due to JMeter's current threading model and lack of event-based
> schedulers (which would provide at least a 10x boost in scalability and make it less likely
> that JMeter will work to hide bottlenecks in your application). The effect of CO is that the
> test harness becomes like a ref making a bad call at a critical point in the game.

This is dependent on the test load; not all tests are affected by the problem.

> Anyway, I shan't repeat the contents of the thread here. You can find it at mechanical-sympathy@googlegroups.com,
> a mailing list started by Martin Thompson, well known for his work on the Disruptor framework.

Interesting thread.
As it points out, this affects many testing systems; JMeter is not
alone in being affected.

As I understand it, the problem occurs when one or more samples are
delayed because the previous sample did not complete in time.
Assuming that the slowdown is caused by the system under test (SUT),
the samples that cannot be sent are also likely to have taken longer
than usual.
So one overlong sample response can hide several others that would
have occurred, thus affecting the statistics, particularly the high
percentile stats.
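
A minimal sketch of the effect, assuming a single thread that waits for
each response before sending the next. The function name, the 100 ms
interval and the 2000 ms stall are made-up values for illustration, and
the "corrected" figure assumes a request sent on time would have waited
behind the stall.

```python
def simulate(service_times_ms, interval_ms):
    """Closed-loop sender: one thread, each send waits for the previous response.

    Returns (measured, corrected) response times in ms.
    measured  = timed from the actual send (what the harness records)
    corrected = timed from when the sample was due to be sent
    """
    measured, corrected = [], []
    clock = 0.0
    for i, service in enumerate(service_times_ms):
        intended = i * interval_ms
        start = max(clock, intended)  # cannot send before the previous sample ends
        end = start + service
        measured.append(end - start)
        corrected.append(end - intended)
        clock = end
    return measured, corrected

# 100 samples due every 100 ms; the SUT answers in 10 ms except for one stall.
service = [10.0] * 100
service[50] = 2000.0
measured, corrected = simulate(service, 100.0)

slow_measured = sum(1 for t in measured if t > 100)
slow_corrected = sum(1 for t in corrected if t > 100)
print(slow_measured, slow_corrected)
```

With these numbers the harness records only one sample over 100 ms,
whereas 22 samples exceed 100 ms when timed from their due time.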

To avoid this, we need to ensure that long sample responses do not
prevent the generation of the next sample at the correct time.

There are a couple of ways to mitigate this with the current JMeter
design (which waits for a sample to complete before continuing with
the next).

1) Ensure that each thread only needs to send requests at a relatively
low rate, so that slow responses do not use up all the wait time.
This may not be possible with a single JMeter instance, in which case
use one instance to create the bulk of the load (ignoring the issue of
delayed requests), and a second to measure the sample times. This
second instance should have a low transaction rate per thread so slow
responses don't affect the generated load, and should be used to
derive the statistics.

2) Use timeouts to abandon slow responses so that they cannot cause a
slowdown (this might not always be suitable).
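
As a rough sizing sketch for the two options above, assuming the load is
spread evenly across threads: the per-thread interval must be longer
than the worst response time, and a timeout puts a hard cap on that
worst case. The function names and figures are hypothetical, not JMeter
settings.

```python
import math

def per_thread_interval_s(total_rate_per_s, threads):
    """Time between sends for each thread when the load is spread evenly."""
    return threads / total_rate_per_s

def threads_needed(total_rate_per_s, worst_response_s):
    """Smallest thread count whose per-thread interval exceeds the worst response."""
    return math.floor(total_rate_per_s * worst_response_s) + 1

target_rate = 50.0    # requests per second for the whole test (made-up figure)
worst_response = 2.0  # seconds; a sampler timeout would cap this value

few = per_thread_interval_s(target_rate, 10)           # interval with only 10 threads
enough = threads_needed(target_rate, worst_response)   # threads needed for enough slack
print(few, enough)
```

Here 10 threads leave only 0.2 s between sends, so a 2 s response
overruns many intervals; 101 threads give each thread just over 2 s.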

It should be fairly obvious from the test timings (particularly the
transaction rate) whether samples have been delayed in this way.
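
One way to check the timings, sketched under the assumption that the
start times of one thread's samples are available as a list of
milliseconds and that the intended interval is known. The function
names and the sample data are invented for illustration.

```python
def late_sends(start_times_ms, interval_ms, tolerance_ms=0.0):
    """Indexes of samples sent more than one interval after the previous send."""
    late = []
    for i in range(1, len(start_times_ms)):
        gap = start_times_ms[i] - start_times_ms[i - 1]
        if gap > interval_ms + tolerance_ms:
            late.append(i)
    return late

def achieved_rate_per_s(start_times_ms):
    """Sends per second actually achieved between the first and last sample."""
    span_s = (start_times_ms[-1] - start_times_ms[0]) / 1000.0
    return (len(start_times_ms) - 1) / span_s

# Made-up trace: one send is held up for two seconds by a slow response.
starts = [0, 100, 200, 2200, 2300, 2400]
late = late_sends(starts, 100)
rate = achieved_rate_per_s(starts)
print(late, rate)
```

In this trace the target was 10 sends per second, but the achieved rate
is close to 2 per second, and the late send is flagged at index 3.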

The problem does not invalidate an entire test, but it does mean that
the statistics will be inaccurate if allowance is not made for the
missing sample data.
It should still be possible to draw some useful conclusions about the
behaviour of the SUT, e.g. at what load responses start to slow down,
and whether the SUT starts misbehaving in other ways (reporting
errors etc.).
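
One possible allowance for the missing sample data is to back-fill the
samples that should have been sent during each overlong response. The
sketch below is similar in spirit to the expected-interval correction
offered by Gil Tene's HdrHistogram, but it is an illustration only: the
helper names, the nearest-rank percentile and the data are assumptions.

```python
def backfill(latencies_ms, interval_ms):
    """Add estimated samples for sends that an overlong response prevented."""
    out = []
    for value in latencies_ms:
        out.append(value)
        missing = value - interval_ms
        # Each missed send would have waited one interval less than the one before.
        while missing >= interval_ms:
            out.append(missing)
            missing -= interval_ms
    return out

def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, -(-len(ordered) * pct // 100))  # ceiling division
    return ordered[rank - 1]

# Made-up recording: 99 fast samples and one 2000 ms sample, sent every 100 ms.
recorded = [10] * 99 + [2000]
adjusted = backfill(recorded, 100)

p95_raw = percentile(recorded, 95)
p95_adjusted = percentile(adjusted, 95)
print(p95_raw, p95_adjusted)
```

For this data the 95th percentile moves from 10 ms to 1500 ms once the
19 estimated samples are included, which shows how strongly the high
percentiles depend on the missing data.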
