jmeter-user mailing list archives

From sebb <>
Subject Re: Coordinated Omission - detection and reporting ONLY
Date Mon, 21 Oct 2013 01:27:09 GMT
On 19 October 2013 17:37, Gil Tene <> wrote:
> On Oct 19, 2013, at 4:45 AM, sebb <> wrote:
>> On 19 October 2013 02:17, Gil Tene <> wrote:
>>>> [Trying again - please do not hijack this thread.]
>>>> The Constant Throughput Timer (CTT) calculates the desired wait time,
>>>> and if this is less than zero - i.e. a sample should already have been
>>>> generated - it could trigger the creation of a failed Assertion (or similar)
>>>> showing the time difference.
>> N.B.     ^^^^^^^^^^^
> I missed your point here. I thought you were looking to add detection without changing
> the code.

Some code clearly has to be written ...

> So you are suggesting changing the code for CTT to do this?

Yes; this would be trivial; CTT already has to calculate the delay.
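As an illustration, the detection point might look something like the following (a hypothetical sketch with invented names, not the actual CTT source):

```java
// Hypothetical sketch (invented names, not the real CTT source) of the
// detection point being discussed: the timer computes the wait until the
// next scheduled sample; a negative wait means the previous sample overran
// its slot, i.e. a missed start time that could be reported.
public class CttDetectSketch {
    private final long intervalMillis;   // e.g. 1000 / target samples-per-second
    private long nextScheduledMillis;    // when the next sample should start

    public CttDetectSketch(long intervalMillis, long startMillis) {
        this.intervalMillis = intervalMillis;
        this.nextScheduledMillis = startMillis;
    }

    // Returns the wait in ms; a negative value is the missed-start time
    // that a failed Assertion (or similar) could carry.
    public long computeWait(long nowMillis) {
        long wait = nextScheduledMillis - nowMillis;
        nextScheduledMillis += intervalMillis;
        return wait;
    }
}
```

E.g. with a 100 ms interval, a sample that does not get sent until 350 ms produces a wait of -250 ms, which is exactly the missed-start time to report.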

> And if so, I assume you would recommend people add or use CTT for CO detection.


Also, another Assertion element could be created that checks whether
the throughput is within range.

>>>> Would this be sufficient to detect all CO occurrences?
>>> Two issues:
>>> 1. It would detect that CO probably happened, but not how much of it happened.
>>> Missing 1 msec or 1 minute will look the same.
>> Huh? The time difference would be reported - is that not enough?
> As discussed below, if you do report the time, it is useful for pointing out how bad
> things are. You'll probably need to somehow accumulate the reported time to make sense of
> it, though. The interesting information to present is what % of total time was spent in CO.
> The pessimistic interpretation (which is the one to take unless you do more detailed analysis
> with the data collected) is that for an accumulated CO time of X over a test run of length
> Y, 100 * X/Y represents the percent of unreported operations that should be assumed to have
> displayed the worst case observed number in the run.
> I'd be careful not to fall into the trap of reporting something like "X" time in CO experienced
> over "N" CO occurrences and leading people to average things out. While it would be tempting
> to divide X by N, a single huge CO event could dominate behavior next to a thousand tiny ones,
> leading to a very wrong interpretation of effects. Instead, you could report "Total CO time
> X1" with "Largest single CO event X2". Or better yet, collect and report the entire histogram
> of CO event lengths.

It would be easy enough to add a delay field to the sample results.
This could then be analysed.
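To illustrate the reporting scheme suggested above, a delay accumulator might be sketched as follows. The names are invented, and the power-of-two bucketing is an assumption, chosen so that one huge event is never averaged away by a thousand tiny ones:

```java
import java.util.TreeMap;

// Illustrative sketch (invented names) of accumulating total CO time and
// the largest single event rather than an average, plus a histogram of
// CO event lengths as suggested in the thread.
public class CoAccumulatorSketch {
    long totalCoMillis = 0;
    long largestCoMillis = 0;
    final TreeMap<Long, Long> histogram = new TreeMap<>(); // bucket floor -> count

    void record(long coMillis) {
        totalCoMillis += coMillis;
        largestCoMillis = Math.max(largestCoMillis, coMillis);
        // power-of-two buckets: 1000 ms lands in the 512 ms bucket,
        // keeping large and tiny events visibly separate
        long bucket = Long.highestOneBit(Math.max(coMillis, 1));
        histogram.merge(bucket, 1L, Long::sum);
    }
}
```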

>>> 2. It would only detect CO in test plans that include an actual CTT (Constant
>>> Throughput Timer). It won't work for other timers, or when no timers are used.
>> Indeed, but in such cases is there any indication of the expected
>> transaction rate?
> Detection doesn't require a constant or known transaction rate. It just requires that
> you know the next transaction was not started when it was supposed to be.

Of course.

> In my experience, the "concurrent user count" injection rate approach is much more common
> than the "constant transaction rate" one. The concurrent user count approaches usually have
> the individual user threads use constant or semi-random think time emulation instead of a
> CTT timer. This does mean that their throughput varies with response time length, but you
> can still detect CO in such a test. One way to do so is to calculate an estimated interval
> rate based on observing the actual behavior of the test for a while (equivalent to establishing
> an estimated throughput through observation), and flagging strong outliers after some confidence
> level has been established (e.g. flag things that lie more than 3-4 std. deviations away from
> the mean interval).
> That's the sort of thing the OutlierCorrector's detection code does. You can use it to
> generically detect CO regardless of the timer used.
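A minimal sketch of that interval-outlier idea (not the actual OutlierCorrector code; the method name and the way the threshold is applied are assumptions):

```java
// Establish a mean interval by observation, then flag intervals lying
// more than k standard deviations above it as likely CO. Illustrative
// sketch only, not the OutlierCorrector implementation.
public class IntervalOutlierSketch {
    static boolean[] flagOutliers(double[] intervals, double k) {
        double mean = 0;
        for (double v : intervals) mean += v;
        mean /= intervals.length;
        double var = 0;
        for (double v : intervals) var += (v - mean) * (v - mean);
        double std = Math.sqrt(var / intervals.length);
        boolean[] flags = new boolean[intervals.length];
        for (int i = 0; i < intervals.length; i++) {
            // a strong positive outlier means a sample started far later
            // than the established rhythm predicts
            flags[i] = intervals[i] > mean + k * std;
        }
        return flags;
    }
}
```

E.g. sixteen intervals of ~100 ms followed by one of 2000 ms would flag only the 2000 ms interval at k = 3.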
> I can see a benefit in using CTT to flag CO detection. Its benefit lies in the fact
> that with CTT the user explicitly states an expected transaction rate.

Exactly; it's simple to do.

> I can separately see a benefit to adding a new sampler or timer type that simply detects
> CO using a technique like we use in OutlierCorrector. Its benefit comes from applying to
> a wider set of scenarios.


> Both can be useful as warning signs. And if users are able to react by fixing their test
> to avoid triggering the warnings at all, that would be good.


> However, I am separately pessimistic about users being able to adjust tests to get around
> CO. While this is possible in some cases, my experience shows that CO is inherent to actual
> system use case behavior, and that in the majority of cases it does not come from misconfigured
> testers but from real world behavior. I.e. that real world, actual users interact with the
> system with intervals that are shorter than the time the system will stall for occasionally.

This is starting to stray from the subject of this thread which is
about detection and reporting.

>>>> If not, what other metric needs to be checked?
>>> There are various things you can look for.
>>> Our OutlierCorrector work includes a pretty elaborate CO detector. It sits either
>>> inline on the listener notification stream, or parses the log file. The detector identifies
>>> sampler patterns, establishes expected interval between patterns, and detects when the actual
>>> interval falls far above the expected interval. This is a code change to JMeter, but a pretty
>>> localized one.
>> AIUI CO happens when the sample takes much longer than usual, thus
>> (potentially) delaying any subsequent samples.
>> Overlong samples can already be detected using a Delay Assertion.
[Sorry, that should have been Duration Assertion]

> It's not exactly "longer than usual". It's "long enough to cause you to miss sending
> the next request out on time".

> You can place some margin on this (and anything other than a CTT probably has to), but
> the margin depends on the rest of the test scenario and on the actual system behavior. It
> is "hard" for users to figure out how to correctly set a Duration response assertion on samplers.
> Hard enough that it won't be done in practice IMO.

Again, this is getting off-topic for this thread.

>>>> Even if it is not the only possible cause, would it be useful as a
>>>> starting point?
>>> Yes. As a warning flag saying "throw away these test results".
>> The results are not necessarily useless; at the very least they will
>> show the load at which the system is starting to slow down.
> Correct. I thought you were talking about yes/no. With an actual missed-time indicator
> (and an accumulator across the run) there is some useful info here.
>>>> I am assuming that the CTT is the primary means of controlling the
>>>> sample request rate.
>>> Unfortunately many test scenarios I've seen use other means. Many people use
>>> other timers or other means for think time emulation.
>> The CTT is not intended for think time emulation. It is intended to
>> ensure a specific transaction rate.
> I'm not saying CTT is used for think time emulation. I'm saying many people *think* in
> terms of think time and not in terms of throughput. It's the more natural way of coming at
> the problem when they are describing what a user does with the system. They don't think in
> terms of "the user is sending me X things per second". They think "the user presses this,
> then spends 3 seconds pondering, then presses that, ...". For test plans written by such people,
> CTT won't be used, and some other sort of delay timers will be.

The issue then is that there is no indication of what the expected
transaction rate is.
So there is no way of knowing at the start whether the sample elapsed
times are such that CO has occurred.
Though I take your point that analysing the response times over a
sufficient time period can potentially be used to show when samples
are longer than usual.

However, I would expect testers to have some idea of maximum
acceptable response time for each type of request, so they should be
able to apply the relevant Duration Assertions. Likewise, they should
have some idea of expected throughput, so should be able to add a CTT
or a Transaction Rate Assertion (if such is created).

There is little point performance testing a system unless you have an
idea what the system is designed to handle.
[Stress testing is a different matter]

>> Think time obviously affects the transaction rate, but is not the sole
>> determinant.
>>>> If there are other elements that are commonly used to control the
>>>> rate, please note them here.
>>>> N.B: this thread is only for discussion of how to detect CO and how to
>>>> report it.
>>> Reporting the existence of CO is an interesting starting point. But the only
>>> right way to deal with such a report showing the existence of CO (with no magnitude or other
>>> metrics) is to say "I guess the data I got is complete crap, so all the stats and graphs
>>> I'm seeing mean nothing".
>> I disagree that the output is useless.
>> The delays are reported, so one can see how badly the test was
>> affected. If the delays are all small, then a slight adjustment of the
>> thread count and transaction rate should eliminate them.
> Agreed.
>> Only if the delays are large are the results less meaningful, though
>> they can still show the base load at which the server starts to slow
>> down.
> That would be a mistake. It assumes that CO has something to do with load, and with servers
> "slowing down". CO is more often a result of accumulated work amounts (a pay-the-piper effect),
> or of cosmic noise.
> In the real world, stalls are NOT a result of the server slowing down. They are a result
> of the server completely stalling in order to "take care of something". E.g. a quantum lost
> to another thread, or a flushing of a journal, or a garbage collection. These have no more
> than a loose, non-dominant relationship with load. E.g. in most software systems, max response
> times seen at very low loads will usually be dramatically higher than "typical" response times
> at a much higher load (where a server would be "slower"). The rate at which stalls happen
> may be affected by load (sometimes increasing in frequency when load grows, and sometimes
> decreasing in frequency).
>>> If you can report "how much" CO you saw,
>> As I wrote originally, the CTT would report the time difference (i.e. delay).
>>> it may help a bit in determining how bad the data is, and how the stats should
>>> be treated by the reader. E.g. if you know that CO totaling some amount of time X in a test
>>> of length Y had occurred, then you know that any percentile above (100 * (1 - X/Y)) is completely
>>> bogus, and should be assumed to be equal to the experienced max value. You can also take the
>>> approach that the rest of the percentiles should be shifted down by at least (100 * X/Y).
>>> E.g. if you had CO that covered only 0.01% of the total test time, that would be relatively
>>> good news.
>> Exactly.
>>> But if you had CO covering 5% of the test time, your measured 99%'ile is actually
>>> the 94%'ile. Averages are unfortunately anyone's guess when CO is in play and not actually
>>> corrected for.
>>> Once you detect both the existence and the magnitude of CO, correcting for it
>>> is actually pretty easy. The detection of "how much" is the semi-hard part.
>> Detecting the delay using the CTT is trivial; it already has to
>> calculate it to decide how long to wait. A negative wait is obviously
>> a missed start time.
>> Or am I missing something here?
> Yup. You are right. As noted, if CTT reported the delayed time this would be detected.
