Subject: Re: Coordinated Omission
From: Kirk Pepperdine
Date: Tue, 6 Aug 2013 07:42:49 +0200
To: "JMeter Users List"

Hi Sebb,

On 2013-08-06, at 2:54 AM, sebb wrote:

> On 3 August 2013 21:45, Kirk Pepperdine wrote:
>> Hi all,
>>
>> To all who may be interested: Gil Tene (from Azul Systems) has started
>> a thread on Coordinated Omission. It's his name for a problem that I
>> noted in JMeter a while back, and for which the current fix is the
>> "start threads on demand" check box. The problems Gil describes go
>> much deeper than the problems that are due to JMeter's current
>> threading model and lack of an event-based scheduler (which would
>> provide at least a 10x boost in scalability and make it less likely
>> that JMeter will work to hide bottlenecks in your application). The
>> effect of CO is that the test harness becomes like a ref making a bad
>> call at a critical point in the game.
>
> This is dependent on the test load; not all tests are affected by the
> problem.

I would agree that not all load tests are affected by the problem,
though IME most are.
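To make the effect concrete, here is a toy model (illustrative numbers
only, not JMeter code) of a closed-loop tester that waits for each
response before sending the next request. One 2-second stall swallows
the requests that should have been sent during it, so the recorded
percentiles look healthy while on-schedule users would have seen most of
the stall:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of coordinated omission. Intended rate: one request per
// 100 ms. The server answers in 5 ms except for one 2000 ms stall. A
// closed-loop tester records the stall as a single bad sample; the ~19
// requests that should have been sent during it are never issued.
public class CoordinatedOmissionDemo {

    public static void main(String[] args) {
        final long intervalMs = 100, normalMs = 5, stallMs = 2000;

        // What the closed-loop harness records: 99 good samples, 1 bad one.
        List<Long> recorded = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            recorded.add(i == 50 ? stallMs : normalMs);
        }

        // What users arriving on the intended schedule would have seen:
        // every request landing during the stall waits out its remainder.
        List<Long> experienced = new ArrayList<>(recorded);
        for (long wait = stallMs - intervalMs; wait > 0; wait -= intervalMs) {
            experienced.add(wait);
        }

        System.out.printf("recorded    p99 = %d ms%n", percentile(recorded, 0.99));
        System.out.printf("experienced p99 = %d ms%n", percentile(experienced, 0.99));
    }

    static long percentile(List<Long> samples, double p) {
        List<Long> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        return sorted.get((int) Math.ceil(p * sorted.size()) - 1);
    }
}

The recorded data says p99 = 5 ms; the on-schedule view says p99 is
nearly two seconds. That's the ref making the bad call.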
>> Anyways, I shan't repeat the contents of the thread here. You can
>> find it at mechanical-sympathy@googlegroups.com, a mailing list
>> started by Martin Thompson, well known for his work on the Disruptor
>> framework.
>
> Interesting thread.
> As it points out, this affects many testing systems; JMeter is not
> alone in being affected.

Absolutely true, never intended to suggest that it was. Of the load
testing tools out there, JMeter continues to rank very highly because of
its approachability. I can teach a group of people how to execute rich
load tests using JMeter in about an hour. This is not true of any of the
other tooling that is out there.

> As I understand it, the problem occurs when one or more samples are
> delayed because the previous sample did not complete in time.
> Assuming that the slowdown is caused by the system under test (SUT),
> the samples that cannot be sent are also likely to have taken longer
> than usual.
> So one overlong sample response can hide several others that would
> have occurred, thus affecting the statistics, particularly the high
> percentile stats.

I think the premise of the discussion is that something happens in the
harness that prevents it from firing when it should. With JMeter, a long
response time from the app would also contribute to this problem.

In JMeter, a change in the threading model would help prevent this
problem and would make JMeter far more scalable. Currently what happens
is that a thread picks up a script (ThreadGroup) and executes it. The
problem is that the script is far too granular a unit of work to give to
a thread, and that problem becomes magnified when you loop over the
script. This creates the 1 thread == 1 user dependency that I've
mentioned in the past, and it is this dependency that, IMHO, limits the
scalability of JMeter.

To break this dependency, one would need to offer the thread a much less
granular task than an entire script. So, what if instead of getting a
script, a thread only got a sampler? What if the samplers were held in
an event heap, sorted by the time at which each sampler was to be
triggered? Threads would then pick up an event, fire it, and then
calculate and inject into the heap the next event as directed by the
script. This way, instead of having a thread tied down, it could be made
available to trigger the next sample. This would allow the test to
maintain a much more consistent load on the server by making it less
likely that the test is throttled by side effects in the load injector.
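A rough sketch of what that could look like (names and structure are
mine, not a proposed JMeter API; java.util.concurrent.DelayQueue plays
the role of the event heap):

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Sketch of an event-heap scheduler. Samplers are queued by intended
// fire time; a small pool of worker threads drains the heap, fires
// whichever event is due, and re-injects the next event for that
// virtual user. No thread is pinned to one user for the life of a script.
public class EventHeapScheduler {

    // A sampler due to fire at a given time (stand-in for a real sampler).
    static final class SampleEvent implements Delayed {
        final long fireAtNanos;
        final int virtualUser;

        SampleEvent(long fireAtNanos, int virtualUser) {
            this.fireAtNanos = fireAtNanos;
            this.virtualUser = virtualUser;
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(fireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    private final DelayQueue<SampleEvent> heap = new DelayQueue<>();

    void start(int workers) {
        for (int i = 0; i < workers; i++) {
            new Thread(() -> {
                try {
                    while (true) {
                        SampleEvent event = heap.take(); // blocks until one is due
                        fire(event);                     // issue request, record sample
                        // Crucially, the next fire time comes from the
                        // *intended* schedule, not from when the response
                        // arrived, so load generation stays decoupled from
                        // response time.
                        heap.put(new SampleEvent(event.fireAtNanos + pacingNanos(),
                                                 event.virtualUser));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // worker shut down
                }
            }).start();
        }
    }

    void fire(SampleEvent event) { /* execute the sampler, record timing */ }

    long pacingNanos() { return TimeUnit.MILLISECONDS.toNanos(100); } // per script
}

A slow response still ties up the one worker waiting on it, but the heap
keeps the remaining workers supplied with on-schedule events; firing
asynchronously would remove even that coupling.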
> To avoid this, we need to ensure that long sample responses do not
> prevent the generation of the next sample at the correct time.
>
> There are a couple of ways to mitigate this with the current JMeter
> design (which waits for a sample to complete before continuing with
> the next).
>
> 1) Ensure that each thread only needs to send requests at a relatively
> low rate, so that slow responses do not use up all the wait time.
> This may not be possible with a single JMeter instance, in which case
> use one instance to create the bulk of the load (ignoring the issue of
> delayed requests), and a second to measure the sample times. This
> second instance should have a low transaction rate per thread so slow
> responses don't affect the generated load, and should be used to
> derive the statistics.

CO can occur under low loads as well as high ones.

> 2) Use timeouts to abandon slow responses so that they cannot cause a
> slowdown (this might not always be suitable).

CO is a reporting error. By omitting the data point you will have only
further contributed to CO.

> It should be fairly obvious from the test timings (particularly the
> transaction rate) whether this has happened.

IME it's not been obvious that this has been happening unless you are
inclined to dig deeper into what is happening with your load test. Many
of the effects are subtle and often can only be seen when you use a
histogram or some other visualization. Using an average will only bury
this effect.

> It should still be possible to draw some useful conclusions about the
> behaviour of the SUT, e.g. what load starts to cause response
> slowdown; does the SUT start misbehaving in other ways (reporting
> errors etc.)?

I'd agree that you can still draw useful conclusions from some very
flawed benchmarks, but the real problem is: how do you know when you can
and when you shouldn't use the results from a bench? The question has
many subtle implications, and I've seen teams get it wrong; in a few
cases it almost resulted in the project being terminated.

Anyways, ever since I changed all of my testing to never loop, I've been
finding it much easier to reliably load an application using JMeter.

Regards,
Kirk

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@jmeter.apache.org
For additional commands, e-mail: user-help@jmeter.apache.org