jmeter-user mailing list archives

From "Robin D. Wilson" <>
Subject RE: Establishing baseline metrics
Date Mon, 01 Jul 2013 20:32:49 GMT
I'm thinking I look at performance testing differently than a lot of people... For me, the
objective of performance testing is to
establish what your system _can_ do, not what you need to accomplish. So when you are setting
up your tests, you are trying to drive
your systems at maximum capacity for some extended period of time. Then you measure that capacity
as your 'baseline'.

For every subsequent release of your code, you measure it against the 'baseline', and determine
whether the code got faster or
slower. If you determine that the slower (or faster) response is acceptable to your end users
(because you were nowhere near the
user's acceptable standard), you can reset your baseline to the new measurement. If the slower
result is encroaching on the usability
of the system, you can declare that baseline the minimum spec, and then fail any code
that performs worse than it.
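
As a rough illustration of the release-versus-baseline check described above, here is a minimal Python sketch. The function name, the 10% noise tolerance, and the throughput numbers are all hypothetical; they are not taken from any real test run.

```python
def compare_to_baseline(baseline_rps, release_rps, tolerance=0.10):
    """Classify a release's measured throughput against the baseline.

    tolerance is the fractional change treated as measurement noise
    (10% here is an assumption, not a recommendation).
    """
    change = (release_rps - baseline_rps) / baseline_rps
    if change < -tolerance:
        verdict = "slower"
    elif change > tolerance:
        verdict = "faster"
    else:
        verdict = "unchanged"
    return verdict, change


# Hypothetical numbers: baseline of 500 requests/sec, new release at 430.
verdict, change = compare_to_baseline(baseline_rps=500.0, release_rps=430.0)
print(f"{verdict}: {change:+.1%}")  # prints "slower: -14.0%"
```

Whether a "slower" verdict fails the build or resets the baseline is still the judgment call described above; the sketch only flags the change.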

As for how you determine what is acceptable to a 'user', that can be handled in a number of
ways - without actually improving the
'real' performance of the system. Consider a web page that loads a bunch of rows of data in
a big table. For most users, if you can
start reading the table within 1-2 seconds, that is acceptable for a system's performance.
But if there are hundreds of rows of
data, you would not need to load _all_ the rows within 1-2 seconds to actually meet their
performance criteria. You only need to
load enough rows that the table fills the browser - so they can start reading - within the
1-2 second period. JMeter cannot really
measure this timing; it can only measure the 'overall response time' (indeed, I don't know
of any testing tool that can do it). So
trying to define a performance benchmark in terms of what 'users' experience is really difficult,
and nearly useless (to me anyway).

I look at performance testing as a way to cross-check my development team against the perpetual
tendency to gum-up the code and slow
things down. So in order to make the testing effective for the developers, I need to perf
test _very_specific_ things. Trying to
performance test the "system" as a whole is nearly an impossible task - not only because there
are so many variables that influence
the tests, but precisely because "all of those variables" make it impossible to debug which
one causes the bottleneck when there is
a change in performance from one release to the next. (Have you ever sent your programmers
off to 'fix' a performance problem that
turned out to be caused by an O/S update on your server? I have...)

Instead, we create performance tests that test specific functional systems. That is, the "login"
perf test. The "registration" perf
test. The "..." perf test. Each one of these tests is run independently, so that when we encounter
a slower benchmark - we can tell
the developers immediately where to concentrate their efforts in fixing the problem. (We also
monitor all parts of the system (CPU,
IO, database transactions (reads, writes, full table scans, etc.)) from all servers involved
in the test.) The goal is not to simulate
'real user activity', it is to max out the capacity of at least 1 of the servers in the test
(specifically the one executing the
'application logic'). If we max out that one server, we know that our 'benchmark' is the most
we can expect of a single member of
our cluster of machines. (We also test a cluster of 2 machines and measure the fall-off
in capacity between a 1-member cluster and
a 2-member cluster; this gives us an idea of how much impact our 'clustering' system has on
performance as well.) I suppose you could
say that I look at it this way: we measure the 'maximum capacity', and so long as the number
of users doesn't exceed that, we will
perform OK.
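
One way to put a number on the clustering fall-off mentioned above is to compare the measured cluster capacity with the ideal of perfect linear scaling. This is only a sketch; the function name and the throughput figures are hypothetical.

```python
def cluster_falloff(single_rps, cluster_rps, members):
    """Return the fraction of ideal (linear) capacity lost to clustering."""
    ideal_rps = single_rps * members
    return 1.0 - cluster_rps / ideal_rps


# Hypothetical numbers: one server maxes out at 500 requests/sec,
# a 2-member cluster maxes out at 900 requests/sec.
falloff = cluster_falloff(single_rps=500.0, cluster_rps=900.0, members=2)
print(f"fall-off: {falloff:.0%}")  # prints "fall-off: 10%"
```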

We do run some 'all-encompassing' system tests as well, but those are more for 'stress' testing
than for performance benchmarking.
We are specifically looking for things that start to break-down after hours of continuous
operation at peak capacity. So we monitor
error logs and look to make sure that we aren't throwing errors while under stress.

The number one thing to keep in mind about performance testing is that you have to use 'real
data'. We actually download our
production database every weekend, and strip out any 'personal information' (stuff that we
protect in our production environment) by
either nulling it out, or replacing it with bogus data. This allows us to run our performance
tests against a database that has 100s
of millions of rows of data. Nearly all of our performance 'bugs' have been caused by poor
data handling in the code (SQL requests
that don't use indices (causing a full table scan), badly formed joins, fetching a few rows
of data and then looping through them in
the code (when the 'few rows of data' from your 'dev' environment become 100,000 rows with
the production data, this tends to bog
the code down a lot), etc.). So if you are testing with 'faked' data, odds are good you will
miss a lot of performance issues - no
matter what form of performance testing you use.
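
The scrubbing step (nulling out personal fields or replacing them with bogus data) might look something like the following Python sketch. It uses an in-memory SQLite database as a stand-in for the downloaded copy; the `users` table, its columns, and the replacement values are all assumptions made up for the example.

```python
import sqlite3

# In-memory stand-in for the downloaded production copy (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO users (name, email, phone) VALUES (?, ?, ?)",
    [
        ("Alice Example", "alice@example.com", "555-0100"),
        ("Bob Example", "bob@example.com", "555-0101"),
    ],
)


def scrub_personal_info(conn):
    """Remove personal data while keeping every row in place."""
    # Null out fields the tests do not need.
    conn.execute("UPDATE users SET phone = NULL")
    # Replace fields that must stay populated with bogus values built from the row id.
    conn.execute(
        "UPDATE users SET name = 'user' || id, "
        "email = 'user' || id || '@invalid.example'"
    )
    conn.commit()


scrub_personal_info(conn)
rows = conn.execute("SELECT id, name, email, phone FROM users ORDER BY id").fetchall()
print(rows)
```

The point of scrubbing in place rather than generating fresh data is that the row counts stay the same as production, which is what exposes the full table scans and oversized loops.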

I will say that we have served over 130M web pages in 1 month, using only 5 servers (4 Tomcats
and 1 DB server). Those pages
represented about 10X that many "GET" requests to our servers.
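
To put those figures in per-second terms, here is the arithmetic as a short Python sketch. It assumes a 30-day month and an even spread of requests across the 4 Tomcats, neither of which is stated above, and it gives averages only; peak rates would be higher.

```python
pages_per_month = 130_000_000          # "over 130M", so treat results as lower bounds
gets_per_page = 10                     # "about 10X" GET requests per page
seconds_per_month = 30 * 24 * 60 * 60  # assumption: a 30-day month
tomcats = 4                            # assumption: load spread evenly across them

pages_per_sec = pages_per_month / seconds_per_month
gets_per_sec = pages_per_sec * gets_per_page
gets_per_tomcat = gets_per_sec / tomcats

# Average rates, rounded to whole numbers.
print(round(pages_per_sec), round(gets_per_sec), round(gets_per_tomcat))  # prints "50 502 125"
```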

Robin D. Wilson
Sr. Director of Web Development
KingsIsle Entertainment, Inc.

-----Original Message-----
From: nmq [] 
Sent: Monday, July 01, 2013 2:33 PM
To: JMeter Users List
Subject: Establishing baseline metrics

Hi all

This is not a JMeter-specific question, but since this user list is made up of experts in performance
testing, I figured it would be
a good place to ask.

My question is: how do you establish baselines for a website's performance if you do not have
any historic data?  Let's say this is a
new website and it's for a limited number of customers.

How do you determine the number of concurrent users you should simulate?

Let's say the executives say, off the top of their heads, that the maximum number of concurrent
users would be 50 at peak times.
Does that mean I should not go beyond 50, or should I still do tests with a higher number?

How can I go about establishing baselines for page load times if I do not have any historic
data and have no industry benchmarks or
competitor data?

Would it make sense to say let's see how the website is doing throughout the development phase
and establish our baseline using the
current response times?

I would appreciate any input.

