hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Ausanka-Crues <r...@palominolabs.com>
Subject Re: Performance Testing
Date Thu, 21 Jun 2012 20:57:28 GMT
Thanks Matt. These are great!
Ryan Ausanka-Crues
Palomino Labs, Inc.
(m) 805.242.2486

On Jun 21, 2012, at 12:36 PM, Matt Corgan wrote:

> These are geared more towards development than regression testing, but here
> are a few ideas that I would find useful:
> * Ability to run the performance tests (or at least a subset of them) on a
> development machine would help people avoid committing regressions and
> would speed development in general
> * Ability to test a single region without heavier weight servers and
> clusters
> * Letting the test run with multiple combinations of input parameters
> (block size, compression, blooms, encoding, flush size, etc, etc).
> Possibly many combinations that could take a while to run
> * Output results to a CSV file that's importable to a spreadsheet for
> sorting/filtering/charting.
> * Email the CSV file to the user notifying them the tests have finished.
> * Getting fancier: ability to specify a list of branches or tags from git
> or subversion as inputs, which would allow the developer to tag many
> different performance changes and later figure out which combination is the
> best (all before submitting a patch)
> On Thu, Jun 21, 2012 at 12:13 PM, Elliott Clark <eclark@stumbleupon.com>wrote:
>> I actually think that more measurements are needed than just per release.
>> The best I could hope for would be a four node+ cluster(One master and
>> three slaves) that for every check in on trunk run multiple different perf
>> tests.
>>  - All Reads (Scans)
>>  - Large Writes (Should test compactions/flushes)
>>  - Read Dominated with 10% writes
>> Then every checkin can be evaluated and large regressions can be treated as
>> bugs.  And with that we can see the difference between the different
>> versions as well. http://arewefastyet.com/ is kind of the model that I
>> would love to see.  And I'm more than willing to help where ever needed.
>> However in reality every night will probably be more feasible.   And Four
>> nodes is probably not going to happen either.
>> On Thu, Jun 21, 2012 at 11:38 AM, Andrew Purtell <apurtell@apache.org
>>> wrote:
>>> On Wed, Jun 20, 2012 at 10:37 PM, Ryan Ausanka-Crues
>>> <ryan@palominolabs.com> wrote:
>>>> I think it makes sense to start by defining the goals for the
>>> performance testing project and then deciding what we'd like to
>> accomplish.
>>> As such, I start by soliciting ideas from everyone on what they would
>> like
>>> to see from the project. We can then collate those thoughts and
>> prioritize
>>> the different features. Does that sound like a reasonable approach?
>>> In terms of defining a goal, the fundamental need I see for us as a
>>> project is to quantify performance from one release to the next, thus
>>> be able to avoid regressions by noting adverse changes in release
>>> candidates.
>>> In terms of defining what "performance" means... well, that's an
>>> involved and separate discussion I think.
>>> Best regards,
>>>   - Andy
>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>> Hein (via Tom White)

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message