hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: Performance Testing
Date Thu, 21 Jun 2012 21:47:16 GMT
Concur. That's ambitious!

On Thu, Jun 21, 2012 at 1:57 PM, Ryan Ausanka-Crues
<ryan@palominolabs.com> wrote:
> Thanks Matt. These are great!
> ---
> Ryan Ausanka-Crues
> Palomino Labs, Inc.
> ryan@palominolabs.com
> (m) 805.242.2486
> On Jun 21, 2012, at 12:36 PM, Matt Corgan wrote:
>> These are geared more towards development than regression testing, but here
>> are a few ideas that I would find useful:
>> * Ability to run the performance tests (or at least a subset of them) on a
>> development machine would help people avoid committing regressions and
>> would speed development in general
>> * Ability to test a single region without heavier weight servers and
>> clusters
>> * Letting the test run with multiple combinations of input parameters
>> (block size, compression, blooms, encoding, flush size, etc, etc).
>> Possibly many combinations that could take a while to run
>> * Output results to a CSV file that's importable to a spreadsheet for
>> sorting/filtering/charting.
>> * Email the CSV file to the user notifying them the tests have finished.
>> * Getting fancier: ability to specify a list of branches or tags from git
>> or subversion as inputs, which would allow the developer to tag many
>> different performance changes and later figure out which combination is the
>> best (all before submitting a patch)
>> On Thu, Jun 21, 2012 at 12:13 PM, Elliott Clark <eclark@stumbleupon.com> wrote:
>>> I actually think that more measurements are needed than just per release.
>>> The best I could hope for would be a four node+ cluster (one master and
>>> three slaves) that, for every check-in on trunk, runs multiple different
>>> perf tests.
>>>  - All Reads (Scans)
>>>  - Large Writes (Should test compactions/flushes)
>>>  - Read Dominated with 10% writes
>>> Then every checkin can be evaluated and large regressions can be treated as
>>> bugs.  And with that we can see the difference between the different
>>> versions as well. http://arewefastyet.com/ is kind of the model that I
>>> would love to see.  And I'm more than willing to help wherever needed.
>>> However, in reality every night will probably be more feasible.  And four
>>> nodes is probably not going to happen either.
>>> On Thu, Jun 21, 2012 at 11:38 AM, Andrew Purtell <apurtell@apache.org> wrote:
>>>> On Wed, Jun 20, 2012 at 10:37 PM, Ryan Ausanka-Crues
>>>> <ryan@palominolabs.com> wrote:
>>>>> I think it makes sense to start by defining the goals for the
>>>>> performance testing project and then deciding what we'd like to
>>>>> accomplish. As such, I start by soliciting ideas from everyone on what
>>>>> they would like to see from the project. We can then collate those
>>>>> thoughts and prioritize the different features. Does that sound like a
>>>>> reasonable approach?
>>>> In terms of defining a goal, the fundamental need I see for us as a
>>>> project is to quantify performance from one release to the next, thus
>>>> be able to avoid regressions by noting adverse changes in release
>>>> candidates.
>>>> In terms of defining what "performance" means... well, that's an
>>>> involved and separate discussion I think.
>>>> Best regards,
>>>>   - Andy
>>>> Problems worthy of attack prove their worth by hitting back. - Piet
>>>> Hein (via Tom White)

Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet
Hein (via Tom White)
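
Two of the ideas quoted from Matt above, running a test over many combinations of input parameters and writing the results to a CSV file, could be sketched roughly as follows. This is only an illustration under stated assumptions: the parameter names echo the thread, but the values, the `run_benchmark` placeholder, and every function name here are hypothetical and do not call HBase or any existing HBase tool.

```python
import csv
import io
import itertools
import time

# Hypothetical parameter grid. The names echo the thread; the values are
# illustrative only and are not recommended HBase settings.
PARAM_GRID = {
    "block_size_kb": [16, 64],
    "compression": ["NONE", "GZ"],
    "bloom": ["NONE", "ROW"],
}


def expand_grid(grid):
    """Yield one dict per combination of parameter values."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_benchmark(params):
    """Placeholder workload standing in for a real single-region test.

    It ignores params and only times some dummy work; it is NOT an HBase call.
    """
    start = time.perf_counter()
    sum(range(10_000))
    return time.perf_counter() - start


def run_sweep(grid, out):
    """Run every combination, write one CSV row per run, return the run count."""
    fieldnames = list(grid) + ["elapsed_seconds"]
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    count = 0
    for params in expand_grid(grid):
        row = dict(params)
        row["elapsed_seconds"] = f"{run_benchmark(params):.6f}"
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    total = run_sweep(PARAM_GRID, buffer)
    print(f"{total} combinations")
    print(buffer.getvalue())
```

Writing to any file-like object keeps the sweep easy to run on a development machine; passing an opened file instead of the `io.StringIO` buffer would produce a CSV that a spreadsheet can import for sorting and charting.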
