db-derby-dev mailing list archives

From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-4363) Add simple benchmark for measuring GROUP BY performance
Date Mon, 07 Sep 2009 15:40:57 GMT

     [ https://issues.apache.org/jira/browse/DERBY-4363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bryan Pendleton updated DERBY-4363:

    Attachment: multiColumnBenchmark.diff

Attached is 'multiColumnBenchmark.diff', an enhanced version of the GroupByClient proposal
which can now generate a richer variety of GROUP BY statements.

It also only executes a single statement per run, since I agree with the
observation that it is hard to interpret the results of running a mixture
of statements in the same run.

I put a lot of comments into the GroupByClient header which should explain
how to invoke the benchmark to run a richer set of statements.

I gave getLoadOpt package visibility so that the GroupByClient could
interrogate the -load_opts settings in a more convenient fashion.

Continued suggestions and comments would be greatly appreciated.

Soon, I hope to find the time to run this benchmark against the current trunk,
as well as against the DERBY-3002 patch proposal, to get a first set of
numbers to explore the overall performance characteristics in a coarse fashion.

I'm hoping it will be sufficient to perform, say, 5 different GROUP BY statements
against each version of the code, at scales 10 thousand, 100 thousand, and
250 thousand rows. That will give us 15 numbers for each branch, and
maybe we can see some results from that data. 
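
As a rough illustration of that arithmetic (not part of the attached patch), the run matrix can be enumerated as below. The class name RunMatrix, the method plan, and the "statement N" labels are hypothetical placeholders; the sketch only counts the combinations and does not invoke perf.clients.Runner or assume any of its options.

```java
import java.util.ArrayList;
import java.util.List;

public class RunMatrix {

    /**
     * Builds one label per (table size, statement) pair.
     * statementCount and scales are supplied by the caller; nothing here
     * is taken from the GroupByClient itself.
     */
    static List<String> plan(int statementCount, int[] scales) {
        List<String> runs = new ArrayList<>();
        for (int rows : scales) {
            for (int s = 1; s <= statementCount; s++) {
                runs.add("statement " + s + " @ " + rows + " rows");
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // The three table sizes proposed in the message.
        int[] scales = {10_000, 100_000, 250_000};
        List<String> runs = plan(5, scales);
        for (String run : runs) {
            System.out.println(run);
        }
        // 5 statements x 3 scales = 15 measurements per code branch.
        System.out.println(runs.size() + " measurements per branch");
    }
}
```

Running the same plan once against trunk and once against the DERBY-3002 patch would give two sets of 15 numbers to compare pairwise.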

I should be able to post those runs as a "script" of 18 perf.clients.Runner statements
to be run in sequence against each code branch.

> Add simple benchmark for measuring GROUP BY performance
> -------------------------------------------------------
>                 Key: DERBY-4363
>                 URL: https://issues.apache.org/jira/browse/DERBY-4363
>             Project: Derby
>          Issue Type: Sub-task
>          Components: Test, Tools
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: multiColumnBenchmark.diff, simpleBenchmark.diff
> As part of ROLLUP implementation (DERBY-3002), it will be helpful to be able to measure
> the performance of GROUP BY.
> Using the o.a.dT.perf.clients framework, this sub-task proposes to add a GroupByClient
> to the performance runner library; the GroupByClient will run GROUP BY statements
> against the Wisconsin benchmark

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
