db-derby-dev mailing list archives

From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3002) Add support for GROUP BY ROLLUP
Date Sat, 27 Jun 2009 06:44:47 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724810#action_12724810 ]

Bryan Pendleton commented on DERBY-3002:

Hi Knut, thanks again for having a look at the patch. Your help is much appreciated!

I think that writing a simple GROUP BY benchmark would be an excellent next step. Is there
such a benchmark readily available? If not, I'll put one together.

The changes to the run-time statistics are precisely because the algorithm has changed: instead
of the sort-observer technique, we now always compute aggregates in-line. The sort step no longer
"collapses" groups, so the same number of rows are output from the sort as are input. Before, the
sorter would perform the grouping as a side effect, and the number of output rows was equal to
the number of groups. So yes, the behavior change in the statistics is expected.
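
To make the difference concrete, here is a minimal sketch of in-line aggregation over sorted
input. It is not Derby's code: the class InlineGroupBySketch, the Row type and the sumByKey
method are hypothetical names chosen for illustration, and SUM stands in for any aggregate.
The sort returns exactly as many rows as it was given, and the grouping happens in the loop
that follows it.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class InlineGroupBySketch {
    /** Hypothetical input row: a non-null grouping key and a value to sum. */
    static final class Row {
        final String key;
        final int value;

        Row(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Sorts a copy of the input by key, then folds runs of adjacent rows
     * with equal keys into one running SUM per group.
     */
    static Map<String, Long> sumByKey(List<Row> input) {
        // The sort only orders rows: sorted.size() == input.size().
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Row r) -> r.key));

        Map<String, Long> result = new LinkedHashMap<>();
        String currentKey = null;
        long runningSum = 0;
        for (Row r : sorted) {
            // A change of key closes the current group.
            if (currentKey != null && !currentKey.equals(r.key)) {
                result.put(currentKey, runningSum);
                runningSum = 0;
            }
            currentKey = r.key;
            runningSum += r.value;
        }
        // Emit the last group, if there was any input at all.
        if (currentKey != null) {
            result.put(currentKey, runningSum);
        }
        return result;
    }
}
```

Calling sumByKey on the rows (b,2), (a,1), (b,3) sorts three rows and produces two groups,
a and b, which is the kind of difference between sorter output rows and group count that the
statistics now reflect.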

The in-memory hash tables are only used for DISTINCT aggregates, and will hold one copy of
every unique value of that particular column in that particular group. They could indeed run
out of memory if the distribution of data was just right. I don't have a good intuition for
(a) how often DISTINCT aggregates are used, (b) how many distinct values there tend to be
per group, and (c) what sort of data types are used for DISTINCT aggregates. In my benchmark,
I can try to construct a DISTINCT aggregate which uses an inordinate amount of memory, and
we can see how it behaves.
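
As a rough illustration of where the memory goes, the sketch below computes a COUNT(DISTINCT)
per group with one in-memory hash set that is reset at each group boundary. It is an
assumption-laden toy, not the patch's implementation: DistinctAggregateSketch, Row and
countDistinctByGroup are hypothetical names, and the values are plain strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class DistinctAggregateSketch {
    /** Hypothetical input row: a non-null group key and the aggregated column's value. */
    static final class Row {
        final String groupKey;
        final String value;

        Row(String groupKey, String value) {
            this.groupKey = groupKey;
            this.value = value;
        }
    }

    /** Counts the distinct values seen in each group of the (sorted) input. */
    static Map<String, Integer> countDistinctByGroup(List<Row> input) {
        List<Row> sorted = new ArrayList<>(input);
        sorted.sort(Comparator.comparing((Row r) -> r.groupKey));

        Map<String, Integer> result = new LinkedHashMap<>();
        // Holds one copy of every unique value in the current group, so its
        // size (and memory use) grows with the number of distinct values.
        Set<String> seen = new HashSet<>();
        String currentKey = null;
        for (Row r : sorted) {
            if (currentKey != null && !currentKey.equals(r.groupKey)) {
                result.put(currentKey, seen.size());
                seen.clear(); // start the next group with an empty set
            }
            currentKey = r.groupKey;
            seen.add(r.value); // duplicates within the group are ignored
        }
        if (currentKey != null) {
            result.put(currentKey, seen.size());
        }
        return result;
    }
}
```

In this sketch the worst case is a single group in which every row carries a different, large
value: the set then retains every value of the column at once.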

> Add support for GROUP BY ROLLUP
> -------------------------------
>                 Key: DERBY-3002
>                 URL: https://issues.apache.org/jira/browse/DERBY-3002
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions:
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: fixWhiteSpace.diff, IncludesASimpleTest.diff, passesRegressionTests.diff, prototypeChangeNoTests.diff, rewriteGroupByRS.diff, rollupNullability.diff, useLookahead.diff
> Provide an implementation of the ROLLUP form of multi-dimensional grouping according to the SQL standard.
> See http://wiki.apache.org/db-derby/OLAPRollupLists for some more detailed information about this aspect of the SQL standard.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
