db-derby-dev mailing list archives

From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3002) Add support for GROUP BY ROLLUP
Date Sat, 27 Jun 2009 14:42:47 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724860#action_12724860
] 

Bryan Pendleton commented on DERBY-3002:
----------------------------------------

Indeed, I *am* concerned that there may be some circumstances where the new technique
is less efficient. One example is a query that processes a large amount of data but
produces a very small number of groups. In that case, most of the raw data ends up being
discarded during the GROUP BY processing, and the sort-observer technique will discard
the unneeded data sooner, so it may have a substantial edge over the new algorithm.

However, other circumstances may not see much impact at all:
 - an index exists, so the data can be processed in sorted order automatically
 - the data is relatively small, so it is not expensive to sort
 - there is a lot of data, but there are also many different values for the grouping column
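
To make the trade-off concrete, here is a minimal sketch of the two general strategies
being compared. It is an illustration only, not Derby's code: the class GroupByStrategies,
the Row type, and both method names are hypothetical, and the aggregate is assumed to be a
simple SUM over in-memory rows. The first method folds each row into its group as it
arrives, so it never holds more than one entry per group; the second must keep every row
until the sort finishes, and only then collapses them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GroupByStrategies {

    /** Hypothetical input row: a grouping key and a value to SUM. */
    static final class Row {
        final String key;
        final long value;

        Row(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Aggregate-early strategy: each row is merged into its group's running
     * total immediately, so memory use is proportional to the number of groups.
     */
    static Map<String, Long> aggregateEarly(Iterable<Row> rows) {
        Map<String, Long> sums = new TreeMap<>();
        for (Row r : rows) {
            sums.merge(r.key, r.value, Long::sum);
        }
        return sums;
    }

    /**
     * Sort-then-aggregate strategy: all rows are sorted by key first, then a
     * single pass emits one total each time the key changes. Memory use is
     * proportional to the number of rows unless the input is already sorted.
     */
    static Map<String, Long> sortThenAggregate(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.key));

        Map<String, Long> sums = new TreeMap<>();
        String currentKey = null;
        long running = 0;
        for (Row r : sorted) {
            if (currentKey != null && !currentKey.equals(r.key)) {
                sums.put(currentKey, running); // key changed: close the group
                running = 0;
            }
            currentKey = r.key;
            running += r.value;
        }
        if (currentKey != null) {
            sums.put(currentKey, running); // close the final group
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("east", 10), new Row("west", 5), new Row("east", 7),
                new Row("west", 1), new Row("east", 3));
        System.out.println(aggregateEarly(rows));    // {east=20, west=6}
        System.out.println(sortThenAggregate(rows)); // {east=20, west=6}
    }
}
```

Both methods return the same totals; the difference is how much raw data each one has to
hold on to before it can start discarding rows, which is where the "many rows, few groups"
case would hurt the second approach most.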

And, I am just guessing about the performance impacts; I don't know how important
this distinction will be, given the multitude of other things that occur during query processing.

Earlier in the project, I intended to support multiple algorithms. However, I then realized:
 - we'd have to have multiple sets of code for the same functionality
 - we'd have to have some way of determining, at runtime, which implementation to choose.

Both problems seemed quite troubling, and the second seemed especially important:
if the selection of the appropriate algorithm is based on information about the size
and distribution of the data being processed by the query, then the decision ought to
be made by the optimizer, which looked like it would dramatically increase the
complexity of this project.

So it would be great if the single implementation were "good enough" for the queries we
expect to run.

I'll try to put a benchmark together and we can see what the results say, and then we'll
have a better idea of how big a problem we have here.

> Add support for GROUP BY ROLLUP
> -------------------------------
>
>                 Key: DERBY-3002
>                 URL: https://issues.apache.org/jira/browse/DERBY-3002
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 10.4.1.3
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: fixWhiteSpace.diff, IncludesASimpleTest.diff, passesRegressionTests.diff,
>                      prototypeChangeNoTests.diff, rewriteGroupByRS.diff, rollupNullability.diff,
>                      useLookahead.diff
>
>
> Provide an implementation of the ROLLUP form of multi-dimensional grouping according
> to the SQL standard.
> See http://wiki.apache.org/db-derby/OLAPRollupLists for some more detailed information
> about this aspect of the SQL standard.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

