db-derby-dev mailing list archives

From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3002) Add support for GROUP BY ROLLUP
Date Wed, 30 Sep 2009 04:44:32 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760830#action_12760830 ]

Bryan Pendleton commented on DERBY-3002:

On 1-sep-2009, Dag made two observations (paraphrased, and shortened):

1) When a CAST node is used in a certain query, the wrong results are returned
2) When a sub-query is used, the nullability handling of rolled-up columns is confused

I believe that I have figured out the first problem: in my patch, there are certain
situations where, during the processing of a row of data, we complete the
processing of multiple rolled-up groups. For example, if you are rolling up
sales data by DIVISION, QUARTER, and MONTH, and you get a row for a new DIVISION,
the accumulated data for the previous MONTH, QUARTER, and DIVISION are now
complete. Internally, I am holding these completed-and-not-yet-returned rows in
the "finishedResults" array, and returning them one-row-at-a-time when requested.
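To illustrate the buffering described above, here is a minimal sketch. It is not the code from the patch: the class RollupSketch, its String-based row format, and the use of a simple SUM are assumptions made for the example. Only the name finishedResults comes from the description above. Input rows are assumed to arrive sorted by the grouping columns, with the amount as the last field.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration; not the code from the DERBY-3002 patch.
final class RollupSketch {
    private final Iterator<String[]> input;   // rows sorted by the grouping columns
    private final int levels;                 // number of grouping columns
    private final long[] sums;                // sums[k]: total for the group on the first k columns
    private final Deque<String> finishedResults = new ArrayDeque<>();
    private String[] currentKey;              // grouping values of the most recent input row
    private boolean flushed;                  // true once the end-of-input groups were emitted

    RollupSketch(List<String[]> sortedRows, int levels) {
        this.input = sortedRows.iterator();
        this.levels = levels;
        this.sums = new long[levels + 1];
    }

    /** Returns the next result row, or null when there are no more rows. */
    String next() {
        // Read input only when no completed group is waiting to be returned.
        while (finishedResults.isEmpty() && input.hasNext()) {
            String[] row = input.next();
            if (currentKey != null) {
                int firstChanged = 0;
                while (firstChanged < levels
                        && row[firstChanged].equals(currentKey[firstChanged])) {
                    firstChanged++;
                }
                // A change in column firstChanged completes every group
                // whose key includes that column.
                closeGroups(firstChanged);
            }
            currentKey = Arrays.copyOf(row, levels);
            long amount = Long.parseLong(row[levels]);
            for (int k = 0; k <= levels; k++) {
                sums[k] += amount;
            }
        }
        // At end of input, every open group is complete, including the grand total.
        if (finishedResults.isEmpty() && !flushed && currentKey != null) {
            closeGroups(-1);
            flushed = true;
        }
        return finishedResults.poll();
    }

    // Queues one result row per completed group, most detailed group first.
    private void closeGroups(int firstChanged) {
        for (int k = levels; k > firstChanged; k--) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < levels; i++) {
                // Rolled-up columns are shown with a NULL placeholder.
                sb.append(i < k ? currentKey[i] : "NULL").append('|');
            }
            finishedResults.add(sb.append(sums[k]).toString());
            sums[k] = 0;
        }
    }
}
```

With rows for (EAST, Q1, JAN), (EAST, Q1, FEB), and (WEST, Q1, JAN), reading the WEST row completes three groups at once (the FEB month, the Q1 quarter, and the EAST division), and they are handed back over three successive calls to next().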

However, I was not correctly calling setCurrentRow() when returning a saved row from
the finishedResults array, which meant that operations at the next level, such as
projection processing, were accessing the wrong data when they referred to the current row.

I've modified the patch to call setCurrentRow appropriately, and the wrong results are
no longer returned.
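The failure mode can be shown with a small stand-alone sketch. Everything here is a simplified stand-in rather than Derby's actual classes: RowHolder, BufferingSource, and projectFirstColumn are hypothetical names, and the publishSavedRows flag exists only to switch between the behavior before and after the fix. The key assumption is that the parent operator reads the shared "current row" instead of the row object returned to it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration; these are not Derby's result set classes.
final class CurrentRowSketch {
    // Stand-in for shared per-statement state that parent operators read from.
    static final class RowHolder {
        String[] currentRow;
    }

    // Stand-in for a result set that returns previously saved rows.
    static final class BufferingSource {
        private final RowHolder holder;
        private final Deque<String[]> finishedResults = new ArrayDeque<>();
        private final boolean publishSavedRows; // false reproduces the bug

        BufferingSource(RowHolder holder, boolean publishSavedRows) {
            this.holder = holder;
            this.publishSavedRows = publishSavedRows;
        }

        void save(String[] row) {
            finishedResults.add(row);
        }

        void setCurrentRow(String[] row) {
            holder.currentRow = row;
        }

        String[] getNextRow() {
            String[] row = finishedResults.poll();
            if (publishSavedRows) {
                // The fix: publish the saved row before handing it back.
                setCurrentRow(row);
            }
            return row;
        }
    }

    // A projection that reads the shared current row, not the returned row.
    static String projectFirstColumn(RowHolder holder) {
        return holder.currentRow[0];
    }
}
```

When publishSavedRows is false, the projection keeps seeing whatever row was published last, which matches the wrong-results symptom described above.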

Dag's other question remains open to me, and I'm not sure what's going on. It seems
to me that the ROLLUP feature can cause NULL values to be returned in non-NULL-able
columns, which is troubling behavior. I believe this behavior is mentioned in Jim Gray's
original DataCube research paper, where he proposed using a placeholder value other
than NULL for the rolled-up columns.

I will need to spend some time with the SQL Standard to understand how the standard
deals with this ability of the ROLLUP feature to produce NULL values where they might
otherwise not be expected.
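As a small illustration of the concern only (it says nothing about how the standard or the patch resolves it), the following sketch checks a result row against the nullability declared for the source columns. The class and method names are hypothetical, and rows are modeled as plain String arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Derby.
final class RollupNullabilitySketch {
    /** Returns the indexes of columns declared NOT NULL that hold null in the row. */
    static List<Integer> nullsInNotNullColumns(boolean[] declaredNotNull, String[] row) {
        List<Integer> violations = new ArrayList<>();
        for (int i = 0; i < row.length; i++) {
            if (declaredNotNull[i] && row[i] == null) {
                violations.add(i);
            }
        }
        return violations;
    }
}
```

If DIVISION, QUARTER, and MONTH are all declared NOT NULL, a detail row passes this check, while a quarter-level rolled-up row, which carries a null in the MONTH position, does not.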

I'll try to post an updated patch with the setCurrentRow fix soon.

Meanwhile, Knut, I haven't forgotten about your feedback regarding the fact that this
patch implements only a subset of the ROLLUP feature as described in the standard.
I was definitely aware that I was implementing only part of the feature, but I agree with
you that we need to have a good awareness of what parts of the overall standard we did
and did not implement, and make sure that is clear in both the code and the documentation.

Thanks again to everybody for their continued feedback and support. 

> Add support for GROUP BY ROLLUP
> -------------------------------
>                 Key: DERBY-3002
>                 URL: https://issues.apache.org/jira/browse/DERBY-3002
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: fixWhiteSpace.diff, IncludesASimpleTest.diff, passesRegressionTests.diff,
> prototypeChangeNoTests.diff, rewriteGroupByRS.diff, rollupNullability.diff, UpdateToTrunkSep2009.diff,
> useAggregateObserver.diff, useLookahead.diff, withDagsCastTest.diff
>
> Provide an implementation of the ROLLUP form of multi-dimensional grouping according
> to the SQL standard.
> See http://wiki.apache.org/db-derby/OLAPRollupLists for some more detailed information
> about this aspect of the SQL standard.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
