db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bryan Pendleton (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-3002) Add support for GROUP BY ROLLUP
Date Fri, 29 May 2009 20:31:45 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714583#action_12714583
] 

Bryan Pendleton commented on DERBY-3002:
----------------------------------------

I agree with your analysis, and will try to work on such an implementation.

Note that, as far as I can tell, SELECT NULL, COUNT(*) FROM T is not actually legal in Derby.
However, the roughly similar queries SELECT CAST(NULL AS INT), COUNT(*) FROM T and
SELECT 'NULL', COUNT(*) FROM T are legal, and, as you say, return the result set {"NULL",
0}

I'm not sure where I got this "semantically equivalent to a union" terminology from when I
wrote
the wiki page. It's certainly true that the original Jim Gray "Data Cube" paper from 1996
uses the
technique of defining ROLLUP in terms of an equivalent UNION query, but I don't think that's
how the actual SQL Standard ended up defining ROLLUP. However, I find the SQL Standard hard
to read.

In the SQL Standard, ROLLUP is defined in terms of an equivalent GROUPING SETS query, and
GROUPING SETS includes the (new) syntax "GROUP BY ()". NOTE 137 of the SQL 2003 spec, in section
7.9, says:

  The result of the transform is to replace RL with a <grouping sets specification>
that contains
  a <grouping set> for every initial sublist of the <ordinary grouping set list>
of the <rollup list>,
  obtained by dropping <ordinary grouping set>s from the right, one by one, and concatenating
  each <ordinary grouping set list> so obtained. The <empty grouping set> is regarded
as the
  shortest such initial sublist.

So GROUP BY ROLLUP(a) is (internally) transformed into GROUP BY GROUPING SETS( (a), () )

Then, elsewhere in 7.9, in the "General Rules" section, it says:

  If there are no grouping columns, then the result of the <group by clause> is the
grouped table
  consisting of T as its only group.

So I *think* that the SQL Standard says that

  SELECT a, count(*) FROM T GROUP BY ROLLUP(a)

should be equivalent to:

  SELECT a, COUNT(*) FROM T GROUP BY (a)
    UNION
  SELECT CAST ( NULL AS INT), COUNT(*) FROM T GROUP BY ()

So, anyway, I think that the wiki page is somewhat inaccurate in this respect, and doesn't
truly
obey the SQL Standard, because as you say the wiki page says that the last subquery in the
union has no group by, but the SQL Standard says that the last subquery in the union has GROUP
BY ().

I still think that your analysis is correct, and will try to make the implementation comply.



> Add support for GROUP BY ROLLUP
> -------------------------------
>
>                 Key: DERBY-3002
>                 URL: https://issues.apache.org/jira/browse/DERBY-3002
>             Project: Derby
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 10.4.1.3
>            Reporter: Bryan Pendleton
>            Assignee: Bryan Pendleton
>            Priority: Minor
>         Attachments: fixWhiteSpace.diff, IncludesASimpleTest.diff, prototypeChangeNoTests.diff,
rollupNullability.diff, useLookahead.diff
>
>
> Provide an implementation of the ROLLUP form of multi-dimensional grouping according
to the SQL standard.
> See http://wiki.apache.org/db-derby/OLAPRollupLists for some more detailed information
about this aspect of the SQL standard.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message