hive-issues mailing list archives

From "Julian Hyde (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12923) CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure
Date Wed, 22 Feb 2017 20:11:44 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15879135#comment-15879135 ]

Julian Hyde commented on HIVE-12923:
------------------------------------

I'm thinking of an alternative solution to CALCITE-1069. Currently, as you know, an Aggregate
with more than one grouping set returns more columns than one with only one grouping set.
We have been arguing about whether there should be 1 extra column (Hive's preference) or N
extra columns (Calcite's preference).

My new proposal is that there should be no extra columns. We make GROUPING into an aggregate
function, and if you want those extra columns you can add calls to GROUPING.

If the row type of Aggregate is the same regardless of the number of grouping sets, it will
simplify a bunch of things. For example, it would be easier to write a rule that pushes down
the Filter "group_id = 2", because we wouldn't have to worry about columns disappearing or
about whether they are used.
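
To make the idea concrete, here is a toy sketch in Python. It is not Calcite or Hive code; the names aggregate, cube, count_star and grouping are made up for illustration. It models an Aggregate whose output is always the group columns plus one column per aggregate call, so the row width does not depend on how many grouping sets there are, and the indicator columns appear only when the caller adds GROUPING calls. It assumes the usual SQL convention that GROUPING returns 0 for a column that is grouped and 1 for a column that is rolled up.

```python
from itertools import combinations


def cube(cols):
    """All grouping sets of a CUBE over cols, from the full set down to ()."""
    return [s for n in range(len(cols), -1, -1) for s in combinations(cols, n)]


def aggregate(rows, group_cols, grouping_sets, agg_calls):
    """Toy Aggregate: each output row is group_cols followed by one value per
    aggregate call, whatever the number of grouping sets."""
    out = []
    for gset in grouping_sets:
        buckets = {}
        for r in rows:
            # Columns not in this grouping set are rolled up and shown as None.
            key = tuple(r[c] if c in gset else None for c in group_cols)
            buckets.setdefault(key, []).append(r)
        for key, members in buckets.items():
            out.append(key + tuple(call(members, gset) for call in agg_calls))
    return out


def count_star(members, gset):
    """COUNT(*) over the rows of one group."""
    return len(members)


def grouping(col):
    """GROUPING(col) as an ordinary aggregate call: 0 if grouped, 1 if rolled up."""
    return lambda members, gset: 0 if col in gset else 1


rows = [{"a": 1, "b": 1}, {"a": 1, "b": 2}, {"a": 2, "b": 1}]
cols = ("a", "b")

# One grouping set and a full cube: both produce rows of width 3.
single = aggregate(rows, cols, [cols], [count_star])
cubed = aggregate(rows, cols, cube(cols), [count_star])

# Indicator columns exist only because the caller asked for them.
flagged = aggregate(rows, cols, cube(cols),
                    [count_star, grouping("a"), grouping("b")])
```

With this shape, a rule that rewrites the Aggregate never has to check whether implicit indicator columns exist or are referenced. They are ordinary aggregate calls and can be handled like COUNT or SUM.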

[~hsubramaniyan], [~jcamachorodriguez], would the new proposal be acceptable to Hive?

> CBO: Calcite Operator To Hive Operator (Calcite Return Path): TestCliDriver groupby_grouping_sets4.q failure
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-12923
>                 URL: https://issues.apache.org/jira/browse/HIVE-12923
>             Project: Hive
>          Issue Type: Sub-task
>          Components: CBO
>            Reporter: Hari Sankar Sivarama Subramaniyan
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-12923.1.patch, HIVE-12923.2.patch
>
>
> {code}
> EXPLAIN
> SELECT * FROM
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq1
> join
> (SELECT a, b, count(*) from T1 where a < 3 group by a, b with cube) subq2
> on subq1.a = subq2.a
> {code}
> Stack trace:
> {code}
> java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.pruneJoinOperator(ColumnPrunerProcFactory.java:1110)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory.access$400(ColumnPrunerProcFactory.java:85)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPrunerProcFactory$ColumnPrunerJoinProc.process(ColumnPrunerProcFactory.java:941)
>         at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPruner$ColumnPrunerWalker.walk(ColumnPruner.java:172)
>         at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:120)
>         at org.apache.hadoop.hive.ql.optimizer.ColumnPruner.transform(ColumnPruner.java:135)
>         at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:237)
>         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10176)
>         at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:229)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>         at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:74)
>         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:239)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:472)
>         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:312)
>         at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1168)
>         at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1256)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1094)
>         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1082)
>         at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:233)
>         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:184)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:400)
>         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1129)
>         at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1103)
>         at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:10444)
>         at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_groupby_grouping_sets4(TestCliDriver.java:3313)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
