hive-issues mailing list archives

From "Pengcheng Xiong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15637) Hive/Druid integration: wrong semantics of groupBy query limit with granularity
Date Sat, 25 Mar 2017 20:14:41 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941909#comment-15941909 ]

Pengcheng Xiong commented on HIVE-15637:
----------------------------------------

I am deferring this to Hive 3.0 because we are about to cut the first RC and this issue is not
marked as a blocker. Please feel free to commit to the branch if this can be resolved before the release.

> Hive/Druid integration: wrong semantics of groupBy query limit with granularity
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-15637
>                 URL: https://issues.apache.org/jira/browse/HIVE-15637
>             Project: Hive
>          Issue Type: Bug
>          Components: Druid integration
>    Affects Versions: 2.2.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Critical
>
> Similar to HIVE-15636, but for GroupBy queries. Limit is applied per granularity unit,
> not globally for the query.
> {code:sql}
> SELECT i_brand_id, floor_day(`__time`), max(ss_quantity), sum(ss_wholesale_cost) as s
> FROM store_sales_sold_time_subset
> GROUP BY i_brand_id, floor_day(`__time`)
> ORDER BY s
> LIMIT 10;
> OK
> Plan optimized by CBO.
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Map 1 vectorized
>       File Output Operator [FS_4]
>         Select Operator [SEL_3] (rows=15888 width=0)
>           Output:["_col0","_col1","_col2","_col3"]
>           TableScan [TS_0] (rows=15888 width=0)
>             tpcds_druid_10@store_sales_sold_time_subset,store_sales_sold_time_subset,Tbl:PARTIAL,Col:NONE,Output:["i_brand_id","__time","$f2","$f3"],properties:{"druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_tpcds_ss_sold_time_subset\",\"granularity\":\"DAY\",\"dimensions\":[\"i_brand_id\"],\"limitSpec\":{\"type\":\"default\",\"limit\":10,\"columns\":[{\"dimension\":\"$f3\",\"direction\":\"ascending\"}]},\"aggregations\":[{\"type\":\"longMax\",\"name\":\"$f2\",\"fieldName\":\"ss_quantity\"},{\"type\":\"doubleSum\",\"name\":\"$f3\",\"fieldName\":\"ss_wholesale_cost\"}],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"groupBy"}
> {code}
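
To make the difference in semantics concrete, here is a minimal sketch in plain Python. It does not call Hive or Druid; the rows, the helper names global_limit and per_bucket_limit, and the limit value are all hypothetical. It only contrasts the behaviour the SQL query asks for (one LIMIT over the whole ordered result) with the behaviour the issue describes for the generated groupBy query (the limit applied separately inside each DAY bucket).

```python
from collections import defaultdict

# Hypothetical pre-aggregated rows: (i_brand_id, day, s), where s stands in
# for sum(ss_wholesale_cost) per (brand, day) group.
rows = [
    (1, "2017-01-01", 5.0),
    (2, "2017-01-01", 3.0),
    (3, "2017-01-01", 9.0),
    (1, "2017-01-02", 1.0),
    (2, "2017-01-02", 7.0),
    (3, "2017-01-02", 2.0),
]


def global_limit(rows, n):
    """ORDER BY s LIMIT n over the whole result, as the SQL query intends."""
    return sorted(rows, key=lambda r: r[2])[:n]


def per_bucket_limit(rows, n):
    """Limit applied inside each day bucket, as the issue reports."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[1]].append(row)
    out = []
    for day in sorted(buckets):
        # Each bucket is ordered and truncated independently of the others.
        out.extend(sorted(buckets[day], key=lambda r: r[2])[:n])
    return out


if __name__ == "__main__":
    print(global_limit(rows, 2))
    print(per_bucket_limit(rows, 2))
```

With a limit of 2, the global version returns two rows in total, while the per-bucket version returns up to two rows for every day, so the result can contain more rows than the LIMIT and is not globally ordered by s.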



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
