hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21934) Materialized view on top of Druid not pushing everything
Date Fri, 12 Jul 2019 01:46:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16883457#comment-16883457 ]

Hive QA commented on HIVE-21934:
--------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m  5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 39s{color} | {color:green} master passed {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  4m  3s{color} | {color:blue} ql in master has 2255 extant Findbugs warnings. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 28s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 40s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  0s{color} | {color:red} The patch has 95 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 14s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 25m 23s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Optional Tests |  asflicense  javac  javadoc  findbugs  checkstyle  compile  |
| uname | Linux hiveptest-server-upstream 3.16.0-4-amd64 #1 SMP Debian 3.16.43-2+deb8u5 (2017-09-19) x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /data/hiveptest/working/yetus_PreCommit-HIVE-Build-17991/dev-support/hive-personality.sh |
| git revision | master / 7cfe729 |
| Default Java | 1.8.0_111 |
| findbugs | v3.0.0 |
| whitespace | http://104.198.109.242/logs//PreCommit-HIVE-Build-17991/yetus/whitespace-eol.txt |
| modules | C: ql itests U: . |
| Console output | http://104.198.109.242/logs//PreCommit-HIVE-Build-17991/yetus.txt |
| Powered by | Apache Yetus    http://yetus.apache.org |


This message was automatically generated.
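
The `whitespace` row above reports added lines that end in whitespace. As a rough sketch of what such a check looks for (this is not the Yetus implementation; the function name `trailing_whitespace_lines` and the sample patch text are hypothetical and exist only for this illustration), the following scans a unified diff for added lines ending in a space or tab:

```python
def trailing_whitespace_lines(patch_text):
    """Return (line_number, line) pairs for added lines that end in whitespace.

    Simplification: any line starting with '+++' is treated as a file header,
    so an added line whose own content begins with '++' would be skipped.
    """
    hits = []
    for number, line in enumerate(patch_text.splitlines(), start=1):
        # Only added lines ('+' prefix) are of interest; skip '+++' headers.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        # A line ends in whitespace if stripping spaces/tabs changes it.
        if line != line.rstrip(" \t"):
            hits.append((number, line))
    return hits


# Hypothetical patch text, for illustration only.
SAMPLE_PATCH = (
    "--- a/Foo.java\n"
    "+++ b/Foo.java\n"
    "@@ -1,2 +1,2 @@\n"
    "-int x = 1;\n"
    "+int x = 2;  \n"
    "+int y = 3;\n"
)

if __name__ == "__main__":
    for number, line in trailing_whitespace_lines(SAMPLE_PATCH):
        print(f"patch line {number} ends in whitespace: {line!r}")
```

To actually fix the patch, the report itself points to `git apply --whitespace=fix`.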



> Materialized view on top of Druid not pushing everything
> --------------------------------------------------------
>
>                 Key: HIVE-21934
>                 URL: https://issues.apache.org/jira/browse/HIVE-21934
>             Project: Hive
>          Issue Type: Improvement
>          Components: Druid integration, Materialized views
>            Reporter: slim bouguerra
>            Assignee: Jesus Camacho Rodriguez
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21934.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The title is not very informative, but the examples hopefully are.
> This is the plan with the materialized view:
> {code}
> explain SELECT MONTH(`dates_n1`.`__time`) AS `mn___time_ok`,
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`dates_n1`.`__time`) AS `yr___time_ok`
> FROM `mv_ssb_100_scale`.`lineorder_n0` `lineorder_n0`
> JOIN `mv_ssb_100_scale`.`dates_n1` `dates_n1` ON (`lineorder_n0`.`lo_orderdate` = `dates_n1`.`d_datekey`)
> JOIN `mv_ssb_100_scale`.`customer_n1` `customer_n1` ON (`lineorder_n0`.`lo_custkey` = `customer_n1`.`c_custkey`)
> JOIN `mv_ssb_100_scale`.`supplier_n0` `supplier_n0` ON (`lineorder_n0`.`lo_suppkey` = `supplier_n0`.`s_suppkey`)
> JOIN `mv_ssb_100_scale`.`ssb_part_n0` `ssb_part_n0` ON (`lineorder_n0`.`lo_partkey` = `ssb_part_n0`.`p_partkey`)
> GROUP BY MONTH(`dates_n1`.`__time`),
> CAST((MONTH(`dates_n1`.`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`dates_n1`.`__time`)
> INFO : Starting task [Stage-3:EXPLAIN] in serial mode
> INFO : Completed executing command(queryId=sbouguerra_20190627113101_1493ee87-0288-4e30-b53c-0ee729ce3977); Time taken: 0.005 seconds
> INFO : OK
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | Plan optimized by CBO. |
> | |
> | Vertex dependency in root stage |
> | Reducer 2 <- Map 1 (SIMPLE_EDGE) |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Stage-1 |
> | Reducer 2 vectorized, llap |
> | File Output Operator [FS_13] |
> | Select Operator [SEL_12] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"] |
> | Group By Operator [GBY_11] (rows=300018951 width=38) |
> | Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(VALUE._col0)"],keys:KEY._col0, KEY._col1, KEY._col2 |
> | <-Map 1 [SIMPLE_EDGE] vectorized, llap |
> | SHUFFLE [RS_10] |
> | PartitionCols:_col0, _col1, _col2 |
> | Group By Operator [GBY_9] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2","_col3"],aggregations:["sum(1)"],keys:_col0, _col1, _col2 |
> | Select Operator [SEL_8] (rows=600037902 width=38) |
> | Output:["_col0","_col1","_col2"] |
> | TableScan [TS_0] (rows=600037902 width=38) |
> | mv_ssb_100_scale@ssb_mv_druid_100,ssb_mv_druid_100,Tbl:COMPLETE,Col:NONE,Output:["vc"],properties:\{"druid.fieldNames":"vc","druid.fieldTypes":"timestamp","druid.query.json":"{\"queryType\":\"scan\",\"dataSource\":\"mv_ssb_100_scale.ssb_mv_druid_100\",\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"],\"virtualColumns\":[{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"\\\"__time\\\"\",\"outputType\":\"LONG\"}],\"columns\":[\"vc\"],\"resultFormat\":\"compactedList\"}","druid.query.type":"scan"} |
> | |
> +----------------------------------------------------+
>  
> {code}
> If I use a simple Druid table without the materialized view:
> {code}
> explain SELECT MONTH(`__time`) AS `mn___time_ok`,
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT) AS `qr___time_ok`,
> SUM(1) AS `sum_number_of_records_ok`,
> YEAR(`__time`) AS `yr___time_ok`
> FROM `druid_ssb.ssb_druid_100`
> GROUP BY MONTH(`__time`),
> CAST((MONTH(`__time`) - 1) / 3 + 1 AS BIGINT),
> YEAR(`__time`);
> {code}
> {code}
> +----------------------------------------------------+
> | Explain |
> +----------------------------------------------------+
> | Plan optimized by CBO. |
> | |
> | Stage-0 |
> | Fetch Operator |
> | limit:-1 |
> | Select Operator [SEL_1] |
> | Output:["_col0","_col1","_col2","_col3"] |
> | TableScan [TS_0] |
> | Output:["extract_month","vc","$f3","extract_year"],properties:\{"druid.fieldNames":"extract_month,vc,extract_year,$f3","druid.fieldTypes":"int,bigint,int,bigint","druid.query.json":"{\"queryType\":\"groupBy\",\"dataSource\":\"druid_ssb.ssb_druid_100\",\"granularity\":\"all\",\"dimensions\":[{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_month\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"M\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}},\{\"type\":\"default\",\"dimension\":\"vc\",\"outputName\":\"vc\",\"outputType\":\"LONG\"},\{\"type\":\"extraction\",\"dimension\":\"__time\",\"outputName\":\"extract_year\",\"extractionFn\":{\"type\":\"timeFormat\",\"format\":\"yyyy\",\"timeZone\":\"America/New_York\",\"locale\":\"en-US\"}}],\"virtualColumns\":[\{\"type\":\"expression\",\"name\":\"vc\",\"expression\":\"CAST(((CAST((timestamp_extract(\\\"__time\\\",'MONTH','America/New_York') - 1), 'DOUBLE') / CAST(3, 'DOUBLE')) + CAST(1, 'DOUBLE')), 'LONG')\",\"outputType\":\"LONG\"}],\"limitSpec\":\{\"type\":\"default\"},\"aggregations\":[\{\"type\":\"longSum\",\"name\":\"$f3\",\"expression\":\"1\"}],\"intervals\":[\"1900-01-01T00:00:00.000Z/3000-01-01T00:00:00.000Z\"]}","druid.query.type":"groupBy"} |
> | |
> +----------------------------------------------------+
> {code}
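
The two plans quoted above differ in the `druid.query.type` that Hive generates: `scan` for the materialized view (with both Group By operators left in Hive) versus `groupBy` for the plain Druid table. As a rough illustration only, the sketch below classifies a `druid.query.json` payload by that field. The helper name `pushdown_summary` is hypothetical, and the two payloads are abridged from the plans above; neither is part of Hive or Druid.

```python
import json


def pushdown_summary(druid_query_json):
    """Summarize what a Druid query JSON (as shown in EXPLAIN) pushes to Druid."""
    query = json.loads(druid_query_json)
    return {
        "query_type": query.get("queryType"),
        # A non-empty "aggregations" list means the aggregation runs in Druid.
        "aggregations_pushed": bool(query.get("aggregations")),
        # A non-empty "dimensions" list means the grouping keys run in Druid.
        "dimensions_pushed": bool(query.get("dimensions")),
    }


# Abridged from the plan that reads the materialized view.
MV_QUERY = json.dumps({
    "queryType": "scan",
    "dataSource": "mv_ssb_100_scale.ssb_mv_druid_100",
    "columns": ["vc"],
    "resultFormat": "compactedList",
})

# Abridged from the plan that reads the plain Druid table.
DIRECT_QUERY = json.dumps({
    "queryType": "groupBy",
    "dataSource": "druid_ssb.ssb_druid_100",
    "granularity": "all",
    "dimensions": [
        {"type": "default", "dimension": "vc", "outputName": "vc", "outputType": "LONG"},
    ],
    "aggregations": [{"type": "longSum", "name": "$f3", "expression": "1"}],
})

if __name__ == "__main__":
    print("materialized view:", pushdown_summary(MV_QUERY))
    print("plain Druid table:", pushdown_summary(DIRECT_QUERY))
```

In the first case only the `__time` column is fetched from Druid and Hive does the grouping over roughly 600 million rows; in the second the whole aggregation is evaluated by Druid.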



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
