impala-issues mailing list archives

From "Matthew Jacobs (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-5343) Sort by Column(s) added as part of inserting into Kudu table is incorrect
Date Mon, 22 May 2017 19:20:04 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Jacobs resolved IMPALA-5343.
------------------------------------
       Resolution: Not A Problem
    Fix Version/s: Impala 2.9.0

The plan and the sort are correct. The "KuduPartition" expr is in the sort because rows for
multiple partitions can end up at a given sink fragment, and we want the rows inserted into Kudu
to be grouped by partition and then ordered by PK within each partition.
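
As a rough illustration of the ordering described above (a sketch, not Impala's implementation): the snippet below sorts a few made-up rows first by a partition id and then by the primary key columns, and compares that with a PK-only sort. The kudu_partition() helper is a hypothetical stand-in that uses a simple modulo; Kudu's real hash partitioning is different, and NUM_PARTITIONS and the sample rows are assumptions chosen for the example.

```python
from itertools import groupby

NUM_PARTITIONS = 4  # assumption for the example; the table in this issue uses 140


def kudu_partition(l_orderkey: int) -> int:
    """Hypothetical stand-in for the KuduPartition() expr (real Kudu hashing differs)."""
    return l_orderkey % NUM_PARTITIONS


def sink_sort_key(row):
    """Partition id first, then the PK columns (l_shipdate, l_orderkey, l_linenumber)."""
    l_shipdate, l_orderkey, l_linenumber = row
    return (kudu_partition(l_orderkey), l_shipdate, l_orderkey, l_linenumber)


def partition_runs(rows):
    """Partition id of each consecutive run of rows, in the order the sink would see them."""
    return [p for p, _ in groupby(rows, key=lambda r: kudu_partition(r[1]))]


# Made-up rows: (l_shipdate, l_orderkey, l_linenumber)
rows = [
    ("1995-03-02", 7, 1),
    ("1994-01-15", 2, 1),
    ("1995-03-02", 3, 2),
    ("1994-01-15", 6, 1),
    ("1993-07-09", 3, 1),
]

by_partition_then_pk = sorted(rows, key=sink_sort_key)
by_pk_only = sorted(rows)

# Each partition appears as one contiguous run when the partition id leads the sort.
print(partition_runs(by_partition_then_pk))  # [2, 3]
# A PK-only sort interleaves partitions at the sink.
print(partition_runs(by_pk_only))            # [3, 2, 3]
```

With the partition id leading the sort key, all rows for one partition arrive together and in PK order; sorting by the PK alone lets rows for different partitions interleave.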

> Sort by Column(s) added as part of inserting into Kudu table is incorrect 
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-5343
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5343
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Mostafa Mokhtar
>            Assignee: Thomas Tauber-Marshall
>            Priority: Critical
>              Labels: kudu
>             Fix For: Impala 2.9.0
>
>
> The planner includes KuduPartition(PARTITION_COLUMN) among the columns in the sort's
> order by clause. The sort should match the columns of the primary key.
> Plan
> {code}
> Query: explain insert into lineitem_kudu_ts  select * from lineitem_kudu
> | INSERT INTO KUDU [scan_primitives_tpch_3tb.lineitem_kudu_ts] |
> | | |
> | 02:SORT |
> | |  order by: KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey) ASC NULLS LAST, l_shipdate ASC NULLS LAST, l_orderkey ASC NULLS LAST, l_linenumber ASC NULLS LAST |
> | | |
> | 01:EXCHANGE [KUDU(KuduPartition(scan_primitives_tpch_3tb.lineitem_kudu.l_orderkey))] |
> | | |
> | 00:SCAN KUDU [scan_primitives_tpch_3tb.lineitem_kudu] |
> {code}
> DDL 
> {code}
> [vd1302.halxg.cloudera.com:21000] > show create table scan_primitives_tpch_3tb.lineitem_kudu_ts;
> Query: show create table scan_primitives_tpch_3tb.lineitem_kudu_ts
>  CREATE TABLE scan_primitives_tpch_3tb.lineitem_kudu_ts (
>    l_shipdate STRING NOT NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_orderkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_linenumber BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_partkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_suppkey BIGINT NOT NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_quantity DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_extendedprice DOUBLE NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,
>    l_discount DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_tax DOUBLE NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_returnflag STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_linestatus STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_commitdate TIMESTAMP NULL ENCODING BIT_SHUFFLE COMPRESSION LZ4,
>    l_receiptdate STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_shipinstruct STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_shipmode STRING NULL ENCODING DICT_ENCODING COMPRESSION LZ4,
>    l_comment STRING NULL ENCODING PLAIN_ENCODING COMPRESSION LZ4,
>    PRIMARY KEY (l_shipdate, l_orderkey, l_linenumber)
>  )
>  PARTITION BY HASH (l_orderkey) PARTITIONS 140
>  STORED AS KUDU
>  TBLPROPERTIES ('kudu.master_addresses'='vd1301.halxg.cloudera.com:7051,vd1128.halxg.cloudera.com:7051')
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
