impala-issues mailing list archives

From "Lars Volker (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-4163) Introduce SORTBY plan hint for insert statements
Date Fri, 26 May 2017 16:25:04 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-4163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Volker resolved IMPALA-4163.
---------------------------------
    Resolution: Won't Fix

> Introduce SORTBY plan hint for insert statements
> ------------------------------------------------
>
>                 Key: IMPALA-4163
>                 URL: https://issues.apache.org/jira/browse/IMPALA-4163
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Frontend
>    Affects Versions: Impala 2.2, Impala 2.3.0, Impala 2.5.0, Impala 2.4.0, Impala 2.6.0, Impala 2.7.0
>            Reporter: Alexander Behm
>            Assignee: Lars Volker
>              Labels: performance, ramp-up
>
> To improve compression and/or the effectiveness of min/max pruning, it is desirable to control the order in which rows are inserted into a table (mostly for Parquet).
> To that end, we should introduce a "sortby" plan hint for insert statements. Example:
> {code}
> CREATE TABLE dst (...);
> INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src;
> {code}
> This would produce the following plan:
> SCAN -> SORT(day,hour) -> TABLE SINK
> h4. Syntax and behavior
> {code} INSERT INTO dst /*+ sortby(day,hour) */ SELECT * FROM src; {code}
> - We will not support the legacy hint style with brackets: {code}[sortby(day,hour)]{code}
> - To keep the "clustered" hint strictly separate from the "sortby" hint, it is only legal to use non-partition columns in "sortby" for HDFS tables.
> - Similarly, for Kudu tables it is only legal to use non-primary-key columns in "sortby".
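
The motivation quoted above (sorting rows before they are written makes min/max pruning more effective) can be illustrated with a small simulation. This is a minimal sketch in Python, not Impala or Parquet code: the row-group size of 100, the synthetic "hour" column, and the helper names row_group_stats and groups_to_scan are all assumptions made for illustration. It splits a column into fixed-size row groups, records each group's min and max, and counts how many groups a reader would still have to scan for an equality predicate.

```python
import random


def row_group_stats(values, group_size):
    """Split values into consecutive row groups; return (min, max) per group."""
    stats = []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        stats.append((min(group), max(group)))
    return stats


def groups_to_scan(stats, target):
    """Count row groups whose [min, max] range could contain target."""
    return sum(1 for lo, hi in stats if lo <= target <= hi)


# Hypothetical data: 24 distinct hour values, 100 rows each, in shuffled order.
rng = random.Random(0)
hours = [h for h in range(24) for _ in range(100)]
rng.shuffle(hours)

# Same rows, written once in arrival order and once sorted by hour.
unsorted_stats = row_group_stats(hours, group_size=100)
sorted_stats = row_group_stats(sorted(hours), group_size=100)

# A predicate like "hour = 7" can skip any group whose range excludes 7.
unsorted_hits = groups_to_scan(unsorted_stats, target=7)
sorted_hits = groups_to_scan(sorted_stats, target=7)
print("row groups scanned without sort:", unsorted_hits)
print("row groups scanned with sort:", sorted_hits)
```

With sorted input every row group holds a single hour value, so only one group overlaps the predicate. With shuffled input most groups span nearly the full range of hours and few of them can be skipped.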



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
