hive-issues mailing list archives

From "Eugene Koifman (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-14707) ACID: Insert shuffle sort-merges on blank KEY
Date Tue, 10 Jan 2017 02:07:58 GMT

     [ https://issues.apache.org/jira/browse/HIVE-14707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-14707:
----------------------------------
    Attachment: HIVE-14707.18.patch

> ACID: Insert shuffle sort-merges on blank KEY
> ---------------------------------------------
>
>                 Key: HIVE-14707
>                 URL: https://issues.apache.org/jira/browse/HIVE-14707
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>            Assignee: Eugene Koifman
>         Attachments: HIVE-14707.01.patch, HIVE-14707.02.patch, HIVE-14707.03.patch, HIVE-14707.04.patch, HIVE-14707.05.patch, HIVE-14707.06.patch, HIVE-14707.08.patch, HIVE-14707.09.patch, HIVE-14707.10.patch, HIVE-14707.11.patch, HIVE-14707.13.patch, HIVE-14707.14.patch, HIVE-14707.16.patch, HIVE-14707.17.patch, HIVE-14707.18.patch
>
>
> The ACID insert codepath uses a sorted shuffle, even though the key used for the shuffle is always 0 bytes long.
> {code}
> hive (sales_acid)> explain insert into sales values(1, 2, '3400-0000-0000-009', 1, null);
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
>       DagId: gopal_20160906172626_80261c4c-79cc-4e02-87fe-3133be404e55:2
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: values__tmp__table__2
>                   Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: tmp_values_col1 (type: string), tmp_values_col2 (type: string), tmp_values_col3 (type: string), tmp_values_col4 (type: string), tmp_values_col5 (type: string)
>                     outputColumnNames: _col0, _col1, _col2, _col3, _col4
>                     Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE Column stats: NONE
>                     Reduce Output Operator
>                       sort order: 
>                       Map-reduce partition columns: UDFToLong(_col1) (type: bigint)
>                       Statistics: Num rows: 1 Data size: 28 Basic stats: COMPLETE Column stats: NONE
>                       value expressions: _col0 (type: string), _col1 (type: string), _col2 (type: string), _col3 (type: string), _col4 (type: string)
>             Execution mode: vectorized, llap
>             LLAP IO: no inputs
> {code}
> Note the missing "+" / "-" in the Sort Order fields.
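
For anyone who wants to check their own plans for the same symptom, here is a minimal sketch (not part of the patch) that scans EXPLAIN text for "sort order:" lines with no "+" / "-" flags. The function name `blank_sort_orders` and the second operator in the sample are hypothetical and only there for illustration; the first operator mirrors the plan above.

```python
import re


def blank_sort_orders(plan_text):
    """Return 1-based line numbers of 'sort order:' entries with no +/- flags."""
    hits = []
    for lineno, raw in enumerate(plan_text.splitlines(), start=1):
        # Drop an optional email quote marker ("> ") and surrounding whitespace.
        line = re.sub(r"^>\s?", "", raw).strip()
        if line.lower().startswith("sort order:"):
            flags = line.split(":", 1)[1].strip()
            if not flags:
                hits.append(lineno)
    return hits


# First operator mirrors the plan in this issue; the second one is a
# made-up counterexample with a non-empty sort key.
sample = (
    "Reduce Output Operator\n"
    "  sort order: \n"
    "  Map-reduce partition columns: UDFToLong(_col1) (type: bigint)\n"
    "Reduce Output Operator\n"
    "  key expressions: _col0 (type: string)\n"
    "  sort order: +\n"
)

print(blank_sort_orders(sample))
```

This only inspects the text of the plan. It makes no claim about how Hive decides whether to sort, and a blank sort order may be legitimate in other plans.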



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
