hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15268) limit+offset is broken (differently for ACID or not)
Date Wed, 23 Nov 2016 03:30:58 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688761#comment-15688761 ]

Sergey Shelukhin commented on HIVE-15268:
-----------------------------------------

[~ekoifman] [~jcamachorodriguez] fyi

> limit+offset is broken (differently for ACID or not)
> ----------------------------------------------------
>
>                 Key: HIVE-15268
>                 URL: https://issues.apache.org/jira/browse/HIVE-15268
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>
> I think some part of putting the limit on the map side implicitly assumes that CombineHiveInputFormat is in use; when splits are not combined, results are incorrect. In fact they are also incorrect for ORC, although differently, even though it seems like it should combine splits. I didn't fully investigate.
> IIRC results are correct with text.
> {noformat}
> set hive.fetch.task.conversion=none;
> set hive.support.concurrency=true;
> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
> set hive.exec.dynamic.partition.mode=nonstrict;
> CREATE TABLE limitoffset_text (key STRING, value STRING) PARTITIONED BY (ds STRING, hr STRING);
> CREATE TABLE limitoffset (key STRING, value STRING) PARTITIONED BY (ds STRING, hr STRING) STORED AS orc;
> create table acid_dynamic(key STRING, value STRING) PARTITIONED BY (ds STRING, hr STRING) clustered by (key) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');
> insert INTO TABLE limitoffset PARTITION (ds, hr) select * from srcpart;
> insert INTO TABLE limitoffset_text PARTITION (ds, hr) select * from srcpart;
> insert INTO TABLE acid_dynamic PARTITION (ds, hr) select * from srcpart;
> select count(key) from limitoffset_text;
> select count(key) from limitoffset;
> select count(key) from acid_dynamic;
> SELECT limitoffset_text.key FROM limitoffset_text LIMIT 490,200;
> SELECT acid_dynamic.key FROM acid_dynamic LIMIT 490,200;
> SELECT limitoffset.key FROM limitoffset LIMIT 490,200;
> {noformat}
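
The suspected failure mode in the description (a map-side limit that is only correct when all input arrives as one combined split) can be illustrated with a toy model. This is a sketch in Python, not Hive's code: the function names, the row values, and the assumption of four 500-row partitions are all hypothetical, and the issue itself notes the cause was not fully investigated. It only shows why applying OFFSET and LIMIT independently per split would return the wrong number of rows.

```python
from itertools import islice


def global_limit(splits, offset, limit):
    """Reference semantics: skip `offset` rows of the whole scan, then take `limit`."""
    rows = (row for split in splits for row in split)
    return list(islice(rows, offset, offset + limit))


def per_split_limit(splits, offset, limit):
    """Hypothetical buggy behavior: every split applies offset and limit on its own."""
    out = []
    for split in splits:
        out.extend(islice(split, offset, offset + limit))
    return out


# Assumption: 2000 rows spread over four partitions of 500 rows each.
rows = list(range(2000))
combined = [rows]  # one combined split
uncombined = [rows[i:i + 500] for i in range(0, 2000, 500)]  # one split per partition

expected = global_limit(uncombined, 490, 200)
same_when_combined = per_split_limit(combined, 490, 200)
wrong_when_split = per_split_limit(uncombined, 490, 200)

print(len(expected), len(same_when_combined), len(wrong_when_split))
```

In this model the per-split version agrees with the reference only when there is a single split; with four splits each one skips 490 of its own 500 rows and contributes just 10. Row order without ORDER BY is not guaranteed in a real query, so only the row counts are meaningful here.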



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)