hive-issues mailing list archives

From "Rui Li (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-6348) Order by/Sort by in subquery
Date Tue, 27 Jun 2017 08:10:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Li updated HIVE-6348:
-------------------------
    Attachment: HIVE-6348.4.patch

Patch v4 updates the affected tests.
# I modified {{input20_script.py}} so that it no longer expects ordered input and still gives the same results.
# I disabled the optimization in {{correlationoptimizer14.q}} and {{reduce_deduplicate_extended.q}}. Otherwise most of the test cases in these two qfiles become pointless.
# For {{concat_op.q}}, I removed the collect_list UDF from the output, because the column order in the list is not guaranteed.

Other golden files only have changes in the query plan, not in the results. I will add an RB for this.

> Order by/Sort by in subquery
> ----------------------------
>
>                 Key: HIVE-6348
>                 URL: https://issues.apache.org/jira/browse/HIVE-6348
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Gunther Hagleitner
>            Assignee: Rui Li
>            Priority: Minor
>              Labels: sub-query
>         Attachments: HIVE-6348.1.patch, HIVE-6348.2.patch, HIVE-6348.3.patch, HIVE-6348.4.patch
>
>
> select * from (select * from foo order by c asc) bar order by c desc;
> In Hive, this query sorts the data set twice. The optimizer should probably remove any order by/sort by in the sub query unless you use 'limit '. Could even go so far as barring it at the semantic level.
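
As a rough illustration of why the inner sort is redundant, here is a minimal sketch that runs the query from the description against SQLite through Python's sqlite3 module. SQLite is only a stand-in chosen so the example can be run anywhere; it is not Hive and shows nothing about Hive's optimizer or the attached patches. The table contents and the helper name run are made up for the example.

```python
import sqlite3


def run(query, rows):
    """Load `rows` into a one-column table foo(c) and return the query result as a list."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE foo (c INTEGER)")
        conn.executemany("INSERT INTO foo (c) VALUES (?)", [(r,) for r in rows])
        return [row[0] for row in conn.execute(query)]
    finally:
        conn.close()


# The query from the issue description: the subquery sorts ascending,
# then the outer query sorts the same rows descending.
NESTED = "SELECT * FROM (SELECT * FROM foo ORDER BY c ASC) bar ORDER BY c DESC"

# The same query with the inner ORDER BY removed.
FLAT = "SELECT * FROM (SELECT * FROM foo) bar ORDER BY c DESC"

# With a LIMIT, the inner ORDER BY decides which rows survive,
# so it cannot simply be dropped.
LIMITED = "SELECT * FROM (SELECT * FROM foo ORDER BY c ASC LIMIT 2) bar ORDER BY c DESC"

data = [3, 1, 4, 5, 9, 2, 6]  # hypothetical sample values

nested_result = run(NESTED, data)
flat_result = run(FLAT, data)
limited_result = run(LIMITED, data)

print(nested_result == flat_result)  # the outer ORDER BY alone fixes the final order
print(limited_result)                # the two smallest values, in descending order
```

Without a limit, the outer order by fully determines the result, so the inner one only adds a second sort. With a limit, the inner ordering selects the rows that are kept, which is why the description carves out that case.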



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)