hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hui Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
Date Tue, 13 Oct 2015 10:35:05 GMT

     [ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Hui Zheng updated HIVE-11531:
-----------------------------
    Attachment: HIVE-11531.WIP.1.patch

Hi [~sershe]
To get some opinion from you I uploaded a patch which implemented the mysql-style LIMIT simply
but isn't completely finished.
Next I will implement it with CBO and research how to improve the Optimizers(GroupByOptimizer
and GlobalLimitOptimizer) .At last I will do more tests.


> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11531
>                 URL: https://issues.apache.org/jira/browse/HIVE-11531
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Hui Zheng
>         Attachments: HIVE-11531.WIP.1.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the form SELECT
... LIMIT X,Y where X,Y are coordinates inside the result to be paginated (which can be extremely
large by itself). At present, ROW_NUMBER can be used to achieve this effect, but optimizations
for LIMIT such as TopN in ReduceSink do not apply to ROW_NUMBER. We can add first class support
for "skip" to existing limit, or improve ROW_NUMBER for better performance



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message