hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
Date Wed, 14 Oct 2015 19:08:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14957534#comment-14957534 ]

Sergey Shelukhin commented on HIVE-11531:
-----------------------------------------

Skimmed the patch.
{noformat}
+  private final HashMap<String, Integer> destToLimitOffset;
{noformat}
This could reuse the existing hashmap and store a pair, or put both ints into a long.
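
For illustration only, a minimal sketch of the "both ints into a long" option, assuming the existing map is keyed by destination name and its value type can be widened to Long. The class name, the map name destToLimit, and the "insclause-0" key are hypothetical here and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the patch: packs an (offset, limit) pair
// into one long so a single map can carry both values per destination.
final class LimitOffsetPacking {
    private LimitOffsetPacking() {}

    // Offset goes into the high 32 bits, limit into the low 32 bits.
    // The mask stops a negative limit from sign-extending over the offset.
    static long pack(int offset, int limit) {
        return ((long) offset << 32) | (limit & 0xFFFFFFFFL);
    }

    // Recover the offset from the high 32 bits.
    static int offset(long packed) {
        return (int) (packed >>> 32);
    }

    // Recover the limit from the low 32 bits.
    static int limit(long packed) {
        return (int) packed;
    }

    public static void main(String[] args) {
        // Assumed shape of the single reused map: destination name -> packed pair.
        Map<String, Long> destToLimit = new HashMap<>();
        destToLimit.put("insclause-0", pack(20, 10));

        long packed = destToLimit.get("insclause-0");
        System.out.println("offset=" + offset(packed) + ", limit=" + limit(packed));
    }
}
```

Storing a small pair object instead would read more clearly at call sites; the packed long only saves an allocation per entry.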

Otherwise the approach makes sense.
[~jcamachorodriguez] do you want to take a look and maybe give some pointers for CBO?


> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11531
>                 URL: https://issues.apache.org/jira/browse/HIVE-11531
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Sergey Shelukhin
>            Assignee: Hui Zheng
>         Attachments: HIVE-11531.WIP.1.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the form
> SELECT ... LIMIT X,Y where X,Y are coordinates inside the result to be paginated
> (which can be extremely large by itself). At present, ROW_NUMBER can be used to
> achieve this effect, but optimizations for LIMIT such as TopN in ReduceSink do not
> apply to ROW_NUMBER. We can add first-class support for "skip" to the existing
> LIMIT, or improve ROW_NUMBER for better performance.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
