hive-dev mailing list archives

From "Sivaramakrishnan Narayanan (JIRA)" <>
Subject [jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage
Date Mon, 19 Nov 2012 11:34:58 GMT


Sivaramakrishnan Narayanan commented on HIVE-3562:

Apologies, you can use a heap to maintain the top-k rows, as opposed to an array or a linked list.

You may also want to consider the case where the top-k do not fit in memory. One possibility
would be to employ this optimization only if K is less than some threshold.

This approach has the advantage that it is a Hive-only change and does not depend on a Hadoop
change. That is a pretty big plus.
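
To make the suggestion concrete, here is a minimal sketch of a heap-based top-k with a size threshold. It is not the code in the attached patch and not Hive's implementation; the names `top_k_smallest`, `MAX_PUSHDOWN_K` and `_Desc`, and the threshold value itself, are hypothetical and chosen only for illustration. It assumes ascending order on a comparable key, as in the `order by key limit 10` example.

```python
import heapq
from typing import Iterable, List, Optional

# Hypothetical cap on K; above it the optimization is skipped (assumed value).
MAX_PUSHDOWN_K = 10_000


class _Desc:
    """Wrapper that inverts ordering so heapq's min-heap acts as a max-heap."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return self.value > other.value


def top_k_smallest(keys: Iterable[str], k: int,
                   max_k: int = MAX_PUSHDOWN_K) -> Optional[List[str]]:
    """Return the k smallest keys in ascending order.

    Returns None when k exceeds max_k, signalling that the caller should
    fall back to forwarding every row instead of buffering k of them.
    """
    if k > max_k:
        return None
    if k <= 0:
        return []
    heap: List[_Desc] = []  # root holds the largest key kept so far
    for key in keys:
        if len(heap) < k:
            heapq.heappush(heap, _Desc(key))
        elif key < heap[0].value:
            # New key beats the current worst candidate: swap it in.
            heapq.heapreplace(heap, _Desc(key))
    return sorted(item.value for item in heap)
```

With a heap, each incoming row costs O(log k) at worst and memory stays bounded by k, whereas keeping a sorted array or linked list costs O(k) per insertion. The `max_k` check is one way to express the "only if K is less than some threshold" idea.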
> Some limit can be pushed down to map stage
> ------------------------------------------
>                 Key: HIVE-3562
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3562.D5967.1.patch
> Queries with limit clause (with reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> makes operator tree, 
> But LIMIT can be partially calculated in RS, reducing size of shuffling.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
