hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sivaramakrishnan Narayanan (JIRA)" <>
Subject [jira] [Commented] (HIVE-3562) Some limit can be pushed down to map stage
Date Mon, 19 Nov 2012 10:17:01 GMT


Sivaramakrishnan Narayanan commented on HIVE-3562:

I'm interested in this particular optimization. Let's say the table src have N rows and we're
interested in top-K. If the rows in T are in almost descending order and we're interested
in ascending Top-K (this is very likely when ordering by timestamps), then the number of memcopies
will be N * K. See code fragment:

+    public boolean isTopN(byte[] key) {
+      int index = Arrays.binarySearch(keys, key, C);
+      index = index < 0 ? -index -1 : index;
+      if (index >= keys.length - 1) {
+        return false;
+      }
+      System.arraycopy(keys, index, keys, index + 1, keys.length - index - 1);
+      keys[index] = Arrays.copyOf(key, key.length);
+      return true;
+    }
+  }

You could use a linked list, but binary search is not an option in that case.

An alternate approach to the problem is to use a combination of Hive and Hadoop changes.

Hadoop change:
* New parameter map.sort.limitrecords which determines how many records each mapper in a job
will send to every reducer
* When writing out local files after sorting, map-task stops after map.sort.limitrecords records
for each reducer
* Effectively, each mapper sends out its top-K records

Hive change:
* Determining when the Top-K optimization is applicable and setting K in ReduceSinkDesc
* Passing the K value along to MapredWork
* ExecDriver sets map.sort.limitrecords before executing the job corresponding to the MapredWork

This change will reduce the amount of I/O that happens on the map-side (writing only 10 rows
per reducer as opposed to entire table) and can have a big effect on performance. Furthermore,
it is possible to make the sort on the mapper side a top-k sort which can further improve
performance - but the deep pocket is really the I/O savings. In my experiments, I see a 5x
performance improvement for such queries.
> Some limit can be pushed down to map stage
> ------------------------------------------
>                 Key: HIVE-3562
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-3562.D5967.1.patch
> Queries with limit clause (with reasonable number), for example
> {noformat}
> select * from src order by key limit 10;
> {noformat}
> makes operator tree, 
> But LIMIT can be partially calculated in RS, reducing size of shuffling.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:

View raw message