hive-dev mailing list archives

From "Zheng Shao (JIRA)" <>
Subject [jira] Commented: (HIVE-1738) Optimize Key Comparison in GroupByOperator
Date Fri, 22 Oct 2010 05:18:16 GMT


Zheng Shao commented on HIVE-1738:

+1. This is smart!

> public boolean areEqual(ArrayList<Object> ol0, ArrayList<Object> ol1) ...
Why do we need to handle the case where the two array lists differ in size or are shorter
than numFields?

> for (int i = 0; i < numFields; i++) {

We might want to try comparing the last field first. The reason is that in sort-based aggregation,
the last key is more likely to differ than the first key. I'm not sure the effect is big
enough to be noticeable, though.
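
A minimal sketch of the reverse-order comparison suggested above. It is not the code from the HIVE-1738 patch: the class name KeyComparisonSketch and the method areEqualLastFirst are hypothetical, Objects.equals stands in for whatever per-field comparison the patch actually uses, and both lists are assumed to hold at least numFields elements.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; not taken from the HIVE-1738 patch.
final class KeyComparisonSketch {

    // Compares the first numFields entries of two key lists, starting from
    // the last field and walking back to the first.
    // Assumption: both lists contain at least numFields elements.
    static boolean areEqualLastFirst(List<Object> ol0, List<Object> ol1, int numFields) {
        for (int i = numFields - 1; i >= 0; i--) {
            // Objects.equals is a stand-in for the real field comparison.
            if (!Objects.equals(ol0.get(i), ol1.get(i))) {
                return false; // stop at the first mismatch
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Two consecutive keys in sorted input: only the last field changes.
        List<Object> prev = Arrays.<Object>asList("us", "2010-10-22", 41);
        List<Object> next = Arrays.<Object>asList("us", "2010-10-22", 42);

        System.out.println(areEqualLastFirst(prev, next, 3)); // false
        System.out.println(areEqualLastFirst(prev, prev, 3)); // true
    }
}
```

When only the trailing field changes between consecutive rows, the reverse loop returns after a single field comparison, whereas a forward loop would first compare every matching leading field. Whether that saving is measurable in practice is the open question raised above.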

> Optimize Key Comparison in GroupByOperator
> ------------------------------------------
>                 Key: HIVE-1738
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Siying Dong
>            Assignee: Siying Dong
>             Fix For: 0.7.0
>         Attachments: HIVE.1738.1.patch, HIVE.1738.2.patch, HIVE.1738.3.patch
> GroupByOperator uses to compare keys, which is written
> for generalized object comparisons, which is not optimized for group-by operator. By optimizing
> this logic, we expect to see obvious improvements in GroupByOperator.

