hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-949) Object deepCopy in GroupBy Operator
Date Fri, 27 Nov 2009 00:22:39 GMT

     [ https://issues.apache.org/jira/browse/HIVE-949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

He Yongqiang updated HIVE-949:

    Attachment: hive-949-2009-11-26.patch

This saves about 45% CPU time of GroupByOperator.processHashAggr. (393,686 ms -> 216,860

> Object deepCopy in GroupBy Operator
> -----------------------------------
>                 Key: HIVE-949
>                 URL: https://issues.apache.org/jira/browse/HIVE-949
>             Project: Hadoop Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: He Yongqiang
>         Attachments: hive-949-2009-11-26.patch
> In GroupByOperator (hash-mode aggregation), each object is first deep copied and only then
> checked for membership in the hash table. The deep copy can be very expensive (around 5% of
> CPU time). A simple change would be to build the object through its ObjectInspector without
> a deep copy and check whether it already exists in the hash table, calling deep copy only
> when it does not.
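
The idea in the description (probe the hash table first, copy only on a miss) can be sketched roughly as follows. This is a minimal illustration and not the attached patch or Hive's actual GroupByOperator code: the class LazyCopyGroupBy, its methods, and the use of a reused List<String> buffer in place of an ObjectInspector-backed key are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for hash-mode aggregation; not Hive code.
class LazyCopyGroupBy {
    // Group key -> running COUNT(*) for that key.
    private final Map<List<String>, long[]> counts = new HashMap<>();

    // Number of key copies actually made (one per distinct group).
    int deepCopies = 0;

    // reusedKey is a buffer that the caller overwrites for every row,
    // so it must not be stored in the map directly.
    void process(List<String> reusedKey) {
        // Probe with the uncopied key; List equality is content-based.
        long[] agg = counts.get(reusedKey);
        if (agg == null) {
            // First time this group is seen: only now pay for the copy.
            List<String> ownedKey = new ArrayList<>(reusedKey);
            deepCopies++;
            agg = new long[1];
            counts.put(ownedKey, agg);
        }
        agg[0]++;
    }

    // Returns the count for a key, or 0 if the group was never seen.
    long count(String... key) {
        long[] agg = counts.get(Arrays.asList(key));
        return agg == null ? 0 : agg[0];
    }

    public static void main(String[] args) {
        LazyCopyGroupBy groupBy = new LazyCopyGroupBy();
        List<String> buffer = new ArrayList<>();
        String[][] rows = {{"a", "x"}, {"b", "y"}, {"a", "x"}, {"a", "x"}};
        for (String[] row : rows) {
            buffer.clear();                    // same buffer reused for each row
            buffer.addAll(Arrays.asList(row));
            groupBy.process(buffer);
        }
        System.out.println("count(a,x) = " + groupBy.count("a", "x"));
        System.out.println("deep copies = " + groupBy.deepCopies);
    }
}
```

With four input rows spread over two distinct groups, the sketch copies the key twice rather than four times. The saving grows with the ratio of rows to distinct groups.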

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
