hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HIVE-12369) Native Vector GroupBy
Date Thu, 03 Aug 2017 05:02:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112186#comment-16112186 ]

Matt McCline edited comment on HIVE-12369 at 8/3/17 5:01 AM:
-------------------------------------------------------------

Yes, I think you should continue reviewing.  The path that is implemented is One Long Key
and groupByMode == HASH.  There are UNDONEs for *subsequent* JIRAs that will later add Aggregation
of non-Long data types, Fixed Length Keys / Variable Length Keys, and the other groupByModes.
Later JIRAs will also add Grouping Sets and Empty Aggregation (i.e., a GroupBy on a key with
no aggregations, which does duplicate key elimination).


was (Author: mmccline):
Yes, I think you continue reviewing.  The path that is implemented is One Long Key and groupByMode
== HASH.  There are UNDONEs for *subsequent* JIRAs that later adds Aggregation of non-Long
data types, Fixed Length Keys / Variable Length Keys, and the other groupByModes.  And later
adds Grouping Sets, Empty Aggregation (i.e. GroupBy on key that has no aggregations that does
duplicate key elimination), too.

> Native Vector GroupBy
> ---------------------
>
>                 Key: HIVE-12369
>                 URL: https://issues.apache.org/jira/browse/HIVE-12369
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-12369.01.patch, HIVE-12369.02.patch, HIVE-12369.05.patch, HIVE-12369.06.patch
>
>
> Implement Native Vector GroupBy using the fast hash table technology developed for Native
> Vector MapJoin, etc.
> The patch is currently limited to a single Long key, aggregation on Long columns, and no
> more than 31 columns.
> Three new classes are introduced that store the count in the slot table and don't allocate
> hash elements:
> {noformat}
>   COUNT(column)  VectorGroupByHashOneLongKeyCountColumnOperator      
>   COUNT(key)     VectorGroupByHashOneLongKeyCountKeyOperator            
>   COUNT(*)       VectorGroupByHashOneLongKeyCountStarOperator           
> {noformat}
> And a new class that aggregates a single Long key:
> {noformat}
>   VectorGroupByHashOneLongKeyOperator
> {noformat}
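
To illustrate the idea of keeping the count in the slot table rather than allocating a separate hash element per key, here is a minimal Java sketch. It is not taken from the patch: the class name LongKeyCountTable, its methods, the interleaved key/count layout, the linear probing, and the hash mixing constant are all assumptions made for illustration. It also treats a count of 0 as "empty slot", so it only models COUNT(*)-style counting where every stored key has a count of at least 1.

```java
// Illustrative sketch only. Names and layout are hypothetical, not Hive's code.
final class LongKeyCountTable {
    // Interleaved slots: slots[2*i] holds the key, slots[2*i + 1] holds its count.
    // A count of 0 marks an empty slot, so no per-key object is ever allocated.
    private long[] slots;
    private int capacity; // number of slots, always a power of two
    private int size;     // number of distinct keys stored

    LongKeyCountTable(int initialCapacity) {
        int cap = 1;
        while (cap < initialCapacity) {
            cap <<= 1; // round up to a power of two (assumes a small initial capacity)
        }
        capacity = cap;
        slots = new long[2 * capacity];
    }

    /** Adds delta (must be positive) to the count stored for key. */
    void add(long key, long delta) {
        if (delta <= 0) {
            throw new IllegalArgumentException("delta must be positive");
        }
        if (2 * (size + 1) > capacity) {
            grow(); // keep the load factor at or below 0.5 so probing always ends
        }
        int i = findSlot(slots, capacity, key);
        if (slots[2 * i + 1] == 0) { // empty slot: claim it for this key
            slots[2 * i] = key;
            size++;
        }
        slots[2 * i + 1] += delta;
    }

    /** Returns the count for key, or 0 if the key was never added. */
    long count(long key) {
        return slots[2 * findSlot(slots, capacity, key) + 1];
    }

    int size() {
        return size;
    }

    // Linear probing: stop at the slot holding key or at the first empty slot.
    private static int findSlot(long[] table, int cap, long key) {
        int mask = cap - 1;
        int i = mix(key) & mask;
        while (table[2 * i + 1] != 0 && table[2 * i] != key) {
            i = (i + 1) & mask;
        }
        return i;
    }

    // Multiplicative mixing so that sequential keys spread across slots.
    private static int mix(long key) {
        return (int) ((key * 0x9E3779B97F4A7C15L) >>> 32);
    }

    // Doubles the table and reinserts every occupied slot.
    private void grow() {
        long[] old = slots;
        int oldCap = capacity;
        capacity = oldCap * 2;
        slots = new long[2 * capacity];
        for (int j = 0; j < oldCap; j++) {
            long c = old[2 * j + 1];
            if (c != 0) {
                int i = findSlot(slots, capacity, old[2 * j]);
                slots[2 * i] = old[2 * j];
                slots[2 * i + 1] = c;
            }
        }
    }
}
```

A caller would invoke add(key, 1) once per row of a batch and read the result back with count(key). The sketch does not cover COUNT(column) over all-null values, which would require a way to distinguish an empty slot from a key whose count is 0.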



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
