hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-522) GenericUDAF: Extend UDAF to deal with complex types
Date Wed, 03 Jun 2009 05:50:07 GMT

     [ https://issues.apache.org/jira/browse/HIVE-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-522:
----------------------------

    Attachment: HIVE-522.1.patch

A preliminary patch that includes all the new classes. They are not yet integrated with
GroupByOperator, but the integration work is pretty straightforward.

GenericUDAF is more complex than I thought at first. So I've created a bunch of classes for
it:

1. GenericUDAFResolver: takes a function name and the list of parameter TypeInfo and returns
a GenericUDAFEvaluator.
2. GenericUDAFEvaluator: allows 2 things:
2.1 Create a new aggregation result buffer
2.2 Update an aggregation result buffer, or terminate the aggregation and get the results.
3. The aggregation result buffer in step 2 is an interface. Each GenericUDAFEvaluator should
have its own aggregation result buffer class to store the data (for example, a count for count(),
a count and a sum for average()).

Item 1 is used at compile time. Items 2 and 3 are used at runtime.
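
To make the three pieces above concrete, here is a minimal sketch of how they could fit together. It is not the code in HIVE-522.1.patch: every name in it (GenericUdafSketch, Evaluator, AggregationBuffer, resolve, iterate, terminate) is hypothetical, plain strings stand in for Hive's TypeInfo, and the rule for which types average() accepts is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; none of these names are taken from the patch.
final class GenericUdafSketch {
    /** Item 3: the buffer is just an interface; each evaluator defines its own class. */
    interface AggregationBuffer {}

    /** Item 2: the runtime piece that creates, updates, and terminates buffers. */
    interface Evaluator {
        AggregationBuffer newBuffer();
        void iterate(AggregationBuffer buffer, Object value);
        Object terminate(AggregationBuffer buffer);
    }

    /** count(): the buffer holds a single long. */
    static final class CountEvaluator implements Evaluator {
        static final class CountBuffer implements AggregationBuffer { long count; }

        public AggregationBuffer newBuffer() { return new CountBuffer(); }
        public void iterate(AggregationBuffer buffer, Object value) {
            if (value != null) ((CountBuffer) buffer).count++;   // nulls are not counted
        }
        public Object terminate(AggregationBuffer buffer) {
            return Long.valueOf(((CountBuffer) buffer).count);
        }
    }

    /** average(): the buffer holds a count and a sum. */
    static final class AverageEvaluator implements Evaluator {
        static final class AvgBuffer implements AggregationBuffer { long count; double sum; }

        public AggregationBuffer newBuffer() { return new AvgBuffer(); }
        public void iterate(AggregationBuffer buffer, Object value) {
            if (value == null) return;                           // nulls are skipped
            AvgBuffer b = (AvgBuffer) buffer;
            b.count++;
            b.sum += ((Number) value).doubleValue();
        }
        public Object terminate(AggregationBuffer buffer) {
            AvgBuffer b = (AvgBuffer) buffer;
            return b.count == 0 ? null : Double.valueOf(b.sum / b.count);
        }
    }

    // Assumed set of numeric type names, a stand-in for a real TypeInfo check.
    private static final List<String> NUMERIC = Arrays.asList("int", "bigint", "double");

    /** Item 1: the compile-time piece that maps a name and parameter types to an evaluator. */
    static Evaluator resolve(String functionName, List<String> parameterTypes) {
        if (parameterTypes.size() != 1) {
            throw new IllegalArgumentException("exactly one parameter expected");
        }
        switch (functionName) {
            case "count":
                return new CountEvaluator();
            case "average":
                if (!NUMERIC.contains(parameterTypes.get(0))) {
                    throw new IllegalArgumentException("average() needs a numeric parameter");
                }
                return new AverageEvaluator();
            default:
                throw new IllegalArgumentException("unknown function: " + functionName);
        }
    }

    public static void main(String[] args) {
        Evaluator count = resolve("count", Arrays.asList("string"));
        AggregationBuffer cb = count.newBuffer();
        for (Object v : new Object[] {"a", null, "b"}) count.iterate(cb, v);
        System.out.println(count.terminate(cb));   // prints 2

        Evaluator avg = resolve("average", Arrays.asList("double"));
        AggregationBuffer ab = avg.newBuffer();
        for (Object v : new Object[] {1.0, 3.0}) avg.iterate(ab, v);
        System.out.println(avg.terminate(ab));     // prints 2.0
    }
}
```

In this sketch the resolver is the only place that looks at type names, so the type checking happens once at compile time and the evaluators can assume well-typed input at runtime.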

The reason that I split 2 and 3 is:
A. It shrinks the aggregation result buffer: only a "long" is needed for count(). The input's
ObjectInspector and the output writable Object (e.g. the Long or LongWritable for count())
are both stored in the GenericUDAFEvaluator.
B. It makes it easier to move to HIVE-535: A3 in the future.
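
Point A can be illustrated with a small sketch of a grouped count, where a single evaluator serves many per-group buffers. This is only an illustration under assumptions, not the patch's code: SharedEvaluatorSketch, CountBuffer, CountEvaluator, and countByKey are hypothetical names, and a one-element long[] stands in for a reusable LongWritable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names are taken from the patch.
final class SharedEvaluatorSketch {
    /** Per-group state: just one long. */
    static final class CountBuffer { long count; }

    /** Shared state lives in the evaluator, not in each buffer. */
    static final class CountEvaluator {
        // Stand-in for a reusable output object such as a LongWritable.
        private final long[] reusableOutput = new long[1];

        CountBuffer newBuffer() { return new CountBuffer(); }

        void iterate(CountBuffer buffer, Object value) {
            if (value != null) buffer.count++;
        }

        /** Returns the shared output holder; callers must copy the value before reuse. */
        long[] terminate(CountBuffer buffer) {
            reusableOutput[0] = buffer.count;
            return reusableOutput;
        }
    }

    /** Counts non-null values (column 1) per key (column 0) using one evaluator. */
    static Map<String, Long> countByKey(String[][] rows) {
        CountEvaluator evaluator = new CountEvaluator();
        Map<String, CountBuffer> buffers = new HashMap<>();
        for (String[] row : rows) {
            CountBuffer b = buffers.computeIfAbsent(row[0], k -> evaluator.newBuffer());
            evaluator.iterate(b, row[1]);
        }
        Map<String, Long> result = new HashMap<>();
        for (Map.Entry<String, CountBuffer> e : buffers.entrySet()) {
            // Copy the value out because the holder is overwritten on the next call.
            result.put(e.getKey(), evaluator.terminate(e.getValue())[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] rows = { {"a", "x"}, {"b", null}, {"a", "y"} };
        Map<String, Long> counts = countByKey(rows);
        System.out.println(counts.get("a"));   // prints 2
        System.out.println(counts.get("b"));   // prints 0
    }
}
```

With many distinct group keys, the per-group cost in this sketch is one small buffer object, while the output holder is allocated once per evaluator.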



> GenericUDAF: Extend UDAF to deal with complex types
> ---------------------------------------------------
>
>                 Key: HIVE-522
>                 URL: https://issues.apache.org/jira/browse/HIVE-522
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.4.0
>
>         Attachments: HIVE-522.1.patch
>
>
> We can pass arbitrary arguments into GenericUDFs. We should do the same for GenericUDAF
> so that UDAFs can also take arbitrary arguments.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

