hadoop-hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-522) GenericUDAF: Extend UDAF to deal with complex types
Date Wed, 03 Jun 2009 05:50:07 GMT

     [ https://issues.apache.org/jira/browse/HIVE-522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-522:

    Attachment: HIVE-522.1.patch

A preliminary patch that includes all the new classes. They are not integrated with GroupByOperator
yet, but the integration work is pretty straightforward.

GenericUDAF is more complex than I thought at first, so I've created a bunch of classes for it:

1. GenericUDAFResolver: takes a function name and the list of parameter TypeInfo and returns
a GenericUDAFEvaluator.
2. GenericUDAFEvaluator: allows 2 things:
2.1 Create a new aggregation result buffer.
2.2 Update an aggregation result buffer, or terminate the aggregation and get the results.
3. The aggregation result buffer used in item 2 is an interface. Each GenericUDAFEvaluator should
have its own aggregation result buffer class to store the data (for example, a count for count(),
a count and a sum for average()).

1 is used at compile time. 2 and 3 are used at runtime.
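
To make the three roles above concrete, here is a minimal, self-contained sketch. It is not
the code in HIVE-522.1.patch: the type names GenericUDAFResolver and GenericUDAFEvaluator come
from this message, but every method signature, the AggregationBuffer name, the concrete classes,
and the use of plain strings in place of TypeInfo are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; signatures are hypothetical, not the patch's API.
final class GenericUdafSketch {

    /** Item 3: per-aggregation state. Each evaluator defines its own implementation. */
    interface AggregationBuffer {}

    /** Item 2: creates buffers, updates them, and terminates the aggregation. */
    interface GenericUDAFEvaluator {
        AggregationBuffer getNewAggregationBuffer();
        void iterate(AggregationBuffer buffer, Object[] parameters);
        Object terminate(AggregationBuffer buffer);
    }

    /** Item 1: used at compile time to pick an evaluator. Strings stand in for TypeInfo. */
    interface GenericUDAFResolver {
        GenericUDAFEvaluator getEvaluator(String functionName, List<String> parameterTypes);
    }

    /** count(): the buffer holds only a count. */
    static final class CountEvaluator implements GenericUDAFEvaluator {
        static final class Buffer implements AggregationBuffer { long count; }
        public AggregationBuffer getNewAggregationBuffer() { return new Buffer(); }
        public void iterate(AggregationBuffer buffer, Object[] parameters) {
            if (parameters[0] != null) ((Buffer) buffer).count++; // skip null inputs
        }
        public Object terminate(AggregationBuffer buffer) { return ((Buffer) buffer).count; }
    }

    /** average(): the buffer holds a count and a sum. */
    static final class AverageEvaluator implements GenericUDAFEvaluator {
        static final class Buffer implements AggregationBuffer { long count; double sum; }
        public AggregationBuffer getNewAggregationBuffer() { return new Buffer(); }
        public void iterate(AggregationBuffer buffer, Object[] parameters) {
            if (parameters[0] == null) return; // skip null inputs
            Buffer b = (Buffer) buffer;
            b.count++;
            b.sum += ((Number) parameters[0]).doubleValue();
        }
        public Object terminate(AggregationBuffer buffer) {
            Buffer b = (Buffer) buffer;
            return b.count == 0 ? null : b.sum / b.count; // null when no rows were seen
        }
    }

    /** A toy resolver that dispatches on the function name and checks the argument count. */
    static final class SimpleResolver implements GenericUDAFResolver {
        public GenericUDAFEvaluator getEvaluator(String functionName, List<String> parameterTypes) {
            if (parameterTypes.size() != 1) {
                throw new IllegalArgumentException(functionName + " expects exactly one argument");
            }
            switch (functionName) {
                case "count": return new CountEvaluator();
                case "average": return new AverageEvaluator();
                default: throw new IllegalArgumentException("Unknown UDAF: " + functionName);
            }
        }
    }

    /** Runtime side: feed every value through one buffer, then terminate. */
    static Object aggregate(GenericUDAFEvaluator evaluator, Object[] values) {
        AggregationBuffer buffer = evaluator.getNewAggregationBuffer();
        for (Object value : values) {
            evaluator.iterate(buffer, new Object[] {value});
        }
        return evaluator.terminate(buffer);
    }

    public static void main(String[] args) {
        GenericUDAFResolver resolver = new SimpleResolver();
        Object[] values = {1.0, null, 3.0};
        List<String> types = Arrays.asList("double");
        System.out.println(aggregate(resolver.getEvaluator("count", types), values));   // prints 2
        System.out.println(aggregate(resolver.getEvaluator("average", types), values)); // prints 2.0
    }
}
```

The resolver is the only piece that sees the function name and argument types; once it has
returned an evaluator, the runtime loop in aggregate() works the same way for any function.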

The reasons that I split 2 and 3 are:
A. It shrinks the aggregation result buffer - only a "long" is needed for
count. (The input's ObjectInspector and the output writable Object (e.g. Long or LongWritable of count())
are both stored in GenericUDAFEvaluator.)
B. It makes it easier to move to HIVE-535: A3 in the future.
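
Point A can be illustrated with a hash-style aggregation in which one evaluator serves many
groups. This is a rough sketch under stated assumptions, not code from the patch: MutableLong
is a hypothetical stand-in for a reusable output object such as LongWritable, the method names
are invented, and the input ObjectInspector is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; all names here are hypothetical.
final class SharedEvaluatorSketch {

    /** Per-group state: a single long, nothing else. */
    static final class CountBuffer { long count; }

    /** Stand-in for a reusable output object such as LongWritable. */
    static final class MutableLong { long value; }

    /** One evaluator per aggregation; it owns the state shared by all groups. */
    static final class CountEvaluator {
        private final MutableLong result = new MutableLong(); // allocated once, reused

        CountBuffer newBuffer() { return new CountBuffer(); }

        void iterate(CountBuffer buffer, Object value) {
            if (value != null) buffer.count++;
        }

        MutableLong terminate(CountBuffer buffer) {
            result.value = buffer.count; // overwrite the shared output object
            return result;
        }
    }

    /** Counts non-null values per key; each row is {key, value}. */
    static Map<String, Long> countByKey(String[][] rows) {
        CountEvaluator evaluator = new CountEvaluator();
        Map<String, CountBuffer> buffers = new LinkedHashMap<>();
        for (String[] row : rows) {
            CountBuffer buffer = buffers.computeIfAbsent(row[0], key -> evaluator.newBuffer());
            evaluator.iterate(buffer, row[1]);
        }
        Map<String, Long> counts = new LinkedHashMap<>();
        for (Map.Entry<String, CountBuffer> entry : buffers.entrySet()) {
            // Copy the value out right away, since the output object is reused.
            counts.put(entry.getKey(), evaluator.terminate(entry.getValue()).value);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[][] rows = {{"a", "x"}, {"b", null}, {"a", "y"}};
        System.out.println(countByKey(rows)); // prints {a=2, b=0}
    }
}
```

With many distinct keys, the memory cost per group is one CountBuffer, while the output object
exists only once in the evaluator.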

> GenericUDAF: Extend UDAF to deal with complex types
> ---------------------------------------------------
>                 Key: HIVE-522
>                 URL: https://issues.apache.org/jira/browse/HIVE-522
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.4.0
>         Attachments: HIVE-522.1.patch
> We can pass arbitrary arguments into GenericUDFs. We should do the same thing to GenericUDAF
> so that UDAF can also take arbitrary arguments.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
