hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-979) Acummulator Interface for UDFs
Date Mon, 28 Sep 2009 21:27:16 GMT

    [ https://issues.apache.org/jira/browse/PIG-979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12760389#action_12760389 ]

Alan Gates commented on PIG-979:

Jeff, thanks for the paper.  I looked it over, and I'm not certain it directly applies.  They
measure both the aggregation time (sort or hash) and how the data is passed to the user-defined
aggregate (iterate or accumulate).  Because we run on Hadoop, the aggregation is already done
for us, so it's just a question of the fastest way to make the data available to the UDF.  As I
said above, we want to test the performance of this and prove its worth before we add it.
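
To make the iterate-versus-accumulate distinction concrete, here is a minimal sketch in plain
Java.  It is not Pig code and not a proposed API: BatchAccumulator, SumAccumulator, sumWholeBag
and runInBatches are hypothetical names made up for illustration, and a List<Long> stands in
for a bag of tuples.  It only shows the same aggregate written once against the whole bag and
once against fixed-size batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class AccumulatorSketch {

    /** Hypothetical interface (an assumption, not Pig's API): state is kept between batches. */
    interface BatchAccumulator<T> {
        void accumulate(List<Long> batch); // called once per batch of records
        T getValue();                      // called after the last batch of a key
        void cleanup();                    // reset state before the next key
    }

    /** Example aggregate: a running sum that never holds more than one batch. */
    static class SumAccumulator implements BatchAccumulator<Long> {
        private long sum = 0;
        public void accumulate(List<Long> batch) {
            for (long v : batch) {
                sum += v;
            }
        }
        public Long getValue() { return sum; }
        public void cleanup() { sum = 0; }
    }

    /** "Iterate" style: the function is handed the entire bag at once. */
    static long sumWholeBag(List<Long> bag) {
        long sum = 0;
        for (long v : bag) {
            sum += v;
        }
        return sum;
    }

    /** "Accumulate" style: a driver feeds the records batchSize at a time. */
    static <T> T runInBatches(Iterator<Long> records, int batchSize, BatchAccumulator<T> acc) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<Long> batch = new ArrayList<>(batchSize);
        while (records.hasNext()) {
            batch.add(records.next());
            if (batch.size() == batchSize) {
                acc.accumulate(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            acc.accumulate(batch); // final, possibly short, batch
        }
        T result = acc.getValue();
        acc.cleanup();
        return result;
    }

    public static void main(String[] args) {
        List<Long> bag = Arrays.asList(1L, 2L, 3L, 4L, 5L);
        System.out.println(sumWholeBag(bag));
        System.out.println(runInBatches(bag.iterator(), 2, new SumAccumulator()));
    }
}
```

Both calls produce the same sum; the difference is only in how much of the bag has to be
available to the function at any one time, which is what would need to be measured.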

As a general complaint, they used a fairly old revision of the Pig code in their paper, even
though the paper appears to have been published in the last few months.

> Acummulator Interface for UDFs
> ------------------------------
>                 Key: PIG-979
>                 URL: https://issues.apache.org/jira/browse/PIG-979
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Alan Gates
>            Assignee: Ying He
> Add an accumulator interface for UDFs that would allow them to take a set number of records
> at a time instead of the entire bag.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
