hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-563) PERFORMANCE: enable combiner to be called 0 or more times whenever the combiner is used for a pig query
Date Sat, 20 Dec 2008 16:51:44 GMT

    [ https://issues.apache.org/jira/browse/PIG-563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658298#action_12658298 ]

Alan Gates commented on PIG-563:
--------------------------------

One question on COUNT.  Why does Count.Initial.exec pull the first argument out of the
passed-in bag and return that?  Shouldn't it just return 1 no matter what?

Other than that, +1.

> PERFORMANCE: enable combiner to be called 0 or more times whenever the combiner is used
> for a pig query
> ------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-563
>                 URL: https://issues.apache.org/jira/browse/PIG-563
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: types_branch
>            Reporter: Pradeep Kamath
>            Assignee: Pradeep Kamath
>             Fix For: types_branch
>
>         Attachments: PIG-563-v2.patch, PIG-563.patch
>
>
> Currently Pig's use of the combiner assumes the combiner is called exactly once in Hadoop.
> With Hadoop 0.18, the combiner can be called zero, one, or more times. This issue tracks the changes
> needed in the CombinerOptimizer visitor and the builtin Algebraic UDFs (SUM, COUNT, MIN, MAX,
> AVG) so that they work in this new model.
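
As a rough illustration of the model described above (not Pig's actual Java UDF code), the sketch below splits COUNT into an initial, an intermediate, and a final stage and checks that the result is the same whether the combiner stage runs zero, one, or several times. The names count_initial, count_intermediate, count_final, and run_count are hypothetical, and the sketch assumes the initial stage emits 1 for every input tuple, as the comment above suggests.

```python
def count_initial(record):
    """Map side: each input record contributes a partial count of 1."""
    return 1


def count_intermediate(partials):
    """Combiner: fold partial counts into a single partial count.

    Output has the same type as the input items, so this stage can be
    applied any number of times, including zero.
    """
    return sum(partials)


def count_final(partials):
    """Reduce side: accepts partial counts from either earlier stage."""
    return sum(partials)


def run_count(records, combiner_passes, chunk_size=2):
    """Hypothetical driver that applies the combiner `combiner_passes` times."""
    partials = [count_initial(r) for r in records]
    for _ in range(combiner_passes):
        # Each pass combines neighbouring partial counts in small chunks.
        partials = [
            count_intermediate(partials[i:i + chunk_size])
            for i in range(0, len(partials), chunk_size)
        ]
    return count_final(partials)


records = ["a", "b", "c", "d", "e"]
# Same answer with the combiner invoked 0, 1, or 3 times.
results = [run_count(records, passes) for passes in (0, 1, 3)]
```

The property that matters is that the final stage must accept the output of either the initial or the intermediate stage, and the intermediate stage must accept its own output as input.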

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

