hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1217) [piggybank] evaluation.util.Top is broken
Date Thu, 04 Feb 2010 18:40:28 GMT

    [ https://issues.apache.org/jira/browse/PIG-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829688#action_12829688 ]

Alan Gates commented on PIG-1217:
---------------------------------

In general, this looks good.  One comment on Top.Initial: if you do something like

B = group A ...
C = foreach B generate myudf(A);

and myudf is algebraic, you are guaranteed to get only one record at a time in the Initial
function, because Pig does not collect records by key before calling it.  That is, even if ten
records in a row have the same key, Pig won't detect that and collate them into one bag before
calling Initial.  A number of the built-in functions (e.g. COUNT) take advantage of this to
simplify the processing in Initial.  You may want to do the same here.
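
To illustrate the point, here is a minimal sketch in plain Java that models the stages of an algebraic top-N computation. It does not use Pig's actual classes: TopSketch, initial and merge are hypothetical names, and plain lists of longs stand in for Pig bags and tuples. The assumption it relies on is the guarantee described above, that the Initial stage always receives a bag holding exactly one record, so Initial can pass the record through and leave all ordering work to the later stages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for an algebraic Top UDF; not Pig's real API.
class TopSketch {

    // Initial stage: assumed to receive a bag with exactly one record,
    // so there is nothing to sort or trim. Just pass the record through.
    static List<Long> initial(List<Long> singleRecordBag) {
        return new ArrayList<>(singleRecordBag);
    }

    // Intermediate/Final stage: merge partial results and keep the n largest.
    static List<Long> merge(int n, List<List<Long>> partials) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap
        for (List<Long> partial : partials) {
            for (Long value : partial) {
                heap.add(value);
                if (heap.size() > n) {
                    heap.poll(); // drop the smallest, keeping at most n values
                }
            }
        }
        List<Long> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder()); // largest first
        return result;
    }

    public static void main(String[] args) {
        List<List<Long>> partials = new ArrayList<>();
        // Each record reaches initial() on its own, one call per record.
        for (long v : new long[] {5, 1, 9, 3, 7}) {
            partials.add(initial(Collections.singletonList(v)));
        }
        System.out.println(merge(3, partials)); // prints [9, 7, 5]
    }
}
```

In this sketch the heap is only needed in the merge step; the Initial step stays trivial because it never sees more than one record.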

As for getting it into the 0.6 release, I think Olga was trying to roll the package today or
tomorrow, so we may be out of time.

> [piggybank] evaluation.util.Top is broken
> -----------------------------------------
>
>                 Key: PIG-1217
>                 URL: https://issues.apache.org/jira/browse/PIG-1217
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1, 0.4.0, site, 0.5.0, 0.6.0, 0.7.0
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>            Priority: Minor
>             Fix For: 0.7.0
>
>         Attachments: fix_top_udf.diff, fix_top_udf.diff
>
>
> The Top udf has been broken for a while, due to an incorrect implementation of getArgToFuncMapping.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

