pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4458) Support UDFs in a FOREACH Before a Merge Join
Date Wed, 25 Mar 2015 20:06:53 GMT

     [ https://issues.apache.org/jira/browse/PIG-4458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohini Palaniswamy updated PIG-4458:
------------------------------------
    Attachment: PIG-4458-FixTestFailure.patch

TestMergeJoin.testMergeJoinWithUDF fails because it refers to the piggybank ABS UDF instead of
the builtin one. The attached patch fixes it.

> Support UDFs in a FOREACH Before a Merge Join
> ---------------------------------------------
>
>                 Key: PIG-4458
>                 URL: https://issues.apache.org/jira/browse/PIG-4458
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: William Watson
>            Assignee: William Watson
>             Fix For: 0.15.0
>
>         Attachments: PIG-4458-FixTestFailure.patch, PIG-4458.04.remove-merge-join-udf-restriction.patch, PIG-4458.05.remove-merge-join-udf-restriction.patch
>
>
> Right now, the MapSideMergeValidator outright rejects any foreach that has a UDF in it:
> {code}
> private boolean isAcceptableForEachOp(Operator lo) throws LogicalToPhysicalTranslatorException {
>     if (lo instanceof LOForEach) {
>         OperatorPlan innerPlan = ((LOForEach) lo).getInnerPlan();
>         validateMapSideMerge(innerPlan.getSinks(), innerPlan);
>         return !containsUDFs((LOForEach) lo);
>     } else {
>         return false;
>     }
> }
> {code}
> There is a TODO for this later on in that same class (inside containsUDFs):
> {code}
> // TODO (dvryaboy): in the future we could relax this rule by tracing what fields
> // are being passed into the UDF, and only refusing if the UDF is working on the
> // join key. Transforms of other fields should be ok.
> {code}
> We should either implement the TODO and relax this requirement, or remove the restriction altogether.
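
The relaxed rule described in the TODO could be sketched roughly as follows. This is a simplified illustration, not Pig's implementation: `RelaxedMergeJoinCheck`, `Expr`, `udfTouchesJoinKey`, and `isAcceptableForEach` are hypothetical names, and fields are modeled as plain strings rather than logical plan operators. The assumption is the one stated in the TODO: a UDF applied to a non-key field should be fine, so only a UDF that reads a join key field is refused.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these names exist in Pig.
class RelaxedMergeJoinCheck {

    /** Stand-in for one generated expression in a FOREACH. */
    static final class Expr {
        final boolean isUdf;              // true if the expression is a UDF call
        final List<String> inputFields;   // names of the input fields it reads

        Expr(boolean isUdf, String... inputFields) {
            this.isUdf = isUdf;
            this.inputFields = Arrays.asList(inputFields);
        }
    }

    /** Returns true if any UDF expression reads at least one join key field. */
    static boolean udfTouchesJoinKey(List<Expr> exprs, Set<String> joinKeys) {
        for (Expr e : exprs) {
            if (!e.isUdf) {
                continue; // plain projections are not restricted by this check
            }
            for (String field : e.inputFields) {
                if (joinKeys.contains(field)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Relaxed rule: refuse the FOREACH only when a UDF works on a join key. */
    static boolean isAcceptableForEach(List<Expr> exprs, Set<String> joinKeys) {
        return !udfTouchesJoinKey(exprs, joinKeys);
    }

    public static void main(String[] args) {
        Set<String> joinKeys = Collections.singleton("id");

        // UDF applied to a non-key field: accepted under the relaxed rule.
        List<Expr> udfOnValue = Arrays.asList(new Expr(false, "id"), new Expr(true, "amount"));
        // UDF applied to the join key: still refused.
        List<Expr> udfOnKey = Arrays.asList(new Expr(true, "id"), new Expr(false, "amount"));

        System.out.println(isAcceptableForEach(udfOnValue, joinKeys)); // true
        System.out.println(isAcceptableForEach(udfOnKey, joinKeys));   // false
    }
}
```

The alternative mentioned in the description, removing the restriction altogether, would correspond to `isAcceptableForEach` ignoring UDFs entirely; the sketch above shows only the narrower, key-aware variant.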



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
