pig-dev mailing list archives

From "Koji Noguchi (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-5224) Extra foreach from ColumnPrune preventing Accumulator usage
Date Thu, 13 Apr 2017 21:15:41 GMT

     [ https://issues.apache.org/jira/browse/PIG-5224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Koji Noguchi updated PIG-5224:
------------------------------
    Attachment: pig-5224-v0-testonly.patch

Attaching a slightly modified test that reproduces the issue.
The test fails with:
{noformat}
Caused by: java.io.IOException: exec() should not be called.
    at org.apache.pig.test.utils.AccumulatorBagCount.exec(AccumulatorBagCount.java:56)
    at org.apache.pig.test.utils.AccumulatorBagCount.exec(AccumulatorBagCount.java:28)
    at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:326)
{noformat}

[~rohini] pointed out that the Accumulator interface was not used because ColumnPrune inserted
an extra ForEach between relations 'C' and 'D' to drop the field 'group'. As the stack trace
shows, the UDF's exec() was called instead.
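
For readers unfamiliar with why the test throws, here is a minimal sketch of the idea behind a UDF like AccumulatorBagCount. It does not use Pig's classes: SketchAccumulator, BagCountSketch, runAccumulated and runRegular are hypothetical names, and the interface is a simplified stand-in for the accumulate/getValue/cleanup contract (Pig's real interface operates on Tuples and bags). The point is only that such a UDF can make exec() throw so that a test notices when the accumulative path is skipped.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class AccumulatorSketch {
    // Hypothetical, simplified stand-in for an accumulate/getValue/cleanup contract.
    // This is NOT org.apache.pig.Accumulator; batches are plain lists here.
    interface SketchAccumulator<T> {
        void accumulate(List<Integer> batch);
        T getValue();
        void cleanup();
    }

    // Counts bag elements and refuses to run through the non-accumulative path.
    static class BagCountSketch implements SketchAccumulator<Long> {
        private long count = 0;

        // Regular entry point: fails loudly so a test notices when it is used.
        Long exec(List<Integer> wholeBag) throws IOException {
            throw new IOException("exec() should not be called.");
        }

        @Override
        public void accumulate(List<Integer> batch) {
            count += batch.size();
        }

        @Override
        public Long getValue() {
            return count;
        }

        @Override
        public void cleanup() {
            count = 0;
        }
    }

    // Accumulative path: feed the bag in batches, then read the result.
    static long runAccumulated(List<List<Integer>> batches) {
        BagCountSketch udf = new BagCountSketch();
        for (List<Integer> batch : batches) {
            udf.accumulate(batch);
        }
        long result = udf.getValue();
        udf.cleanup();
        return result;
    }

    // Regular path: hand over the whole bag at once and report the failure message.
    static String runRegular(List<Integer> wholeBag) {
        try {
            return String.valueOf(new BagCountSketch().exec(wholeBag));
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAccumulated(Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3))));
        System.out.println(runRegular(Arrays.asList(1, 2, 3)));
    }
}
```

In this sketch the accumulative path returns the count, while the regular path surfaces the same message seen in the stack trace above. How Pig itself decides between the two paths is not modeled here.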

> Extra foreach from ColumnPrune preventing Accumulator usage
> -----------------------------------------------------------
>
>                 Key: PIG-5224
>                 URL: https://issues.apache.org/jira/browse/PIG-5224
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Koji Noguchi
>         Attachments: pig-5224-v0-testonly.patch
>
>
> {code}
> A = load 'input' as (id:int, fruit);
> B = foreach A generate id; -- to enable columnprune
> C = group B by id;
> D = foreach C {
>     o = order B by id;
>     generate org.apache.pig.test.utils.AccumulatorBagCount(o);
> }
> STORE D into ...
> {code}
> Pig fails to use the Accumulator interface for this UDF.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
