datafu-dev mailing list archives

From "Matthew Hayes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DATAFU-68) SampleByKey can throw NullPointerException
Date Sun, 28 Sep 2014 14:05:33 GMT

    [ https://issues.apache.org/jira/browse/DATAFU-68?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151110#comment-14151110 ]

Matthew Hayes commented on DATAFU-68:
-------------------------------------

Looks good to me :)  Pushed the commit.

> SampleByKey can throw NullPointerException
> ------------------------------------------
>
>                 Key: DATAFU-68
>                 URL: https://issues.apache.org/jira/browse/DATAFU-68
>             Project: DataFu
>          Issue Type: Bug
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.3.0
>
>         Attachments: DATAFU-68.patch, DATAFU-68.patch
>
>
> I've noticed that {{SampleByKey}} can throw {{NullPointerException}}:
> {code}
> Caused by: java.lang.NullPointerException
> 	at datafu.pig.sampling.SampleByKey.setUDFContextSignature(SampleByKey.java:86)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.setSignature(POUserFunc.java:604)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:127)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.<init>(POUserFunc.java:122)
> 	at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:505)
> 	at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:112)
> 	at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
> 	at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:220)
> 	at org.apache.pig.newplan.logical.relational.LOFilter.accept(LOFilter.java:79)
> 	at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
> 	at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
> 	at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:310)
> 	at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
> 	at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
> 	at org.apache.pig.PigServer.storeEx(PigServer.java:978)
> 	at org.apache.pig.PigServer.store(PigServer.java:942)
> 	at org.apache.pig.Pig
> {code}
> I've reproduced the behaviour on the old 1.1.0 version, but the UDF in question has not changed
> much since then, so I assume that trunk is affected in the same way. The script that
> reproduces the issue is simple:
> {code}
> grunt> DEFINE SampleByKey datafu.pig.sampling.SampleByKey('0.5'); 
> grunt> data = LOAD 'datafu/input_datafu' AS (A_id:chararray, B_id:chararray, C:int);
> grunt> out = FILTER data BY SampleByKey(A_id); 
> grunt> DUMP out;
> {code}
> The problem seems to be that the method {{setUDFContextSignature}} can be called with a {{null}}
> argument, which breaks our code. The documentation for this method does not specify whether {{null}}
> is allowed. I've looked at other UDFs in Pig, and they appear to handle
> the case when the signature is {{null}}, so I've decided to fix {{SampleByKey}} as well.
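
The null guard described above might look roughly like the following. This is a minimal sketch, not the attached patch: the class name NullSafeSignatureSketch, the DEFAULT_SIGNATURE fallback, and the getSignature accessor are hypothetical. The class deliberately does not extend Pig's EvalFunc so that it compiles without Pig on the classpath. Only the method name and its single String parameter follow setUDFContextSignature as named in the issue.

```java
// Hypothetical stand-in for a Pig UDF such as SampleByKey. It only mirrors the
// shape of setUDFContextSignature(String); it is not the actual DataFu code.
class NullSafeSignatureSketch {
    // Assumed fallback used when Pig passes no signature (illustrative only).
    private static final String DEFAULT_SIGNATURE = "default";

    private String signature = DEFAULT_SIGNATURE;

    // Pig may call this with null, so the argument is checked before use.
    public void setUDFContextSignature(String signature) {
        if (signature == null) {
            // Keep the fallback instead of dereferencing a null signature.
            return;
        }
        this.signature = signature;
    }

    public String getSignature() {
        return signature;
    }

    public static void main(String[] args) {
        NullSafeSignatureSketch udf = new NullSafeSignatureSketch();
        udf.setUDFContextSignature(null);    // no exception is thrown
        System.out.println(udf.getSignature());
        udf.setUDFContextSignature("sig-1"); // a non-null signature replaces the fallback
        System.out.println(udf.getSignature());
    }
}
```

Whether a null signature should be ignored, replaced with a fallback, or handled in some other way depends on how the UDF uses the signature afterwards. The sketch shows only the guard itself.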



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
