Date: Sun, 28 Sep 2014 14:05:33 +0000 (UTC)
From: "Matthew Hayes (JIRA)"
To: dev@datafu.incubator.apache.org
Subject: [jira] [Resolved] (DATAFU-68) SampleByKey can throw NullPointerException

     [ https://issues.apache.org/jira/browse/DATAFU-68?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Hayes resolved DATAFU-68.
---------------------------------

    Resolution: Fixed

> SampleByKey can throw NullPointerException
> ------------------------------------------
>
>                 Key: DATAFU-68
>                 URL: https://issues.apache.org/jira/browse/DATAFU-68
>             Project: DataFu
>          Issue Type: Bug
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>             Fix For: 1.3.0
>
>         Attachments: DATAFU-68.patch, DATAFU-68.patch
>
>
> I've noticed that {{SampleByKey}} can throw {{NullPointerException}}:
> {code}
> Caused by: java.lang.NullPointerException
>     at datafu.pig.sampling.SampleByKey.setUDFContextSignature(SampleByKey.java:86)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.setSignature(POUserFunc.java:604)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:127)
>     at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.<init>(POUserFunc.java:122)
>     at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:505)
>     at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:112)
>     at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
>     at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:220)
>     at org.apache.pig.newplan.logical.relational.LOFilter.accept(LOFilter.java:79)
>     at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
>     at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:310)
>     at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
>     at org.apache.pig.PigServer.storeEx(PigServer.java:978)
>     at org.apache.pig.PigServer.store(PigServer.java:942)
>     at org.apache.pig.Pig
> {code}
> I've reproduced the behaviour on the old 1.1.0 version, but the UDF in question has not changed much since then, so I assume that trunk is affected in the same way. The script that reproduces the issue is simple:
> {code}
> grunt> DEFINE SampleByKey datafu.pig.sampling.SampleByKey('0.5');
> grunt> data = LOAD 'datafu/input_datafu' AS (A_id:chararray, B_id:chararray, C:int);
> grunt> out = FILTER data BY SampleByKey(A_id);
> grunt> DUMP out;
> {code}
> The problem seems to be that the method {{setUDFContextSignature}} can be called with a {{null}} argument, which breaks our code. The documentation for this method does not specify whether {{null}} is allowed. I've looked into other UDFs in Pig, and they appear to handle the case where the signature is {{null}}, so I've decided to fix {{SampleByKey}} as well.
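
For readers who want to see the shape of the guard the issue describes, here is a minimal sketch. It is not the DATAFU-68 patch and not the real SampleByKey code: the class SignatureGuardSketch, its salt field, and the trim() call are all hypothetical stand-ins, and the class deliberately does not extend Pig's EvalFunc so that it compiles without Pig on the classpath. The only thing taken from the report is that setUDFContextSignature may be handed null and should tolerate it.

```java
// Hypothetical stand-in for a Pig UDF. It does NOT extend EvalFunc, so it
// compiles without Pig; it only mirrors the setter's name and parameter.
final class SignatureGuardSketch {
    // Assumed per-instance state derived from the signature (illustrative only).
    private String salt = "";

    // Same name and parameter type as the method in the stack trace.
    // The caller may pass null, so check before dereferencing.
    void setUDFContextSignature(String signature) {
        if (signature == null) {
            // No signature supplied: keep whatever state we already have.
            return;
        }
        // Any dereference of the signature is safe only past the null check.
        salt = signature.trim();
    }

    String getSalt() {
        return salt;
    }

    public static void main(String[] args) {
        SignatureGuardSketch udf = new SignatureGuardSketch();
        udf.setUDFContextSignature(null);      // tolerated, state unchanged
        System.out.println("after null:  '" + udf.getSalt() + "'");
        udf.setUDFContextSignature(" sig-1 "); // normal path
        System.out.println("after value: '" + udf.getSalt() + "'");
    }
}
```

Without the null check, the trim() call would throw NullPointerException on a null signature, which is the same class of failure shown in the trace above. Whether to ignore a null signature or fall back to a default is a design choice; this sketch simply ignores it.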