hadoop-pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (PIG-546) FilterFunc calls empty constructor when it should be calling parameterized constructor
Date Thu, 12 Nov 2009 02:00:40 GMT

     [ https://issues.apache.org/jira/browse/PIG-546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Gates reassigned PIG-546:
------------------------------

    Assignee: Santhosh Srinivasan

> FilterFunc calls empty constructor when it should be calling parameterized constructor
> --------------------------------------------------------------------------------------
>
>                 Key: PIG-546
>                 URL: https://issues.apache.org/jira/browse/PIG-546
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>            Reporter: Viraj Bhat
>            Assignee: Santhosh Srinivasan
>             Fix For: 0.2.0
>
>         Attachments: FILTERFROMFILE.java, insetfilterfile, mydata.txt, PIG-546.patch
>
>
> The following Pig script uses a custom UDF, FILTERFROMFILE, which extends FilterFunc.
It contains two constructors: an empty constructor, which is mandatory, and a parameterized
constructor. The parameterized constructor receives the HDFS filename, which the exec function
uses to construct a HashMap. The HashMap is later used to filter records based on the match
criteria in the HDFS file.
> {code}
> register util.jar;
> --util.jar contains the FILTERFROMFILE class
> define FILTER_CRITERION util.FILTERFROMFILE('/user/viraj/insetfilterfile');
> RAW_LOGS = load 'mydata.txt' as (url:chararray, numvisits:int);
> FILTERED_LOGS = filter RAW_LOGS by FILTER_CRITERION(numvisits);
> dump FILTERED_LOGS;
> {code}
> When you execute the above script, it results in a single map-only job with one map task.
The empty constructor appears to be called 5 times, which ultimately causes the job to
fail.
> ===========================================
> parameterized constructor: /user/viraj/insetfilterfile
> parameterized constructor: /user/viraj/insetfilterfile
> empty constructor
> empty constructor
> empty constructor
> empty constructor
> empty constructor
> ===========================================
> Error in the Hadoop backend
> ===========================================
> java.lang.IllegalArgumentException: Can not create a Path from an empty string
> 	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
> 	at org.apache.hadoop.fs.Path.<init>(Path.java:90)
> 	at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:199)
> 	at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:130)
> 	at org.apache.pig.impl.io.FileLocalizer.openDFSFile(FileLocalizer.java:164)
> 	at util.FILTERFROMFILE.init(FILTERFROMFILE.java:70)
> 	at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:89)
> 	at util.FILTERFROMFILE.exec(FILTERFROMFILE.java:52)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:179)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:217)
> 	at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POFilter.getNext(POFilter.java:148)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:170)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:158)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
> 	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2209)
> ===========================================
> Attaching the sample data and the filter function UDF.
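
As an illustration of the symptom only, and not a diagnosis of Pig's code: if a framework re-creates a function object from its class alone, the argument given to the parameterized constructor is lost and the instance falls back to whatever the empty constructor sets. The sketch below uses plain Java reflection. The names ConstructorDemo, FilterFromFileSketch, and instantiate are hypothetical and appear neither in Pig nor in the attached UDF.

```java
import java.lang.reflect.Constructor;

public class ConstructorDemo {

    // Hypothetical stand-in for the UDF: it only records the path it was given.
    public static class FilterFromFileSketch {
        private final String path;

        // Empty constructor: no filename is available here.
        public FilterFromFileSketch() {
            this.path = "";
        }

        // Parameterized constructor: receives the filename.
        public FilterFromFileSketch(String path) {
            this.path = path;
        }

        public String getPath() {
            return path;
        }
    }

    // Re-creates an instance from the class alone; any constructor argument is lost.
    public static Object instantiate(Class<?> cls) throws ReflectiveOperationException {
        return cls.getDeclaredConstructor().newInstance();
    }

    // Re-creates an instance from the class plus its recorded String argument.
    public static Object instantiate(Class<?> cls, String arg) throws ReflectiveOperationException {
        Constructor<?> ctor = cls.getDeclaredConstructor(String.class);
        return ctor.newInstance(arg);
    }

    public static void main(String[] args) throws Exception {
        FilterFromFileSketch viaEmpty =
                (FilterFromFileSketch) instantiate(FilterFromFileSketch.class);
        FilterFromFileSketch viaParam =
                (FilterFromFileSketch) instantiate(FilterFromFileSketch.class,
                        "/user/viraj/insetfilterfile");

        System.out.println("empty constructor path: [" + viaEmpty.getPath() + "]");
        System.out.println("parameterized constructor path: [" + viaParam.getPath() + "]");
    }
}
```

In this sketch the instance built through the empty constructor carries an empty path. That is the kind of value which would make a later file-open call fail with "Can not create a Path from an empty string", as in the stack trace above.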

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

