hadoop-pig-dev mailing list archives

From "David Ciemiewicz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-755) Difficult to debug parameter substitution problems based on the error messages when running in local mode
Date Wed, 08 Apr 2009 17:16:13 GMT

    [ https://issues.apache.org/jira/browse/PIG-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697106#action_12697106 ]

David Ciemiewicz commented on PIG-755:
--------------------------------------

My recommended solutions:

1)  The pig command should provide a command line option for outputting the resulting script
with all parameter substitutions applied.

This one simple mechanism would allow developers to see what is and is not happening with
parameter substitution.

I would suggest something akin to gcc --save-temps, where the preprocessed output of the parameter
substitution is saved to a file, e.g. file.pig => file.ppig.

This would be really valuable for any kind of translator to Pig such as SQL to Pig as well.

2) Any load and store statements should indicate the name of the file or directory that is being
processed.

The previous version of Pig indicated the name of not only the files appearing in load and
store statements but also the name of any temporary files created as well.

This functionality should work in local mode as well as in HDFS mode.
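
The two suggestions above can be sketched roughly as follows. This is an illustrative Python sketch only, not Pig's implementation: the function names, the two regular expressions, and the way the .ppig file name is derived are all assumptions made for the example, and it ignores escaping, %declare / %default, and parameter files.

```python
import re
from pathlib import Path

# Assumed pattern: a parameter reference is "$" followed by an identifier.
PARAM = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")
# Assumed pattern: the first quoted string after a load or store keyword.
LOAD_STORE = re.compile(r"\b(load|store)\b[^';]*'([^']*)'", re.IGNORECASE)


def substitute(script, params):
    """Replace each $name with its value; fail loudly on undefined names."""
    def repl(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"undefined parameter: ${name}")
        return params[name]
    return PARAM.sub(repl, script)


def load_store_paths(script):
    """Return (statement, path) pairs so they can be echoed to the user."""
    return [(m.group(1).lower(), m.group(2)) for m in LOAD_STORE.finditer(script)]


def save_preprocessed(script_path, params):
    """Write the substituted script next to the original as file.ppig."""
    src = Path(script_path)
    out = src.with_suffix(".ppig")
    out.write_text(substitute(src.read_text(), params))
    return out
```

For example, substitute("A = load '$infile';", {"infile": "inputfile.txt"}) returns "A = load 'inputfile.txt';", and load_store_paths on that result reports the load path inputfile.txt. Seeing that path printed before the job runs is exactly what would make a truncated or missing parameter value obvious.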

> Difficult to debug parameter substitution problems based on the error messages when running in local mode
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: PIG-755
>                 URL: https://issues.apache.org/jira/browse/PIG-755
>             Project: Pig
>          Issue Type: Bug
>          Components: grunt
>    Affects Versions: 0.3.0
>            Reporter: Viraj Bhat
>             Fix For: 0.3.0
>
>         Attachments: inputfile.txt, localparamsub.pig
>
>
> I have a script in which I do a parameter substitution for the input file. I have a use case where I find it difficult to debug based on the error messages in local mode.
> {code}
> A = load '$infile' using PigStorage() as
>      (
>        date            : chararray,
>        count           : long,
>        gmean           : double
>     );
> dump A;
> {code}
> 1) I run it in local mode with the input file in the current working directory
> {code}
> prompt  $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main -exectype local -param infile='inputfile.txt' localparamsub.pig
> {code}
> 2009-04-07 00:03:51,967 [main] ERROR org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POStore - Received error from storer function: org.apache.pig.backend.executionengine.ExecException: ERROR 2081: Unable to setup the load function.
> 2009-04-07 00:03:51,970 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - Failed jobs!!
> 2009-04-07 00:03:51,971 [main] INFO  org.apache.pig.backend.local.executionengine.LocalPigLauncher - 1 out of 1 failed!
> 2009-04-07 00:03:51,974 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias A
> ====================================================================
> Details at logfile: /home/viraj/pig-svn/trunk/pig_1239062631414.log
> ====================================================================
> ERROR 1066: Unable to open iterator for alias A
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
>         at org.apache.pig.PigServer.openIterator(PigServer.java:439)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>         at org.apache.pig.Main.main(Main.java:352)
> Caused by: java.io.IOException: Job terminated with anomalous status FAILED
>         at org.apache.pig.PigServer.openIterator(PigServer.java:433)
>         ... 5 more
> ====================================================================
> 2) I run it in map reduce mode
> {code}
> prompt  $ java -cp pig.jar:/path/to/hadoop/conf/ org.apache.pig.Main -param infile='inputfile.txt' localparamsub.pig
> {code}
> 2009-04-07 00:07:31,660 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:9000
> 2009-04-07 00:07:32,074 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:9001
> 2009-04-07 00:07:34,543 [Thread-7] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2009-04-07 00:07:39,540 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2009-04-07 00:07:39,540 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Map reduce job failed
> 2009-04-07 00:07:39,563 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2100: inputfile does not exist.
> ====================================================================
> Details at logfile: /home/viraj/pig-svn/trunk/pig_1239062851400.log
> ====================================================================
> ERROR 2100: inputfile does not exist.
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias A
>         at org.apache.pig.PigServer.openIterator(PigServer.java:439)
>         at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:193)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:99)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:88)
>         at org.apache.pig.Main.main(Main.java:352)
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias A
>         at org.apache.pig.PigServer.store(PigServer.java:470)
>         at org.apache.pig.PigServer.openIterator(PigServer.java:427)
>         ... 5 more
> Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias A
>         at org.apache.pig.PigServer.store(PigServer.java:503)
>         at org.apache.pig.PigServer.store(PigServer.java:466)
>         ... 6 more
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backend error: org.apache.pig.backend.executionengine.ExecException: ERROR 2100: inputfile does not exist.
>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:135)
> ====================================================================
> Here it is evident that the error occurred because "inputfile.txt" was truncated to "inputfile".

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

