hadoop-pig-dev mailing list archives

From "Viraj Bhat (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-974) Issues with mv command when used after store when using -param_file/-param options
Date Thu, 24 Sep 2009 01:29:16 GMT

    [ https://issues.apache.org/jira/browse/PIG-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12758962#action_12758962 ]

Viraj Bhat commented on PIG-974:
--------------------------------

It turns out that the problem was caused by the single quotes around the parameters in the mv command:
{code}
mv '$finalop' '$finalmove';
{code}

This modified version of the script should work:
{code}
mv $finalop $finalmove;
{code}

The hard part is knowing when parameters need single quotes and when they must not have them. This is not documented in the manual.
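
For anyone hitting the same thing, here is a minimal Python sketch of why the quotes matter. It assumes parameter substitution is a purely textual replacement and that mv takes its arguments literally, quotes included. The functions substitute_params and mv_source_path are hypothetical models for illustration only, not Pig's actual code.

```python
import re


def substitute_params(script_line, params):
    """Replace each $name with its value, purely textually (quotes are left alone)."""
    return re.sub(r"\$(\w+)", lambda m: params[m.group(1)], script_line)


def mv_source_path(line):
    """Hypothetical model of mv: split on whitespace and take the first argument as-is."""
    return line.rstrip(";").split()[1]


params = {
    "finalop": "/user/viraj/finaloutput",
    "finalmove": "/user/viraj/finalmove",
}

quoted = substitute_params("mv '$finalop' '$finalmove';", params)
unquoted = substitute_params("mv $finalop $finalmove;", params)

# In this model the quoted form keeps the quote characters in the path,
# so the lookup would be for a name that includes them.
print(mv_source_path(quoted))
print(mv_source_path(unquoted))
```

Under these assumptions the quoted form yields a path that still carries the single quotes, which would explain why the move fails even though the directory exists.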

The error message is also confusing:
===========================================================
java.io.IOException: File or directory '/user/viraj/finaloutput' does not exist.
===========================================================

I thought that the single quotes around the file name in the error message were just part of the message formatting, and that the file name itself was correct.

{code}
$shell>hadoop fs -ls '/user/viraj/finaloutput' 
Found 1 items
-rw-------   3 viraj users        420 2009-09-24 01:16 /user/viraj/finaloutput/part-00000
{code}
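
A likely reason this check is misleading: a POSIX shell consumes the quotes before hadoop ever sees the argument, so the quoted and unquoted forms are the same path there. A small sketch using Python's shlex module, which follows POSIX shell quoting rules, shows the argument the command would receive (the command string is copied from the check above).

```python
import shlex

# The command as typed at the shell prompt in the check above.
command = "hadoop fs -ls '/user/viraj/finaloutput'"

# shlex.split applies POSIX shell quoting rules, so the quotes are consumed
# and only the bare path is left as the last argument.
argv = shlex.split(command)
print(argv[-1])
```

So a successful listing from the shell says nothing about whether a path containing literal quote characters exists.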

Thanks,
Viraj

> Issues with mv command when used after store when using -param_file/-param options
> ----------------------------------------------------------------------------------
>
>                 Key: PIG-974
>                 URL: https://issues.apache.org/jira/browse/PIG-974
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: 0.6.0
>         Environment: Hadoop 18 and 20
>            Reporter: Viraj Bhat
>             Fix For: 0.6.0
>
>         Attachments: studenttab10k
>
>
> I have a Pig script which moves the final output to another HDFS directory to signal completion, so that another Pig script can start working on these results.
> {code}
> studenttab = LOAD '/user/viraj/studenttab10k' AS (name:chararray, age:int,gpa:float);
> X = GROUP studenttab by age;
> Y = FOREACH X GENERATE group, COUNT(studenttab);
> store Y into '$finalop' using PigStorage();
> mv '$finalop' '$finalmove';
> {code}
> where "finalop" and "finalmove" are parameters used for storing the intermediate and final results.
> I run this script like this:
> {code}
> $shell> java -cp pig20.jar:/path/tohadoop/site.xml -Dmapred.job.queue.name=default org.apache.pig.Main -M -param finalop=/user/viraj/finaloutput -param finalmove=/user/viraj/finalmove testmove.pig
> {code}
> or using the param_file option
> {code}
> $shell> java -cp pig20.jar:/path/tohadoop/site.xml -Dmapred.job.queue.name=default org.apache.pig.Main -M -param_file moveparamfile testmove.pig
> {code}
> ================================================================================
> The underlying Map Reduce jobs run well but the move command seems to be failing:
> ================================================================================
> 2009-09-23 23:26:21,781 [main] INFO  org.apache.pig.Main - Logging error messages to: /homes/viraj/pigscripts/pig_1253748381778.log
> 2009-09-23 23:26:21,963 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
> 2009-09-23 23:26:22,227 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: localhost:50300
> 2009-09-23 23:26:27,187 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
> 2009-09-23 23:26:27,203 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
> 2009-09-23 23:26:27,203 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
> 2009-09-23 23:26:28,828 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
> 2009-09-23 23:26:29,423 [Thread-9] WARN  org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 2009-09-23 23:26:29,478 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
> 2009-09-23 23:27:29,828 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2009-09-23 23:27:59,764 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
> 2009-09-23 23:28:57,249 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
> 2009-09-23 23:28:57,249 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Successfully stored result in: "/user/viraj/finaloutput"
> 2009-09-23 23:28:57,267 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Records written : 60
> 2009-09-23 23:28:57,267 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Bytes written : 420
> 2009-09-23 23:28:57,267 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
> 2009-09-23 23:28:57,367 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. File or directory '/user/viraj/finaloutput' does not exist.
> Details at logfile: /homes/viraj/pigscripts/pig_1253748381778.log
> ================================================================================
> {code}
> $shell> hadoop fs -ls /user/viraj/finaloutput 
> Found 1 items
> -rw-------   3 viraj users        420 2009-09-23 23:42 /user/viraj/finaloutput/part-00000
> {code}
> ================================================================================
> Opening the log file:
> ================================================================================
> Pig Stack Trace
> ---------------
> ERROR 2998: Unhandled internal error. File or directory '/user/viraj/finaloutput' does not exist.
> java.io.IOException: File or directory '/user/viraj/finaloutput' does not exist.
>         at org.apache.pig.tools.grunt.GruntParser.processMove(GruntParser.java:641)
>         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:264)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
>         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>         at org.apache.pig.Main.main(Main.java:397)
> ================================================================================
> Viraj

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

