hadoop-pig-dev mailing list archives

From "Viraj Bhat (JIRA)" <j...@apache.org>
Subject [jira] Created: (PIG-1378) har url not usable in Pig scripts
Date Wed, 14 Apr 2010 22:20:50 GMT
har url not usable in Pig scripts
---------------------------------

                 Key: PIG-1378
                 URL: https://issues.apache.org/jira/browse/PIG-1378
             Project: Pig
          Issue Type: Bug
          Components: impl
    Affects Versions: 0.7.0
            Reporter: Viraj Bhat
             Fix For: 0.7.0


I am trying to use har (Hadoop Archives) in my Pig script.

I can access them through the HDFS shell:
{noformat}
$ hadoop fs -ls 'har:///user/viraj/project/subproject/files/size/data'
Found 1 items
-rw-------   5 viraj users    1537234 2010-04-14 09:49 user/viraj/project/subproject/files/size/data/part-00001
{noformat}

Using a similar URL in grunt fails with the error below:
{noformat}
grunt> a = load 'har:///user/viraj/project/subproject/files/size/data'; 
grunt> dump a;
{noformat}


{noformat}
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
2010-04-14 22:08:48,814 [main] WARN  org.apache.pig.tools.grunt.Grunt - There is no log file to write to.
2010-04-14 22:08:48,814 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.lang.Error: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1483)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1245)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:911)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:700)
        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1164)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1114)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:425)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:737)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:162)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:357)
Caused by: org.apache.pig.impl.logicalLayer.FrontendException: ERROR 0: Incompatible file URI scheme: har : hdfs
        at org.apache.pig.LoadFunc.getAbsolutePath(LoadFunc.java:249)
        at org.apache.pig.LoadFunc.relativeToAbsolutePath(LoadFunc.java:62)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.LoadClause(QueryParser.java:1472)
        ... 13 more
{noformat}

Following the original description of PIG-1234 (http://issues.apache.org/jira/browse/PIG-1234),
I then tried a URL that includes the namenode location:

{noformat}
grunt> a = load 'har://namenode-location/user/viraj/project/subproject/files/size/data';

grunt> dump a;
{noformat}

{noformat}
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2118: Unable to create input splits for: har://namenode-location/user/viraj/project/subproject/files/size/data';

        ... 8 more
Caused by: java.io.IOException: No FileSystem for scheme: mithrilgold
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1375)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
        at org.apache.hadoop.fs.HarFileSystem.initialize(HarFileSystem.java:104)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:193)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:208)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
        at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:246)
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:245)
{noformat}
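
Note on the second trace: it appears consistent with the har URL convention in which the authority is written as <underlying-scheme>-<host>[:port], so the text before the first hyphen of the host gets treated as the underlying filesystem scheme. The Python sketch below only imitates that convention to illustrate the failure; it is not Hadoop's code, `split_har_authority` is a hypothetical helper, and the `hdfs-` prefix and port 8020 in the second URL are assumptions, not values from this report.

```python
from urllib.parse import urlsplit


def split_har_authority(har_url):
    """Hypothetical helper: split a har:// authority into
    (underlying_scheme, underlying_host) using the
    '<scheme>-<host>[:port]' convention."""
    parts = urlsplit(har_url)
    if parts.scheme != "har":
        raise ValueError("not a har URL: " + har_url)
    authority = parts.netloc
    if not authority:
        # har:///path form: no authority, so the default filesystem is implied.
        return None, None
    # Everything before the first hyphen is read as the underlying scheme.
    scheme, _, host = authority.partition("-")
    return scheme, host or None


# The URL from this report: 'namenode' is read as a filesystem scheme.
print(split_har_authority(
    "har://namenode-location/user/viraj/project/subproject/files/size/data"))

# Assumed form with an explicit underlying scheme (hdfs) and port (8020).
print(split_har_authority(
    "har://hdfs-namenode-location:8020/user/viraj/project/subproject/files/size/data"))
```

Under this reading, a host name that does not begin with a known scheme such as hdfs would produce a "No FileSystem for scheme" error naming the leading part of the host, which matches the shape of the exception above.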

Viraj

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
