pig-dev mailing list archives

From "Moritz Moeller (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (PIG-3246) not possible to use remote filesystems (S3) in a pig script
Date Thu, 14 Mar 2013 19:44:13 GMT

     [ https://issues.apache.org/jira/browse/PIG-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Moritz Moeller resolved PIG-3246.
---------------------------------

    Resolution: Invalid
    
> not possible to use remote filesystems (S3) in a pig script
> -----------------------------------------------------------
>
>                 Key: PIG-3246
>                 URL: https://issues.apache.org/jira/browse/PIG-3246
>             Project: Pig
>          Issue Type: Bug
>         Environment: Apache Pig version 0.10.0-cdh4.2.0 (rexported)
> Hadoop 2.0.0-cdh4.2.0
>            Reporter: Moritz Moeller
>
> My Hadoop cluster is configured using hdfs://namenode/; both hdfs dfs and Pig scripts work fine.
> Now I want to read data from S3. hdfs dfs -ls s3n://mybucket/file.csv works fine.
> A Pig script doing LOAD 's3n://mybucket/test.csv', however, fails. It looks as if Pig is performing the LOAD request using an HDFS FileSystem.
> I wasn't sure whether to mark this as a bug or an improvement, as I do not know whether this should be possible. But as it is a basic feature of Hadoop, I guess it should work in Pig, too.
> org.apache.pig.backend.executionengine.ExecException: ERROR 2118: java.net.UnknownHostException: mybucket
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:288)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:452)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:469)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1218)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1215)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1215)
> 	at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:336)
> 	at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:233)
> 	at java.lang.Thread.run(Thread.java:722)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
> Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: sdfa
> 	at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:414)
> 	at org.apache.hadoop.security.SecurityUtil.buildDTServiceName(SecurityUtil.java:295)
> 	at org.apache.hadoop.fs.FileSystem.getCanonicalServiceName(FileSystem.java:247)
> 	at org.apache.hadoop.fs.FileSystem.collectDelegationTokens(FileSystem.java:468)
> 	at org.apache.hadoop.fs.FileSystem.addDelegationTokens(FileSystem.java:452)
> 	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:121)
> 	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
> 	at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
> 	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:205)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigTextInputFormat.listStatus(PigTextInputFormat.java:36)
> 	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:269)
> 	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
> 	... 13 more
> Caused by: java.net.UnknownHostException: mybucket
> 	... 25 more
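
One reading of the trace above: the exception is raised while a delegation token service name is being built (SecurityUtil.buildTokenService, reached via FileSystem.getCanonicalServiceName), and the name it fails to resolve is the bucket. That would be consistent with the authority part of the s3n:// URI being treated as a hostname. The following is a minimal Python sketch of that URI split only; it is not Hadoop or Pig code, and the helper name authority_of is hypothetical.

```python
from urllib.parse import urlparse


def authority_of(uri: str) -> str:
    """Return the authority component of a filesystem-style URI.

    Hypothetical helper for illustration only; it mirrors the generic
    scheme://authority/path split and nothing Hadoop-specific.
    """
    return urlparse(uri).netloc


# For an HDFS URI the authority is a real host (the namenode).
print(authority_of("hdfs://namenode/"))

# For the S3 URI from the report the authority is the bucket name,
# which is not a DNS hostname and so cannot be resolved as one.
print(authority_of("s3n://mybucket/test.csv"))
```

If that reading is right, any code path that looks up the URI authority in DNS would fail for a bucket name in the same way, regardless of which tool submitted the job.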

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
