hadoop-common-issues mailing list archives

From Björn Ramberg (JIRA) <j...@apache.org>
Subject [jira] [Created] (HADOOP-10218) Using brace glob pattern in S3 URL causes exception due to Path created with empty string
Date Fri, 10 Jan 2014 11:32:50 GMT
Björn Ramberg created HADOOP-10218:
--------------------------------------

             Summary: Using brace glob pattern in S3 URL causes exception due to Path created with empty string
                 Key: HADOOP-10218
                 URL: https://issues.apache.org/jira/browse/HADOOP-10218
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/s3
    Affects Versions: 1.2.1
            Reporter: Björn Ramberg


When using a brace glob pattern inside an S3 URL, an exception is thrown because a Path is
constructed with the empty string. The simplest reproduction case I've found is:

{code:none}
$ hadoop fs -ls 's3n://public-read-access-bucket/{foo,bar}'
ls: Can not create a Path from an empty string
{code}

It does not seem to make a difference whether any files exist that match the pattern. The
problem only seems to affect buckets with public read access; the private buckets I tried
work fine. When running through a Hadoop step, the following backtrace was produced:

{code:none}
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
	at org.apache.hadoop.fs.Path.<init>(Path.java:90)
	at org.apache.hadoop.fs.Path.<init>(Path.java:50)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.listStatus(NativeS3FileSystem.java:856)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:844)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:904)
	at org.apache.hadoop.fs.FileSystem.globPathsLevel(FileSystem.java:1082)
	at org.apache.hadoop.fs.FileSystem.globStatusInternal(FileSystem.java:1025)
	at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:989)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:215)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
	at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1017)
	at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1034)
	at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:174)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:952)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:905)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:905)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
	at rubydoop.RubydoopJobRunner.run(RubydoopJobRunner.java:29)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
	at rubydoop.RubydoopJobRunner.main(RubydoopJobRunner.java:74)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:187)
{code}

Interestingly, the following works:

{code:none}
$ hadoop fs -ls 's3n://public-read-access-bucket/{foo/,bar/}{baz,qux}'
{code}

but this fails:

{code:none}
$ hadoop fs -ls 's3n://public-read-access-bucket/{foo,bar}/{baz,qux}'
{code}
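
For comparison, both patterns should name the same four keys. The sketch below is a naive brace expander written purely for illustration; the class `BraceExpand` and its `expand` method are hypothetical names and are not Hadoop's globbing code. It only handles non-nested braces. Under that assumption, the two patterns expand to identical path lists, which suggests the difference in behaviour comes from how the glob is walked (the backtrace goes through `globPathsLevel` and `listStatus`) rather than from what it matches.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Naive, non-nested brace expansion. Illustration only, not Hadoop's implementation. */
final class BraceExpand {
    private BraceExpand() {}

    /** Expands every {a,b,...} group in the pattern, leftmost group first. */
    static List<String> expand(String pattern) {
        int open = pattern.indexOf('{');
        if (open < 0) {
            // No brace group left: the pattern is a concrete path.
            return Collections.singletonList(pattern);
        }
        int close = pattern.indexOf('}', open);
        if (close < 0) {
            throw new IllegalArgumentException("Unbalanced brace in: " + pattern);
        }
        String prefix = pattern.substring(0, open);
        String suffix = pattern.substring(close + 1);
        List<String> result = new ArrayList<>();
        // Substitute each alternative, then expand any remaining groups recursively.
        for (String alternative : pattern.substring(open + 1, close).split(",", -1)) {
            result.addAll(expand(prefix + alternative + suffix));
        }
        return result;
    }
}
```

Calling `BraceExpand.expand("{foo/,bar/}{baz,qux}")` and `BraceExpand.expand("{foo,bar}/{baz,qux}")` both yield `foo/baz`, `foo/qux`, `bar/baz`, `bar/qux`, in that order.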



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
