Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Received-SPF: pass (nike.apache.org: message received from 54.76.25.247 which
 is an MX secondary for user@hadoop.apache.org)
MIME-Version: 1.0
Date: Mon, 20 Apr 2015 09:17:06 -0400
Message-ID: 
 <CA+XUwYyCFqD6gRrUT75o6LDP9PFa2Wsymzx6UCQKTJeRvM43Gg@mail.gmail.com>
Subject: Unable to Find S3N Filesystem Hadoop 2.6
From: Billy Watson <williamrwatson@gmail.com>
To: "user@hadoop.apache.org" <user@hadoop.apache.org>
Content-Type: multipart/alternative; boundary=001a11c330d029a282051427bf06

--001a11c330d029a282051427bf06
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Hi,

I can run `hadoop fs -ls s3n://my-s3-bucket` from the command line
without issue. I have set some options in hadoop-env.sh so that the S3
support in Hadoop 2.6 is set up correctly. (This was very confusing, BTW,
and IMHO there is not enough searchable documentation on the S3 changes in
Hadoop 2.6.)

Anyway, when I run a Pig job that accesses S3, it gets to 16% and then
fails, not in Pig itself but in MapReduce, with "Error:
java.io.IOException: No FileSystem for scheme: s3n".

I have added [hadoop-install-loc]/lib and
[hadoop-install-loc]/share/hadoop/tools/lib/ to the HADOOP_CLASSPATH env
variable in hadoop-env.sh.erb. When I do not do this, the Pig job fails
at 0% (before it ever gets to MapReduce) with a very similar "No filesystem
for scheme s3n" error.
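
Roughly, the HADOOP_CLASSPATH change described above looks like the
following sketch. HADOOP_INSTALL is a hypothetical stand-in for
[hadoop-install-loc], the /opt/hadoop default is only an example, and the
trailing /* on each entry is an assumption (a Java classpath entry ending
in /* matches every jar in that directory).

```shell
# Sketch of the hadoop-env.sh addition; not the exact template contents.
# HADOOP_INSTALL is a hypothetical placeholder for [hadoop-install-loc].
HADOOP_INSTALL="${HADOOP_INSTALL:-/opt/hadoop}"

# Assumption: use dir/* so the JVM picks up the jars inside each directory.
# The values stay quoted so the shell does not expand the wildcard itself.
EXTRA_LIB="${HADOOP_INSTALL}/lib/*"
TOOLS_LIB="${HADOOP_INSTALL}/share/hadoop/tools/lib/*"

# Append to any existing value instead of overwriting it.
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}${EXTRA_LIB}:${TOOLS_LIB}"
```

As far as I can tell, this only affects JVMs started through the hadoop
scripts that source hadoop-env.sh. The YarnChild frames in the stack trace
below suggest the task JVMs are where the s3n class is still not found.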

I feel like at this point I just have to add the share/hadoop/tools/lib
directory (and maybe lib) to the right environment variable, but I can't
figure out which environment variable that should be.

I appreciate any help, thanks!!


Stack trace:
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:498)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:467)
at org.apache.pig.piggybank.storage.CSVExcelStorage.setLocation(CSVExcelStorage.java:609)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:129)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:103)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:512)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


-- Billy Watson

--=20
William Watson
Software Engineer
(904) 705-7056 PCS

--001a11c330d029a282051427bf06--