Subject: Re: Unable to Find S3N Filesystem Hadoop 2.6
From: Jonathan Aquilina <jaquilina@eagleeyet.net>
To: user@hadoop.apache.org
Date: Mon, 20 Apr 2015 15:58:28 +0200

You mention an environment variable. In the step before you specify the steps to run to get to the result, you can specify a bash script that will let you put any 3rd-party jar files on the cluster (for us that was the ESRI jars) and propagate them to all nodes in the cluster as well. You can ping me off-list if you need further help. The thing is, I haven't used Pig, but my boss and coworker wrote the mappers and reducers; getting those jars to the entire cluster was a very small and simple bash script.
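The jar-distribution script itself isn't included in the mail; below is a minimal sketch of that kind of script. Nothing in it comes from the thread: the host list (nodes.txt), the jar directory, and the destination path are all hypothetical placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the kind of jar-sync script described above.
# The host list (nodes.txt), jar directory, and destination directory
# are illustrative, not taken from the original mail.

sync_jars() {
  local jar_dir="${1:-./thirdparty-jars}"   # 3rd-party jars (e.g. ESRI)
  local nodes_file="${2:-nodes.txt}"        # one hostname per line
  local dest="${3:-/usr/local/hadoop/share/hadoop/common/lib}"

  [ -f "$nodes_file" ] || { echo "no node list at $nodes_file; nothing to do"; return 0; }

  while read -r node; do
    [ -n "$node" ] || continue              # skip blank lines
    for jar in "$jar_dir"/*.jar; do
      [ -e "$jar" ] || continue             # no jars present in jar_dir
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp $jar $node:$dest/"        # print instead of copying
      else
        scp "$jar" "$node:$dest/"           # push the jar to this node
      fi
    done
  done < "$nodes_file"
}

sync_jars "$@"
```

With DRY_RUN=1 it only prints the copy commands, which makes it easy to check what would be pushed before touching the cluster.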

 

---
Regards,
Jonathan Aquilina
Founder, Eagle Eye T

On 2015-04-20 15:17, Billy Watson wrote:

Hi,

I am able to run `hadoop fs -ls s3n://my-s3-bucket` from the command line without issue. I have set some options in hadoop-env.sh to make sure all the S3 stuff for Hadoop 2.6 is set up correctly. (This was very confusing, BTW, and there is not enough searchable documentation on the changes to the S3 stuff in Hadoop 2.6, IMHO.)

Anyways, when I run a pig job which accesses s3, it gets to 16%, does not fail in pig, but rather fails in mapreduce with "Error: java.io.IOException: No FileSystem for scheme: s3n."

I have added [hadoop-install-loc]/lib and [hadoop-install-loc]/share/hadoop/tools/lib/ to the HADOOP_CLASSPATH env variable in hadoop-env.sh.erb. When I do not do this, the pig job will fail at 0% (before it ever gets to mapreduce) with a very similar "No filesystem for scheme s3n" error.
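For reference, the hadoop-env.sh addition described above might look like the following sketch. The /usr/local/hadoop prefix is purely illustrative (the mail keeps it as [hadoop-install-loc]); the trailing /* is what makes Java pick up the individual jars inside the directory rather than the directory itself.

```shell
# hadoop-env.sh — sketch of the classpath additions described above;
# the /usr/local/hadoop install prefix is illustrative, not from the mail.
HADOOP_PREFIX="${HADOOP_PREFIX:-/usr/local/hadoop}"
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH:-}:$HADOOP_PREFIX/lib:$HADOOP_PREFIX/share/hadoop/tools/lib/*"
```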

I feel like at this point I just have to add the share/hadoop/tools/lib directory (and maybe lib) to the right environment variable, but I can't figure out which environment variable that should be.
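One commonly suggested route for this symptom (it does not come from this thread) is to extend the task-side classpath rather than the client-side one: HADOOP_CLASSPATH is read by the client JVM, but the YARN containers that run the map tasks build their classpath from mapreduce.application.classpath. A sketch of the mapred-site.xml entry, with illustrative paths:

```xml
<!-- mapred-site.xml — sketch of a commonly suggested fix, not taken
     from this thread. In Hadoop 2.6 the hadoop-aws jar (which holds
     NativeS3FileSystem) lives under share/hadoop/tools/lib, which is
     not on the default task classpath. -->
<property>
  <name>mapreduce.application.classpath</name>
  <value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*,$HADOOP_MAPRED_HOME/share/hadoop/tools/lib/*</value>
</property>
```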

I appreciate any help, thanks!!


Stack trace:
org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
  at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:498)
  at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.setInputPaths(FileInputFormat.java:467)
  at org.apache.pig.piggybank.storage.CSVExcelStorage.setLocation(CSVExcelStorage.java:609)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.mergeSplitSpecificConf(PigInputFormat.java:129)
  at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:103)
  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:512)
  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
  at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:415)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)


— Billy Watson

--

William Watson
Software Engineer
(904) 705-7056 PCS