spark-dev mailing list archives

From Peter Rudenko <petro.rude...@gmail.com>
Subject Re: Spark 1.3.1 / Hadoop 2.6 package has broken S3 access
Date Thu, 07 May 2015 16:48:48 GMT
Yep it's a Hadoop issue: https://issues.apache.org/jira/browse/HADOOP-11863

http://mail-archives.apache.org/mod_mbox/hadoop-user/201504.mbox/%3CCA+XUwYxPxLkfhOxn1jNkoUKEQQMcPWFzvXJ=u+kP28KDEjO4GQ@mail.gmail.com%3E
http://stackoverflow.com/a/28033408/3271168


So for now you need to manually add that jar (hadoop-aws) to the classpath on hadoop-2.6.
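
For anyone scripting the workaround, here is a minimal Python sketch of the CLASSPATH step quoted below. The helper name with_jar_on_classpath is hypothetical (it is not part of Spark or Hadoop), and whether a given Spark launcher actually honors CLASSPATH is an assumption carried over from this thread, not something the sketch verifies.

```python
import os


def with_jar_on_classpath(jar_path, env=None):
    """Return a copy of `env` with `jar_path` appended to CLASSPATH.

    Hypothetical helper mirroring `export CLASSPATH=$CLASSPATH:<jar>`.
    Defaults to a copy of the current process environment.
    """
    env = dict(os.environ if env is None else env)
    # Drop empty entries so an unset CLASSPATH does not produce a leading
    # separator (which the JVM would treat as the current directory).
    entries = [e for e in env.get("CLASSPATH", "").split(os.pathsep) if e]
    if jar_path not in entries:
        entries.append(jar_path)
    env["CLASSPATH"] = os.pathsep.join(entries)
    return env
```

The returned dict can be passed as the env argument of subprocess.Popen (or similar) when relaunching Spark. Unlike the plain shell export, the helper does not add the jar twice if it is already listed.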

Thanks,
Peter Rudenko

On 2015-05-07 19:41, Nicholas Chammas wrote:
> I can try that, but as I understand it, this is supposed to work
> out of the box (as it does with all the other Spark/Hadoop pre-built
> packages).
>
> On Thu, May 7, 2015 at 12:35 PM Peter Rudenko <petro.rudenko@gmail.com> wrote:
>
>     Try to download this jar:
>     http://search.maven.org/remotecontent?filepath=org/apache/hadoop/hadoop-aws/2.6.0/hadoop-aws-2.6.0.jar
>
>     And add:
>
>     export CLASSPATH=$CLASSPATH:hadoop-aws-2.6.0.jar
>
>     And try to relaunch.
>
>     Thanks,
>     Peter Rudenko
>
>
>     On 2015-05-07 19:30, Nicholas Chammas wrote:
>>
>>     Hmm, I just tried changing s3n to s3a:
>>
>>     py4j.protocol.Py4JJavaError: An error occurred while calling
>>     z:org.apache.spark.api.python.PythonRDD.collectAndServe.
>>     : java.lang.RuntimeException: java.lang.ClassNotFoundException:
>>     Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
>>
>>     Nick
>>
>>     On Thu, May 7, 2015 at 12:29 PM Peter Rudenko
>>     <petro.rudenko@gmail.com> wrote:
>>
>>         Hi Nick, I had the same issue.
>>         By default it should work with the s3a protocol:
>>
>>         sc.textFile('s3a://bucket/file_*').count()
>>
>>
>>         If you want to use the s3n protocol, you need to add
>>         hadoop-aws.jar to Spark's classpath. Which Hadoop vendor
>>         (Hortonworks, Cloudera, MapR) do you use?
>>
>>         Thanks,
>>         Peter Rudenko
>>
>>         On 2015-05-07 19:25, Nicholas Chammas wrote:
>>>         Details are here: https://issues.apache.org/jira/browse/SPARK-7442
>>>
>>>         It looks like something specific to building against Hadoop 2.6?
>>>
>>>         Nick
>>>
>>
>

