spark-user mailing list archives

From Daniel Mahler <dmah...@gmail.com>
Subject Spark, S3 & jets3t errors
Date Tue, 03 Sep 2013 05:09:00 GMT
I am trying to process data from S3 using Spark.
I am using Spark clusters initialized with the scripts from AmpCamp3.

The problem is that Spark does not seem to see the jets3t package,
even though it is a dependency of Hadoop, which in turn is a dependency of Spark.
There are (multiple) jets3t jars on the AmpCamp3 AMIs.
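
For context, the kind of read I am attempting looks roughly like this in
spark-shell (the bucket and path below are placeholders, not my actual data):

    // sc is the SparkContext that spark-shell provides
    // "my-bucket" and the input path are placeholders
    val lines = sc.textFile("s3n://my-bucket/some/input/path")
    lines.count()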

I have reported this problem on the AmpCamp 3 Piazza site
https://piazza.com/class/hkvq2k9dcx52yn?cid=120
and on StackOverflow
http://stackoverflow.com/questions/18543894/error-when-reading-from-s3-using-spark-hadoop.
I have not received any responses yet, but I have narrowed the problem
down to the following:

$ ssh -i ~/Dropbox/dmahler2.pem
> root@ec2-54-235-18-51.compute-1.amazonaws.com
> Last login: Sat Aug 31 02:13:16 2013 from
> 107-216-44-11.lightspeed.austtx.sbcglobal.net
>       ____              __
>      / __/__  ___ _____/ /__
>     _\ \/ _ \/ _ `/ __/  '_/
>    /___/ .__/\_,_/_/ /_/\_\   version 0.7.2
>       /_/
>        __|  __|_  )
>        _|  (     /   Amazon Linux AMI
>       ___|\___|___|
> ...
> scala> import org.jets3t
> <console>:10: error: jets3t is not a member of org
>        import org.jets3t


Explicitly adding the jets3t jar to the classpath prior to starting
spark-shell does not help.
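
By "adding the jar to the classpath" I mean something along these lines
(the jar path is only illustrative; the exact jets3t jar location on the
AMI may differ):

    # illustrative only -- point SPARK_CLASSPATH at whichever jets3t jar is on the AMI
    export SPARK_CLASSPATH=/path/to/jets3t.jar
    ./spark-shell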

thanks
Daniel
