spark-user mailing list archives

From Jacob Kim <jac...@microsoft.com>
Subject RE: Using Spark on Azure Blob Storage
Date Thu, 25 Jun 2015 22:38:01 GMT
Below is the link to a step-by-step guide on how to set up and use Spark in HDInsight.

https://azure.microsoft.com/en-us/documentation/articles/hdinsight-hadoop-spark-install/

Jacob

From: Daniel Haviv [mailto:daniel.haviv@veracity-group.com]
Sent: Thursday, June 25, 2015 3:19 PM
To: Silvio Fiorito
Cc: user@spark.apache.org
Subject: Re: Using Spark on Azure Blob Storage

Thank you guys for the helpful answers.

Daniel

On 25 June 2015, at 21:23, Silvio Fiorito <silvio.fiorito@granturing.com> wrote:
Hi Daniel,

As Peter pointed out, you need the hadoop-azure JAR as well as the Azure storage SDK for Java
(com.microsoft.azure:azure-storage). Even though the WASB driver is built for Hadoop 2.7, I was
still able to use the hadoop-azure JAR with Spark built for older Hadoop versions, back to 2.4,
I think.

Also, be sure to set your Storage Account key in your Spark Hadoop config, typically in core-site.xml:

<property>
  <name>fs.azure.account.key.{accountname}.blob.core.windows.net</name>
  <value>{storage key here}</value>
</property>
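
If you would rather set the key from code than edit core-site.xml, something like the Python
sketch below might work. It only builds the property name shown above and a wasb:// path. The
helper names are hypothetical, and the usage in the trailing comments relies on
sc._jsc.hadoopConfiguration(), a private PySpark accessor, so treat that part as an assumption
rather than a supported API.

```python
def account_key_property(account_name):
    """Return the Hadoop property name that holds the storage account key."""
    return "fs.azure.account.key.{0}.blob.core.windows.net".format(account_name)


def wasb_uri(container, account_name, path=""):
    """Build a wasb:// URI for a path inside a blob container."""
    return "wasb://{0}@{1}.blob.core.windows.net/{2}".format(
        container, account_name, path.lstrip("/")
    )


def configure_wasb(hadoop_conf, account_name, storage_key):
    """Set the account key on any object exposing set(name, value),
    such as a Hadoop Configuration."""
    hadoop_conf.set(account_key_property(account_name), storage_key)


# Hypothetical usage from a PySpark shell (assumes the hadoop-azure and
# azure-storage JARs are already on the classpath):
#   configure_wasb(sc._jsc.hadoopConfiguration(), "myaccount", "<storage key>")
#   lines = sc.textFile(wasb_uri("mycontainer", "myaccount", "logs/app.log"))
```

Setting the key this way only affects the current application, so core-site.xml is still the
better place if every job on the cluster needs access.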

As a heads-up, I have a couple of projects for Spark on Azure. One pushes data to the Power
BI service (both batch and streaming), and I'm finishing up another for using Event Hubs
as well. The Power BI library is up at http://spark-packages.org/package/granturing/spark-power-bi
and the Event Hubs library should be up soon.

Thanks,
Silvio

From: Daniel Haviv
Date: Thursday, June 25, 2015 at 1:37 PM
To: user@spark.apache.org
Subject: Using Spark on Azure Blob Storage

Hi,
I'm trying to use Spark on Azure HDInsight, but spark-shell fails on startup:
java.io.IOException: No FileSystem for scheme: wasb
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)

Is Azure Blob Storage supported?

Thanks,
Daniel