spark-user mailing list archives

From Jacob Kim
Subject RE: Using Spark on Azure Blob Storage
Date Thu, 25 Jun 2015 22:38:01 GMT
Below is the link to a step-by-step guide on how to set up and use Spark in HDInsight.


From: Daniel Haviv
Sent: Thursday, June 25, 2015 3:19 PM
To: Silvio Fiorito
Subject: Re: Using Spark on Azure Blob Storage

Thank you guys for the helpful answers.


On 25 June 2015, at 21:23, Silvio Fiorito wrote:
Hi Daniel,

As Peter pointed out, you need the hadoop-azure JAR as well as the Azure storage SDK for Java.
Even though the WASB driver is built for Hadoop 2.7, I was still able to use the hadoop-azure
JAR with Spark built for older Hadoop versions, back to 2.4 I think.

Also, be sure to set your Storage Account key in your Spark Hadoop config, typically in core-site.xml:

  <value>{storage key here}</value>
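
For context, the value above normally sits inside a full property entry. The fragment below is a minimal sketch rather than the exact text of the original message. It assumes the standard hadoop-azure account key property name, and {storage account name here} is a placeholder introduced for illustration:

```xml
<!-- Sketch only: goes inside the <configuration> element of core-site.xml. -->
<!-- {storage account name here} is an assumed placeholder, not from the original message. -->
<property>
  <name>fs.azure.account.key.{storage account name here}.blob.core.windows.net</name>
  <value>{storage key here}</value>
</property>
```

With the key in place and the hadoop-azure JAR on the classpath, blob paths would typically be addressed as wasb://{container}@{storage account name here}.blob.core.windows.net/{path}.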

As a heads up I have a couple projects for Spark on Azure. One is to push data to the Power
BI service (both batch and streaming) and I’m finishing up on another project for using
Event Hubs as well. The Power BI library is up at
the Event Hubs library should be up soon.


From: Daniel Haviv
Date: Thursday, June 25, 2015 at 1:37 PM
To: "<>"
Subject: Using Spark on Azure Blob Storage

I'm trying to use Spark on Azure's HDInsight, but the spark-shell fails when starting:

No FileSystem for scheme: wasb
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(
        at org.apache.hadoop.fs.FileSystem.createFileSystem(
        at org.apache.hadoop.fs.FileSystem.access$200(
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(

Is Azure's blob storage supported?
