spark-dev mailing list archives

From Denny Lee
Subject RE: Announcing Spark 1.1.0!
Date Fri, 12 Sep 2014 03:03:30 GMT
Yes, at least for my query scenarios, I have been able to use the Spark 1.1 package for Hadoop 2.4
against Hadoop 2.5. Note that Hadoop 2.5 is considered a relatively minor release, whereas
Hadoop 2.4 and 2.3 were considered more significant releases.

On September 11, 2014 at 19:22:05, Haopu Wang wrote:

The web page you pointed to says: “Because HDFS is not protocol-compatible across versions,
if you want to read from HDFS, you’ll need to build Spark against the specific HDFS version
in your environment.”


Did you try reading a file from Hadoop 2.5.0 HDFS using Spark 1.1 with Hadoop 2.4?




From: Denny Lee
Sent: Friday, September 12, 2014 10:00 AM
To: Patrick Wendell; Haopu Wang
Subject: RE: Announcing Spark 1.1.0!


Please correct me if I’m wrong, but I was under the impression, based on the Maven repositories,
that it was just to stay more in sync with the various versions of Hadoop. The latest
documentation calls out multiple Hadoop versions.


As for the potential differences in Spark, this is more about ensuring that the jars and
library dependencies for the correct version of Hadoop are included, so that Spark can connect
properly to Hadoop, rather than any differences in Spark itself. Another good reference
on this topic is the Hadoop versions called out on GitHub.





On September 11, 2014 at 18:39:10, Haopu Wang wrote:

Denny, thanks for the response.


I raised the question because Spark 1.0.2 had one binary package for hadoop2, but
Spark 1.1.0 has separate packages for Hadoop 2.3 and 2.4.

That implies some difference in Spark depending on the Hadoop version.


From: Denny Lee
Sent: Friday, September 12, 2014 9:35 AM
To: Haopu Wang; Patrick Wendell
Subject: RE: Announcing Spark 1.1.0!


I’m not sure I’m completely answering your question here, but I’m currently working
(on OS X) with Hadoop 2.5, and I have used the Spark 1.1 package for Hadoop 2.4 without any issues.



On September 11, 2014 at 18:11:46, Haopu Wang wrote:

I see that the binary packages cover Hadoop 1, 2.3 and 2.4.
Does Spark 1.1.0 support Hadoop 2.5.0?

-----Original Message-----
From: Patrick Wendell
Sent: Friday, September 12, 2014 8:13 AM
Subject: Announcing Spark 1.1.0!

I am happy to announce the availability of Spark 1.1.0! Spark 1.1.0 is
the second release on the API-compatible 1.X line. It is Spark's
largest release ever, with contributions from 171 developers!

This release brings operational and performance improvements in Spark
core including a new implementation of the Spark shuffle designed for
very large scale workloads. Spark 1.1 adds significant extensions to
the newest Spark modules, MLlib and Spark SQL. Spark SQL introduces a
JDBC server, byte code generation for fast expression evaluation, a
public types API, JSON support, and other features and optimizations.
MLlib introduces a new statistics library along with several new
algorithms and optimizations. Spark 1.1 also builds out Spark's Python
support and adds new components to the Spark Streaming module.

Visit the release notes [1] to read about the new features, or
download [2] the release today.



Please e-mail me directly about any typos in the release notes or name listing.

Thanks, and congratulations!
- Patrick
