www-announce mailing list archives

From Sally Khudairi ...@apache.org>
Subject The Apache Software Foundation Announces Apache™ Spark™ v1.0
Date Fri, 30 May 2014 10:00:54 GMT
>> NOTE: this announcement is also available online at http://s.apache.org/VEc

Open Source large-scale, flexible, "Hadoop Swiss Army Knife" cluster computing framework offers
enhanced data analysis and richer integration with other Apache projects 

Forest Hill, MD – 30 May 2014 – The Apache Software Foundation (ASF), the all-volunteer
developers, stewards, and incubators of more than 170 Open Source projects and initiatives,
announced today the availability of Apache Spark v1.0, the super-fast, Open Source large-scale
data processing and advanced analytics engine. 

Apache Spark has been dubbed a "Hadoop Swiss Army knife" for its remarkable speed and ease
of use, allowing developers to quickly write applications in Java, Scala, or Python, using
its built-in set of over 80 high-level operators. With Spark, programs can run up to 100x
faster than Apache Hadoop MapReduce in memory. 

"1.0 is a huge milestone for the fast-growing Spark community. Every contributor and user
who's helped bring Spark to this point should feel proud of this release," said Matei Zaharia,
Vice President of Apache Spark. 

Apache Spark is well-suited for machine learning, interactive queries, and stream processing.
It is 100% compatible with Hadoop’s Distributed File System (HDFS), HBase, and Cassandra, as
well as any Hadoop storage system, making existing data immediately usable in Spark. In addition,
Spark supports SQL queries, streaming data, and complex analytics such as machine learning
and graph algorithms out-of-the-box. 

New in v1.0, Apache Spark offers strong API stability guarantees (backward-compatibility throughout
the 1.X series), a new Spark SQL component for accessing structured data, as well as richer
integration with other Apache projects (Hadoop YARN, Hive, and Mesos). 

Patrick Wendell, software engineer at Databricks and Apache Spark 1.0 release manager, explained,
"In addition to providing long-term stability for Spark's core APIs, this release contains
several new features. Spark 1.0 adds a unified submission tool for deploying applications
on a local machine, Mesos, YARN, or a dedicated cluster. We've added a new module, Spark SQL,
to provide schema-aware data modeling and SQL language support in Spark. Spark's machine learning
library, MLlib, has been enhanced with several new algorithms. Spark’s streaming and graph
libraries have also seen major updates. Across the board, we've focused on building tools
to empower the data scientists, statisticians and engineers who must grapple with large data
sets every day." 

Spark was originally developed at UC Berkeley's AMPLab, and its ease of use has made it a go-to
solution for both small and large enterprise environments across a wide range of industries,
with users including Alibaba, ClearStory Data, Cloudera, Databricks, IBM, Intel, MapR, Ooyala,
and Yahoo, among others. Not only are organizations rapidly adopting and deploying Apache Spark,
but many contributors are also committing code to the project.

"Apache Spark is an important big data technology in delivering a high performance analytics
solution for the IT industry and satisfying the fast-growing customer demand," said Michael
Greene, Vice President and General Manager of System Technologies and Optimization at Intel.
"Intel is proud to participate in its development and we congratulate the community on this
release." 

"At NASA, we're really excited to leverage Spark and its highly interactive analytic capabilities
and the speedups offered by 1.0 along with Spark SQL are going to help out critical projects
looking at measurement of Snow in the Western US and also on projects related to Regional
Climate Modeling and in Model Evaluation for the U.S. National Climate Assessment related
Activities," said Chris Mattmann, an ASF Director, Chief Architect, Instrument and Science
Data Systems Section at NASA JPL, and Adjunct Associate Professor at the University of Southern
California. "I'm looking forward to designing Spark-related projects in my Software Architectures
and in my Search Engines courses at USC as well. The community is one of our most active at
the ASF and the interest has really peaked and these guys are doing a great job." 

"We're continuing to see very fast growth — 102 individuals have contributed patches to
this release over the past four months, which is our highest number of contributors ever,"
added Zaharia. 

Availability and Oversight
As with all Apache products, Apache Spark software is released under the Apache License v2.0,
and is overseen by a self-selected team of active contributors to the project. A Project Management
Committee (PMC) guides the Project’s day-to-day operations, including community development
and product releases. For documentation and ways to become involved with Apache Spark, visit
http://spark.apache.org/ 

About The Apache Software Foundation (ASF)
Established in 1999, the all-volunteer Foundation oversees more than one hundred and seventy
leading Open Source projects, including Apache HTTP Server, the world's most popular Web
server software. Through the ASF's meritocratic process known as "The Apache Way," more than
400 individual Members and 3,500 Committers successfully collaborate to develop freely available
enterprise-grade software, benefiting millions of users worldwide: thousands of software solutions
are distributed under the Apache License; and the community actively participates in ASF mailing
lists, mentoring initiatives, and ApacheCon, the Foundation's official user conference, trainings,
and expo. The ASF is a US 501(c)(3) charitable organization, funded by individual donations
and corporate sponsors including Budget Direct, Citrix, Cloudera, Comcast, Facebook, Google,
Hortonworks, HP, Huawei, IBM, InMotion Hosting, Matt Mullenweg, Microsoft, Pivotal, Produban,
WANdisco, and Yahoo.
For more information, visit http://www.apache.org/ or follow @TheASF on Twitter. 

"Apache", "Spark", "Apache Spark", and "ApacheCon" are trademarks of The Apache Software Foundation.
All other brands and trademarks are the property of their respective owners.

# # #

