spark-commits mailing list archives

From a...@apache.org
Subject svn commit: r1600800 [2/2] - in /spark: ./ site/ site/news/ site/releases/
Date Fri, 06 Jun 2014 00:55:59 GMT
Modified: spark/site/releases/spark-release-0-8-1.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-0-8-1.html?rev=1600800&r1=1600799&r2=1600800&view=diff
==============================================================================
--- spark/site/releases/spark-release-0-8-1.html (original)
+++ spark/site/releases/spark-release-0-8-1.html Fri Jun  6 00:55:58 2014
@@ -163,7 +163,7 @@
 <p>Apache Spark 0.8.1 is a maintenance and performance release for the Scala 2.9 version
of Spark. It also adds several new features, such as standalone mode high availability, that
will appear in Spark 0.9 but developers wanted to have in Scala 2.9. Contributions to 0.8.1
came from 41 developers.</p>
 
 <h3 id="yarn-22-support">YARN 2.2 Support</h3>
-<p>Support has been added for running Spark on YARN 2.2 and newer. Due to a change
in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0.
See the <a href="/docs/0.8.1/running-on-yarn.html">YARN documentation</a> for
specific instructions on how to build Spark for YARN 2.2+. We’ve also included a pre-compiled
binary for YARN 2.2.</p>
+<p>Support has been added for running Spark on YARN 2.2 and newer. Due to a change
in the YARN API between previous versions and 2.2+, this was not supported in Spark 0.8.0.
See the <a href="/docs/0.8.1/running-on-yarn.html">YARN documentation</a> for
specific instructions on how to build Spark for YARN 2.2+. We&#8217;ve also included a
pre-compiled binary for YARN 2.2.</p>
 
 <h3 id="high-availability-mode-for-standalone-cluster-manager">High Availability Mode
for Standalone Cluster Manager</h3>
 <p>The standalone cluster manager now has a high availability (H/A) mode which can
tolerate master failures. This is particularly useful for long-running applications such as
streaming jobs and the shark server, where the scheduler master previously represented a single
point of failure. Instructions for deploying H/A mode are included <a href="/docs/0.8.1/spark-standalone.html#high-availability">in
the documentation</a>. The current implementation uses Zookeeper for coordination.</p>
@@ -174,7 +174,7 @@
 <ul>
   <li>Optimized hashtables for shuffle data - reduces memory and CPU consumption</li>
   <li>Efficient encoding for JobConfs - improves latency for stages reading large numbers
of blocks from HDFS, S3, and HBase</li>
-  <li>Shuffle file consolidation (off by default) - reduces the number of files created
in large shuffles for better filesystem performance. This change works best on filesystems
newer than ext3 (we recommend ext4 or XFS), and it will be the default in Spark 0.9, but we’ve
left it off by default for compatibility. We recommend users turn this on unless they are
using ext3 by setting <code>spark.shuffle.consolidateFiles</code> to “true”.</li>
+  <li>Shuffle file consolidation (off by default) - reduces the number of files created
in large shuffles for better filesystem performance. This change works best on filesystems
newer than ext3 (we recommend ext4 or XFS), and it will be the default in Spark 0.9, but we’ve
left it off by default for compatibility. We recommend users turn this on unless they are
using ext3 by setting <code>spark.shuffle.consolidateFiles</code> to &#8220;true&#8221;.</li>
   <li>Torrent broadcast (off by default) - a faster broadcast implementation for large
objects.</li>
   <li>Support for fetching large result sets - allows tasks to return large results
without tuning Akka buffer sizes.</li>
 </ul>
@@ -211,47 +211,47 @@
 <h3 id="credits">Credits</h3>
 
 <ul>
-  <li>Michael Armbrust – build fix</li>
-  <li>Pierre Borckmans – typo fix in documentation</li>
-  <li>Evan Chan – <code>local://</code> scheme for dependency jars</li>
-  <li>Ewen Cheslack-Postava – <code>add</code> method for python accumulators,
support for setting config properties in python</li>
-  <li>Mosharaf Chowdhury – optimized broadcast implementation</li>
-  <li>Frank Dai – documentation fix</li>
-  <li>Aaron Davidson – shuffle file consolidation, H/A mode for standalone scheduler,
cleaned up representation of block IDs, several improvements and bug fixes</li>
-  <li>Tathagata Das – new streaming operators, fix for kafka concurrency bug</li>
-  <li>Ankur Dave – support for pausing spot clusters on EC2</li>
-  <li>Harvey Feng – optimization to JobConf broadcasts, bug fixes, YARN 2.2 build</li>
-  <li>Ali Ghodsi – YARN 2.2 build</li>
-  <li>Thomas Graves – Spark YARN integration including secure HDFS access over
YARN</li>
-  <li>Li Guoqiang – fix for Maven build</li>
-  <li>Stephen Haberman – bug fix</li>
-  <li>Haidar Hadi – documentation fix</li>
-  <li>Nathan Howell – bug fix relating to YARN</li>
-  <li>Holden Karau – Java version of <code>mapPartitionsWithIndex</code></li>
-  <li>Du Li – bug fix in make-distrubion.sh</li>
-  <li>Raymond Liu – work on YARN 2.2 build</li>
-  <li>Xi Liu – bug fix and code clean-up</li>
-  <li>David McCauley – bug fix in standalone mode JSON output</li>
-  <li>Michael (wannabeast) – bug fix in memory store</li>
-  <li>Fabrizio Milo – typos in documentation, clean-up in DAGScheduler, typo in
scaladoc</li>
-  <li>Mridul Muralidharan – fixes to metadata cleaner and speculative execution</li>
-  <li>Sundeep Narravula – build fix, bug fixes in scheduler and tests, code clean-up</li>
-  <li>Kay Ousterhout – optimized result fetching, new information in UI, scheduler
clean-up and bug fixes</li>
-  <li>Nick Pentreath – implicit feedback variant of ALS algorithm</li>
-  <li>Imran Rashid – improvement to executor launch</li>
-  <li>Ahir Reddy – spark support for SIMR</li>
-  <li>Josh Rosen – memory use optimization, clean up of BlockManager code, Java
and Python clean-up/fixes</li>
-  <li>Henry Saputra – build fix</li>
-  <li>Jerry Shao – refactoring of fair scheduler, support for running Spark as
a specific user, bug fix</li>
-  <li>Mingfei Shi – documentation for JobLogger</li>
-  <li>Andre Schumacher – sortByKey in PySpark and associated changes</li>
-  <li>Karthik Tunga – bug fix in launch script</li>
-  <li>Patrick Wendell – <code>repartition</code> operator, shuffle
write metrics, various fixes and release management</li>
-  <li>Neal Wiggins – import clean-up, documentation fixes</li>
-  <li>Andrew Xia – bug fix in UI</li>
-  <li>Reynold Xin – task killing, support for setting job properties in Spark
shell, logging improvements, Kryo improvements, several bug fixes</li>
-  <li>Matei Zaharia – optimized hashmap for shuffle data, PySpark documentation,
optimizations to Kryo serializer</li>
-  <li>Wu Zeming – bug fix in executors UI</li>
+  <li>Michael Armbrust &#8211; build fix</li>
+  <li>Pierre Borckmans &#8211; typo fix in documentation</li>
+  <li>Evan Chan &#8211; <code>local://</code> scheme for dependency
jars</li>
+  <li>Ewen Cheslack-Postava &#8211; <code>add</code> method for python
accumulators, support for setting config properties in python</li>
+  <li>Mosharaf Chowdhury &#8211; optimized broadcast implementation</li>
+  <li>Frank Dai &#8211; documentation fix</li>
+  <li>Aaron Davidson &#8211; shuffle file consolidation, H/A mode for standalone
scheduler, cleaned up representation of block IDs, several improvements and bug fixes</li>
+  <li>Tathagata Das &#8211; new streaming operators, fix for kafka concurrency
bug</li>
+  <li>Ankur Dave &#8211; support for pausing spot clusters on EC2</li>
+  <li>Harvey Feng &#8211; optimization to JobConf broadcasts, bug fixes, YARN 2.2
build</li>
+  <li>Ali Ghodsi &#8211; YARN 2.2 build</li>
+  <li>Thomas Graves &#8211; Spark YARN integration including secure HDFS access
over YARN</li>
+  <li>Li Guoqiang &#8211; fix for Maven build</li>
+  <li>Stephen Haberman &#8211; bug fix</li>
+  <li>Haidar Hadi &#8211; documentation fix</li>
+  <li>Nathan Howell &#8211; bug fix relating to YARN</li>
+  <li>Holden Karau &#8211; Java version of <code>mapPartitionsWithIndex</code></li>
+  <li>Du Li &#8211; bug fix in make-distrubion.sh</li>
+  <li>Raymond Liu &#8211; work on YARN 2.2 build</li>
+  <li>Xi Liu &#8211; bug fix and code clean-up</li>
+  <li>David McCauley &#8211; bug fix in standalone mode JSON output</li>
+  <li>Michael (wannabeast) &#8211; bug fix in memory store</li>
+  <li>Fabrizio Milo &#8211; typos in documentation, clean-up in DAGScheduler, typo
in scaladoc</li>
+  <li>Mridul Muralidharan &#8211; fixes to metadata cleaner and speculative execution</li>
+  <li>Sundeep Narravula &#8211; build fix, bug fixes in scheduler and tests, code
clean-up</li>
+  <li>Kay Ousterhout &#8211; optimized result fetching, new information in UI,
scheduler clean-up and bug fixes</li>
+  <li>Nick Pentreath &#8211; implicit feedback variant of ALS algorithm</li>
+  <li>Imran Rashid &#8211; improvement to executor launch</li>
+  <li>Ahir Reddy &#8211; spark support for SIMR</li>
+  <li>Josh Rosen &#8211; memory use optimization, clean up of BlockManager code,
Java and Python clean-up/fixes</li>
+  <li>Henry Saputra &#8211; build fix</li>
+  <li>Jerry Shao &#8211; refactoring of fair scheduler, support for running Spark
as a specific user, bug fix</li>
+  <li>Mingfei Shi &#8211; documentation for JobLogger</li>
+  <li>Andre Schumacher &#8211; sortByKey in PySpark and associated changes</li>
+  <li>Karthik Tunga &#8211; bug fix in launch script</li>
+  <li>Patrick Wendell &#8211; <code>repartition</code> operator, shuffle
write metrics, various fixes and release management</li>
+  <li>Neal Wiggins &#8211; import clean-up, documentation fixes</li>
+  <li>Andrew Xia &#8211; bug fix in UI</li>
+  <li>Reynold Xin &#8211; task killing, support for setting job properties in Spark
shell, logging improvements, Kryo improvements, several bug fixes</li>
+  <li>Matei Zaharia &#8211; optimized hashmap for shuffle data, PySpark documentation,
optimizations to Kryo serializer</li>
+  <li>Wu Zeming &#8211; bug fix in executors UI</li>
 </ul>
 
 <p>Thanks to everyone who contributed!</p>

Modified: spark/site/releases/spark-release-0-9-0.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-0-9-0.html?rev=1600800&r1=1600799&r2=1600800&view=diff
==============================================================================
--- spark/site/releases/spark-release-0-9-0.html (original)
+++ spark/site/releases/spark-release-0-9-0.html Fri Jun  6 00:55:58 2014
@@ -265,87 +265,87 @@
 <p>The following developers contributed to this release:</p>
 
 <ul>
-  <li>Andrew Ash – documentation improvements</li>
-  <li>Pierre Borckmans – documentation fix</li>
-  <li>Russell Cardullo – graphite sink for metrics</li>
-  <li>Evan Chan – local:// URI feature</li>
-  <li>Vadim Chekan – bug fix</li>
-  <li>Lian Cheng – refactoring and code clean-up in several locations, bug fixes</li>
-  <li>Ewen Cheslack-Postava – Spark EC2 and PySpark improvements</li>
-  <li>Mosharaf Chowdhury – optimized broadcast</li>
-  <li>Dan Crankshaw – GraphX contributions</li>
-  <li>Haider Haidi – documentation fix</li>
-  <li>Frank Dai – Naive Bayes classifier in MLlib, documentation improvements</li>
-  <li>Tathagata Das – new operators, fixes, and improvements to Spark Streaming
(lead)</li>
-  <li>Ankur Dave – GraphX contributions</li>
-  <li>Henry Davidge – warning for large tasks</li>
-  <li>Aaron Davidson – shuffle file consolidation, H/A mode for standalone scheduler,
various improvements and fixes</li>
-  <li>Kyle Ellrott – GraphX contributions</li>
-  <li>Hossein Falaki – new statistical operators, Scala and Python examples in
MLlib</li>
-  <li>Harvey Feng – hadoop file optimizations and YARN integration</li>
-  <li>Ali Ghodsi – support for SIMR</li>
-  <li>Joseph E. Gonzalez – GraphX contributions</li>
-  <li>Thomas Graves – fixes and improvements for YARN support (lead)</li>
-  <li>Rong Gu – documentation fix</li>
-  <li>Stephen Haberman – bug fixes</li>
-  <li>Walker Hamilton – bug fix</li>
-  <li>Mark Hamstra – scheduler improvements and fixes, build fixes</li>
-  <li>Damien Hardy – Debian build fix</li>
-  <li>Nathan Howell – sbt upgrade</li>
-  <li>Grace Huang – improvements to metrics code</li>
-  <li>Shane Huang – separation of admin and user scripts:</li>
-  <li>Prabeesh K – MQTT integration for Spark Streaming and code fix</li>
-  <li>Holden Karau – sbt build improvements and Java API extensions</li>
-  <li>KarthikTunga – bug fix</li>
-  <li>Grega Kespret – bug fix</li>
-  <li>Marek Kolodziej – optimized random number generator</li>
-  <li>Jey Kottalam – EC2 script improvements</li>
-  <li>Du Li – bug fixes</li>
-  <li>Haoyuan Li – tachyon support in EC2</li>
-  <li>LiGuoqiang – fixes to build and YARN integration</li>
-  <li>Raymond Liu – build improvement and various fixes for YARN support</li>
-  <li>George Loentiev – Maven build fixes</li>
-  <li>Akihiro Matsukawa – GraphX contributions</li>
-  <li>David McCauley – improvements to json endpoint</li>
-  <li>Mike – bug fixes</li>
-  <li>Fabrizio (Misto) Milo – bug fix</li>
-  <li>Mridul Muralidharan – speculation improvements, several bug fixes</li>
-  <li>Tor Myklebust – Python mllib bindings, instrumentation for task serailization</li>
-  <li>Sundeep Narravula – bug fix</li>
-  <li>Binh Nguyen – Java API improvements and version upgrades</li>
-  <li>Adam Novak – bug fix</li>
-  <li>Andrew Or – external sorting</li>
-  <li>Kay Ousterhout – several bug fixes and improvements to Spark scheduler</li>
-  <li>Sean Owen – style fixes</li>
-  <li>Nick Pentreath – ALS implicit feedback algorithm</li>
-  <li>Pillis – <code>Vector.random()</code> method</li>
-  <li>Imran Rashid – bug fix</li>
-  <li>Ahir Reddy – support for SIMR</li>
-  <li>Luca Rosellini – script loading for Scala shell</li>
-  <li>Josh Rosen – fixes, clean-up, and extensions to scala and Java API’s</li>
-  <li>Henry Saputra – style improvements and clean-up</li>
-  <li>Andre Schumacher – Python improvements and bug fixes</li>
-  <li>Jerry Shao – multi-user support, various fixes and improvements</li>
-  <li>Prashant Sharma – Scala 2.10 support, configuration system, several smaller
fixes</li>
-  <li>Shiyun – style fix</li>
-  <li>Wangda Tan – UI improvement and bug fixes</li>
-  <li>Matthew Taylor – bug fix</li>
-  <li>Jyun-Fan Tsai – documentation fix</li>
-  <li>Takuya Ueshin – bug fix</li>
-  <li>Shivaram Venkataraman – sbt build optimization, EC2 improvements, Java and
Python API</li>
-  <li>Jianping J Wang – GraphX contributions</li>
-  <li>Martin Weindel – build fix</li>
-  <li>Patrick Wendell – standalone driver submission, various fixes, release manager</li>
-  <li>Neal Wiggins – bug fix</li>
-  <li>Andrew Xia – bug fixes and code cleanup</li>
-  <li>Reynold Xin – GraphX contributions, task killing, various fixes, improvements
and optimizations</li>
-  <li>Dong Yan – bug fix</li>
-  <li>Haitao Yao – bug fix</li>
-  <li>Xusen Yin – bug fix</li>
-  <li>Fengdong Yu – documentation fixes</li>
-  <li>Matei Zaharia – new configuration system, Python MLlib bindings, scheduler
improvements, various fixes and optimizations</li>
-  <li>Wu Zeming – bug fix</li>
-  <li>Nan Zhu – documentation improvements</li>
+  <li>Andrew Ash &#8211; documentation improvements</li>
+  <li>Pierre Borckmans &#8211; documentation fix</li>
+  <li>Russell Cardullo &#8211; graphite sink for metrics</li>
+  <li>Evan Chan &#8211; local:// URI feature</li>
+  <li>Vadim Chekan &#8211; bug fix</li>
+  <li>Lian Cheng &#8211; refactoring and code clean-up in several locations, bug
fixes</li>
+  <li>Ewen Cheslack-Postava &#8211; Spark EC2 and PySpark improvements</li>
+  <li>Mosharaf Chowdhury &#8211; optimized broadcast</li>
+  <li>Dan Crankshaw &#8211; GraphX contributions</li>
+  <li>Haider Haidi &#8211; documentation fix</li>
+  <li>Frank Dai &#8211; Naive Bayes classifier in MLlib, documentation improvements</li>
+  <li>Tathagata Das &#8211; new operators, fixes, and improvements to Spark Streaming
(lead)</li>
+  <li>Ankur Dave &#8211; GraphX contributions</li>
+  <li>Henry Davidge &#8211; warning for large tasks</li>
+  <li>Aaron Davidson &#8211; shuffle file consolidation, H/A mode for standalone
scheduler, various improvements and fixes</li>
+  <li>Kyle Ellrott &#8211; GraphX contributions</li>
+  <li>Hossein Falaki &#8211; new statistical operators, Scala and Python examples
in MLlib</li>
+  <li>Harvey Feng &#8211; hadoop file optimizations and YARN integration</li>
+  <li>Ali Ghodsi &#8211; support for SIMR</li>
+  <li>Joseph E. Gonzalez &#8211; GraphX contributions</li>
+  <li>Thomas Graves &#8211; fixes and improvements for YARN support (lead)</li>
+  <li>Rong Gu &#8211; documentation fix</li>
+  <li>Stephen Haberman &#8211; bug fixes</li>
+  <li>Walker Hamilton &#8211; bug fix</li>
+  <li>Mark Hamstra &#8211; scheduler improvements and fixes, build fixes</li>
+  <li>Damien Hardy &#8211; Debian build fix</li>
+  <li>Nathan Howell &#8211; sbt upgrade</li>
+  <li>Grace Huang &#8211; improvements to metrics code</li>
+  <li>Shane Huang &#8211; separation of admin and user scripts:</li>
+  <li>Prabeesh K &#8211; MQTT integration for Spark Streaming and code fix</li>
+  <li>Holden Karau &#8211; sbt build improvements and Java API extensions</li>
+  <li>KarthikTunga &#8211; bug fix</li>
+  <li>Grega Kespret &#8211; bug fix</li>
+  <li>Marek Kolodziej &#8211; optimized random number generator</li>
+  <li>Jey Kottalam &#8211; EC2 script improvements</li>
+  <li>Du Li &#8211; bug fixes</li>
+  <li>Haoyuan Li &#8211; tachyon support in EC2</li>
+  <li>LiGuoqiang &#8211; fixes to build and YARN integration</li>
+  <li>Raymond Liu &#8211; build improvement and various fixes for YARN support</li>
+  <li>George Loentiev &#8211; Maven build fixes</li>
+  <li>Akihiro Matsukawa &#8211; GraphX contributions</li>
+  <li>David McCauley &#8211; improvements to json endpoint</li>
+  <li>Mike &#8211; bug fixes</li>
+  <li>Fabrizio (Misto) Milo &#8211; bug fix</li>
+  <li>Mridul Muralidharan &#8211; speculation improvements, several bug fixes</li>
+  <li>Tor Myklebust &#8211; Python mllib bindings, instrumentation for task serailization</li>
+  <li>Sundeep Narravula &#8211; bug fix</li>
+  <li>Binh Nguyen &#8211; Java API improvements and version upgrades</li>
+  <li>Adam Novak &#8211; bug fix</li>
+  <li>Andrew Or &#8211; external sorting</li>
+  <li>Kay Ousterhout &#8211; several bug fixes and improvements to Spark scheduler</li>
+  <li>Sean Owen &#8211; style fixes</li>
+  <li>Nick Pentreath &#8211; ALS implicit feedback algorithm</li>
+  <li>Pillis &#8211; <code>Vector.random()</code> method</li>
+  <li>Imran Rashid &#8211; bug fix</li>
+  <li>Ahir Reddy &#8211; support for SIMR</li>
+  <li>Luca Rosellini &#8211; script loading for Scala shell</li>
+  <li>Josh Rosen &#8211; fixes, clean-up, and extensions to scala and Java API’s</li>
+  <li>Henry Saputra &#8211; style improvements and clean-up</li>
+  <li>Andre Schumacher &#8211; Python improvements and bug fixes</li>
+  <li>Jerry Shao &#8211; multi-user support, various fixes and improvements</li>
+  <li>Prashant Sharma &#8211; Scala 2.10 support, configuration system, several
smaller fixes</li>
+  <li>Shiyun &#8211; style fix</li>
+  <li>Wangda Tan &#8211; UI improvement and bug fixes</li>
+  <li>Matthew Taylor &#8211; bug fix</li>
+  <li>Jyun-Fan Tsai &#8211; documentation fix</li>
+  <li>Takuya Ueshin &#8211; bug fix</li>
+  <li>Shivaram Venkataraman &#8211; sbt build optimization, EC2 improvements, Java
and Python API</li>
+  <li>Jianping J Wang &#8211; GraphX contributions</li>
+  <li>Martin Weindel &#8211; build fix</li>
+  <li>Patrick Wendell &#8211; standalone driver submission, various fixes, release
manager</li>
+  <li>Neal Wiggins &#8211; bug fix</li>
+  <li>Andrew Xia &#8211; bug fixes and code cleanup</li>
+  <li>Reynold Xin &#8211; GraphX contributions, task killing, various fixes, improvements
and optimizations</li>
+  <li>Dong Yan &#8211; bug fix</li>
+  <li>Haitao Yao &#8211; bug fix</li>
+  <li>Xusen Yin &#8211; bug fix</li>
+  <li>Fengdong Yu &#8211; documentation fixes</li>
+  <li>Matei Zaharia &#8211; new configuration system, Python MLlib bindings, scheduler
improvements, various fixes and optimizations</li>
+  <li>Wu Zeming &#8211; bug fix</li>
+  <li>Nan Zhu &#8211; documentation improvements</li>
 </ul>
 
 <p><em>Thanks to everyone who contributed!</em></p>

Modified: spark/site/releases/spark-release-0-9-1.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-0-9-1.html?rev=1600800&r1=1600799&r2=1600800&view=diff
==============================================================================
--- spark/site/releases/spark-release-0-9-1.html (original)
+++ spark/site/releases/spark-release-0-9-1.html Fri Jun  6 00:55:58 2014
@@ -235,7 +235,7 @@
   <li>Andrew Tulloch - Minor updates to MLLib</li>
   <li>Bijay Bisht - Fix for hadoop-client for Hadoop &lt; 1.0.1 and for bug in
Spark on Mesos + CDH4.5.0</li>
   <li>Bouke van der Bijl - Bug fix in Python depickling</li>
-  <li>Bryn Keller  - Support for HBase’s TableOutputFormat</li>
+  <li>Bryn Keller  - Support for HBase&#8217;s TableOutputFormat</li>
   <li>Chen Chao - Bug fix in spark-shell script, and improvements to streaming programming
guide</li>
   <li>Christian Lundgren - Support for C3 EC2 instance type</li>
   <li>Diana Carroll - Improvements to PySpark programming guide</li>
@@ -245,7 +245,7 @@
   <li>jianghan - Bug fixes in Java examples</li>
   <li>Josh Rosen - Bug fix in PySpark string serialization and exception handling</li>
   <li>Jyotiska NK  - Improvements to PySpark doc and examples</li>
-  <li>Kay Ousterhout - Multiple bug fixes in scheduler’s handling of task failures</li>
+  <li>Kay Ousterhout - Multiple bug fixes in scheduler&#8217;s handling of task
failures</li>
   <li>Kousuke Saruta - Use of https to access github</li>
   <li>Mark Grover  - Bug fix in distribution tar.gz</li>
   <li>Matei Zaharia - Bug fixes in handling of task failures due to NPE,  and cleaning
up of scheduler data structures </li>
@@ -258,10 +258,10 @@
   <li>Raymond Liu - Changed working directory in ZookeeperPersistenceEngine</li>
   <li>Reynold Xin  - Improvements to docs and test infrastructure</li>
   <li>Sandy Ryza - Multiple important Yarn bug fixes and improvements</li>
-  <li>Sean Owen - Bug fixes and improvements for MLLib’s ALS</li>
+  <li>Sean Owen - Bug fixes and improvements for MLLib&#8217;s ALS</li>
   <li>Shixiong Zhu - Fixed thread-unsafe use of SimpleDateFormat</li>
   <li>shiyun.wxm - UI bug fix</li>
-  <li>Stevo Slavić - Bug fix in window’s run-example script</li>
+  <li>Stevo Slavić - Bug fix in window&#8217;s run-example script</li>
   <li>Tathagata Das - Improvements to streaming docs</li>
   <li>Tom Graves - Bug fixes in YARN deployment modes</li>
   <li>Xiangrui Meng - Improvements to ALS and GLM, and MLLib programming guide</li>

Modified: spark/site/releases/spark-release-1-0-0.html
URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-0-0.html?rev=1600800&r1=1600799&r2=1600800&view=diff
==============================================================================
--- spark/site/releases/spark-release-1-0-0.html (original)
+++ spark/site/releases/spark-release-1-0-0.html Fri Jun  6 00:55:58 2014
@@ -192,11 +192,11 @@
 <p>Spark 1.0 adds support for Java 8 <a href="http://docs.oracle.com/javase/tutorial/java/javaOO/lambdaexpressions.html">new
lambda syntax</a> in its Java bindings. Java 8 supports a concise syntax for writing
anonymous functions, similar to the closure syntax in Scala and Python. This change requires
small changes for users of the current Java API, which are noted in the documentation. Spark’s
Python API has been extended to support several new functions. We’ve also included several
stability improvements in the Python API, particularly for large datasets. PySpark now supports
running on YARN as well.</p>
 
 <h3 id="documentation">Documentation</h3>
-<p>Spark’s <a href="/docs/latest/programming-guide.html">programming guide</a>
has been significantly expanded to centrally cover all supported languages and discuss more
operators and aspects of the development life cycle. The <a href="/docs/latest/mllib-guide.html">MLlib
guide</a> has also been expanded with significantly more detail and examples for each
algorithm, while documents on configuration, YARN and Mesos have also been revamped.</p>
+<p>Spark&#8217;s <a href="/docs/latest/programming-guide.html">programming
guide</a> has been significantly expanded to centrally cover all supported languages
and discuss more operators and aspects of the development life cycle. The <a href="/docs/latest/mllib-guide.html">MLlib
guide</a> has also been expanded with significantly more detail and examples for each
algorithm, while documents on configuration, YARN and Mesos have also been revamped.</p>
 
 <h3 id="smaller-changes">Smaller Changes</h3>
 <ul>
-  <li>PySpark now works with more Python versions than before – Python 2.6+ instead
of 2.7+, and NumPy 1.4+ instead of 1.7+.</li>
+  <li>PySpark now works with more Python versions than before &#8211; Python 2.6+
instead of 2.7+, and NumPy 1.4+ instead of 1.7+.</li>
   <li>Spark has upgraded to Avro 1.7.6, adding support for Avro specific types.</li>
   <li>Internal instrumentation has been added to allow applications to monitor and
instrument Spark jobs.</li>
   <li>Support for off-heap storage in Tachyon has been added via a special build target.</li>
@@ -213,123 +213,123 @@
 <p>The following developers contributed to this release:</p>
 
 <ul>
-  <li>Aaron Davidson – packaging and deployment improvements, several bug fixes,
local[*] mode</li>
-  <li>Aaron Kimball – documentation improvements</li>
-  <li>Abhishek Kumar – Python configuration fixes</li>
-  <li>Ahir Reddy – PySpark build, fixes, and cancellation support</li>
-  <li>Allan Douglas R. de Oliveira – Improvements to spark-ec2 scripts</li>
-  <li>Andre Schumacher – Parquet support and optimizations</li>
-  <li>Andrew Ash – Mesos documentation and other doc improvements, bug fixes</li>
-  <li>Andrew Or – history server (lead), garbage collection (lead), spark-submit,
PySpark and YARN improvements</li>
-  <li>Andrew Tulloch – MLlib contributions and code clean-up</li>
-  <li>Andy Konwinski – documentation fix</li>
-  <li>Anita Tailor – Cassandra example</li>
-  <li>Ankur Dave – GraphX (lead) optimizations, documentation, and usability</li>
-  <li>Archer Shao – bug fixes</li>
-  <li>Arun Ramakrishnan – improved random sampling</li>
-  <li>Baishuo – test improvements</li>
-  <li>Bernardo Gomez Palacio – spark-shell improvements and Mesos updates</li>
-  <li>Bharath Bhushan – bug fix</li>
-  <li>Bijay Bisht – bug fixes</li>
-  <li>Binh Nguyen – dependency fix</li>
-  <li>Bouke van der Bijl – fixes for PySpark on Mesos and other Mesos fixes</li>
-  <li>Bryn Keller – improvement to HBase support and unit tests</li>
-  <li>Chen Chao – documentation, bug fix, and code clean-up</li>
-  <li>Cheng Hao – performance and feature improvements in Spark SQL</li>
-  <li>Cheng Lian – column storage and other improvements in Spark SQL</li>
-  <li>Christian Lundgren – improvement to spark-ec2 scripts</li>
-  <li>DB Tsai – L-BGFS optimizer in MLlib, MLlib documentation and fixes</li>
-  <li>Dan McClary – Improvement to stats counter</li>
-  <li>Daniel Darabos – GraphX performance improvement</li>
-  <li>Davis Shepherd – bug fix</li>
-  <li>Diana Carroll – documentation and bug fix</li>
-  <li>Egor Pakhomov – local iterator for RDD’s</li>
-  <li>Emtiaz Ahmed – bug fix</li>
-  <li>Erik Selin – bug fix</li>
-  <li>Ethan Jewett – documentation improvement</li>
-  <li>Evan Chan – automatic clean-up of application data</li>
-  <li>Evan Sparks – MLlib optimizations and doc improvement</li>
-  <li>Frank Dai – code clean-up in MLlib</li>
-  <li>Guoquiang Li – build improvements and several bug fixes</li>
-  <li>Ghidireac – bug fix</li>
-  <li>Haoyuan Li – Tachyon storage level for RDD’s</li>
-  <li>Harvey Feng – spark-ec2 update</li>
-  <li>Henry Saputra – code clean-up</li>
-  <li>Henry Cook – Spark SQL improvements</li>
-  <li>Holden Karau – cross validation in MLlib, Python and core engine improvements</li>
-  <li>Ivan Wick – Mesos bug fix</li>
-  <li>Jey Kottalam – sbt build improvement</li>
-  <li>Jerry Shao – Spark metrics and Spark SQL improvements</li>
-  <li>Jiacheng Guo – bug fix</li>
-  <li>Jianghan – bug fix</li>
-  <li>Jianping J Wang – JBLAS support in MLlib</li>
-  <li>Joseph E. Gonzalez – GraphX improvements, fixes, and documentation</li>
-  <li>Josh Rosen – PySpark improvements and bug fixes</li>
-  <li>Jyotiska NK – documentation, test improvements, and bug fix</li>
-  <li>Kan Zhang – bug fixes in Spark core, SQL, and PySpark</li>
-  <li>Kay Ousterhout – bug fixes and code refactoring in scheduler</li>
-  <li>Kelvin Chu – automatic clean-up of application data</li>
-  <li>Kevin Mader – example fix</li>
-  <li>Koert Kuipers – code visibility fix</li>
-  <li>Kousuke Saruta – documentation and build fixes</li>
-  <li>Kyle Ellrott – improved memory usage for DISK_ONLY persistence</li>
-  <li>Larva Boy – approximate counts in Spark SQL</li>
-  <li>Madhu Siddalingaiah – ec2 fixes</li>
-  <li>Manish Amde – decision trees in MLlib</li>
-  <li>Marcelo Vanzin – improvements and fixes to YARN support, dependency clean-up</li>
-  <li>Mark Grover – build fixes</li>
-  <li>Mark Hamstra – build and dependency improvements, scheduler bug fixes</li>
-  <li>Margin Jaggi – MLlib documentation improvements</li>
-  <li>Matei Zaharia – Python versions of several MLlib algorithms, spark-submit
improvements, bug fixes, and documentation improvements</li>
-  <li>Michael Armbrust – Spark SQL (lead), including schema support for RDD’s,
catalyst optimizer, and Hive support</li>
-  <li>Mridul Muralidharan – code visibility changes and bug fixes</li>
-  <li>Nan Zhu – bug and stability fixes, code clean-up, documentation, and new
features</li>
-  <li>Neville Li – bug fix</li>
-  <li>Nick Lanham – Tachyon bundling in distribution script</li>
-  <li>Nirmal Reddy – code clean-up</li>
-  <li>OuYang Jin – local mode and json improvements</li>
-  <li>Patrick Wendell – release manager, build improvements, bug fixes, and code
clean-up</li>
-  <li>Petko Nikolov – new utility functions</li>
-  <li>Prabeesh K – typo fix</li>
-  <li>Prabin Banka – new PySpark API’s</li>
-  <li>Prashant Sharma – PySpark improvements, Java 8 lambda support, and build
improvements</li>
-  <li>Punya Biswal – Java API improvements</li>
-  <li>Qiuzhuang Lian – bug fixes</li>
-  <li>Rahul Singhal – build improvements, bug fixes</li>
-  <li>Raymond Liu – YARN build fixes and UI improvements</li>
-  <li>Reynold Xin – bug fixes, internal changes, Spark SQL improvements, build
fixes, and style improvements</li>
-  <li>Reza Zadeh – SVD implementation in MLlib and other MLlib contributions</li>
-  <li>Roman Pastukhov – clean-up of broadcast files</li>
-  <li>Rong Gu – Tachyon storage level for RDD’s</li>
-  <li>Sandeep Sing – several bug fixes, MLLib improvements and fixes to Spark
examples</li>
-  <li>Sandy Ryza – spark-submit script and several YARN improvements</li>
-  <li>Saurabh Rawat  – Java API improvements</li>
-  <li>Sean Owen – several build improvements, code clean-up, and MLlib fixes</li>
-  <li>Semih Salihoglu – GraphX improvements</li>
-  <li>Shaocun Tian – bug fix in MLlib</li>
-  <li>Shivaram Venkataraman – bug fixes</li>
-  <li>Shixiong Zhu – code style and correctness fixes</li>
-  <li>Shiyun Wxm – typo fix</li>
-  <li>Stevo Slavic – bug fix</li>
-  <li>Sumedh Mungee – documentation fix</li>
-  <li>Sundeep Narravula – “cancel” button in Spark UI</li>
-  <li>Takayu Ueshin – bug fixes and improvements to Spark SQL</li>
-  <li>Tathagata Das – web UI and other improvements to Spark Streaming (lead),
bug fixes, state clean-up, and release manager</li>
-  <li>Timothy Chen – Spark SQL improvements</li>
-  <li>Ted Malaska – improved Flume support</li>
-  <li>Tom Graves – Hadoop security integration (lead) and YARN support</li>
-  <li>Tianshuo Deng – Bug fix</li>
-  <li>Tor Myklebust – improvements to ALS</li>
-  <li>Wangfei – Spark SQL docs</li>
-  <li>Wang Tao – code clean-up</li>
-  <li>William Bendon – JSON support changes and bug fixes</li>
-  <li>Xiangrui Meng – several improvements to MLlib (lead)</li>
-  <li>Xuan Nguyen – build fix</li>
-  <li>Xusen Yin – MLlib contributions and bug fix</li>
-  <li>Ye Xianjin – test fixes</li>
-  <li>Yinan Li – addFile improvement</li>
-  <li>Yin Hua – Spark SQL improvements</li>
-  <li>Zheng Peng – bug fixes</li>
+  <li>Aaron Davidson &#8211; packaging and deployment improvements, several bug
fixes, local[*] mode</li>
+  <li>Aaron Kimball &#8211; documentation improvements</li>
+  <li>Abhishek Kumar &#8211; Python configuration fixes</li>
+  <li>Ahir Reddy &#8211; PySpark build, fixes, and cancellation support</li>
+  <li>Allan Douglas R. de Oliveira &#8211; Improvements to spark-ec2 scripts</li>
+  <li>Andre Schumacher &#8211; Parquet support and optimizations</li>
+  <li>Andrew Ash &#8211; Mesos documentation and other doc improvements, bug fixes</li>
+  <li>Andrew Or &#8211; history server (lead), garbage collection (lead), spark-submit,
PySpark and YARN improvements</li>
+  <li>Andrew Tulloch &#8211; MLlib contributions and code clean-up</li>
+  <li>Andy Konwinski &#8211; documentation fix</li>
+  <li>Anita Tailor &#8211; Cassandra example</li>
+  <li>Ankur Dave &#8211; GraphX (lead) optimizations, documentation, and usability</li>
+  <li>Archer Shao &#8211; bug fixes</li>
+  <li>Arun Ramakrishnan &#8211; improved random sampling</li>
+  <li>Baishuo &#8211; test improvements</li>
+  <li>Bernardo Gomez Palacio &#8211; spark-shell improvements and Mesos updates</li>
+  <li>Bharath Bhushan &#8211; bug fix</li>
+  <li>Bijay Bisht &#8211; bug fixes</li>
+  <li>Binh Nguyen &#8211; dependency fix</li>
+  <li>Bouke van der Bijl &#8211; fixes for PySpark on Mesos and other Mesos fixes</li>
+  <li>Bryn Keller &#8211; improvement to HBase support and unit tests</li>
+  <li>Chen Chao &#8211; documentation, bug fix, and code clean-up</li>
+  <li>Cheng Hao &#8211; performance and feature improvements in Spark SQL</li>
+  <li>Cheng Lian &#8211; column storage and other improvements in Spark SQL</li>
+  <li>Christian Lundgren &#8211; improvement to spark-ec2 scripts</li>
+  <li>DB Tsai &#8211; L-BGFS optimizer in MLlib, MLlib documentation and fixes</li>
+  <li>Dan McClary &#8211; Improvement to stats counter</li>
+  <li>Daniel Darabos &#8211; GraphX performance improvement</li>
+  <li>Davis Shepherd &#8211; bug fix</li>
+  <li>Diana Carroll &#8211; documentation and bug fix</li>
+  <li>Egor Pakhomov &#8211; local iterator for RDD’s</li>
+  <li>Emtiaz Ahmed &#8211; bug fix</li>
+  <li>Erik Selin &#8211; bug fix</li>
+  <li>Ethan Jewett &#8211; documentation improvement</li>
+  <li>Evan Chan &#8211; automatic clean-up of application data</li>
+  <li>Evan Sparks &#8211; MLlib optimizations and doc improvement</li>
+  <li>Frank Dai &#8211; code clean-up in MLlib</li>
+  <li>Guoquiang Li &#8211; build improvements and several bug fixes</li>
+  <li>Ghidireac &#8211; bug fix</li>
+  <li>Haoyuan Li &#8211; Tachyon storage level for RDD’s</li>
+  <li>Harvey Feng &#8211; spark-ec2 update</li>
+  <li>Henry Saputra &#8211; code clean-up</li>
+  <li>Henry Cook &#8211; Spark SQL improvements</li>
+  <li>Holden Karau &#8211; cross validation in MLlib, Python and core engine improvements</li>
+  <li>Ivan Wick &#8211; Mesos bug fix</li>
+  <li>Jey Kottalam &#8211; sbt build improvement</li>
+  <li>Jerry Shao &#8211; Spark metrics and Spark SQL improvements</li>
+  <li>Jiacheng Guo &#8211; bug fix</li>
+  <li>Jianghan &#8211; bug fix</li>
+  <li>Jianping J Wang &#8211; JBLAS support in MLlib</li>
+  <li>Joseph E. Gonzalez &#8211; GraphX improvements, fixes, and documentation</li>
+  <li>Josh Rosen &#8211; PySpark improvements and bug fixes</li>
+  <li>Jyotiska NK &#8211; documentation, test improvements, and bug fix</li>
+  <li>Kan Zhang &#8211; bug fixes in Spark core, SQL, and PySpark</li>
+  <li>Kay Ousterhout &#8211; bug fixes and code refactoring in scheduler</li>
+  <li>Kelvin Chu &#8211; automatic clean-up of application data</li>
+  <li>Kevin Mader &#8211; example fix</li>
+  <li>Koert Kuipers &#8211; code visibility fix</li>
+  <li>Kousuke Saruta &#8211; documentation and build fixes</li>
+  <li>Kyle Ellrott &#8211; improved memory usage for DISK_ONLY persistence</li>
+  <li>Larva Boy &#8211; approximate counts in Spark SQL</li>
+  <li>Madhu Siddalingaiah &#8211; ec2 fixes</li>
+  <li>Manish Amde &#8211; decision trees in MLlib</li>
+  <li>Marcelo Vanzin &#8211; improvements and fixes to YARN support, dependency
clean-up</li>
+  <li>Mark Grover &#8211; build fixes</li>
+  <li>Mark Hamstra &#8211; build and dependency improvements, scheduler bug fixes</li>
+  <li>Margin Jaggi &#8211; MLlib documentation improvements</li>
+  <li>Matei Zaharia &#8211; Python versions of several MLlib algorithms, spark-submit
improvements, bug fixes, and documentation improvements</li>
+  <li>Michael Armbrust &#8211; Spark SQL (lead), including schema support for RDD’s,
catalyst optimizer, and Hive support</li>
+  <li>Mridul Muralidharan &#8211; code visibility changes and bug fixes</li>
+  <li>Nan Zhu &#8211; bug and stability fixes, code clean-up, documentation, and
new features</li>
+  <li>Neville Li &#8211; bug fix</li>
+  <li>Nick Lanham &#8211; Tachyon bundling in distribution script</li>
+  <li>Nirmal Reddy &#8211; code clean-up</li>
+  <li>OuYang Jin &#8211; local mode and json improvements</li>
+  <li>Patrick Wendell &#8211; release manager, build improvements, bug fixes, and
code clean-up</li>
+  <li>Petko Nikolov &#8211; new utility functions</li>
+  <li>Prabeesh K &#8211; typo fix</li>
+  <li>Prabin Banka &#8211; new PySpark API’s</li>
+  <li>Prashant Sharma &#8211; PySpark improvements, Java 8 lambda support, and
build improvements</li>
+  <li>Punya Biswal &#8211; Java API improvements</li>
+  <li>Qiuzhuang Lian &#8211; bug fixes</li>
+  <li>Rahul Singhal &#8211; build improvements, bug fixes</li>
+  <li>Raymond Liu &#8211; YARN build fixes and UI improvements</li>
+  <li>Reynold Xin &#8211; bug fixes, internal changes, Spark SQL improvements,
build fixes, and style improvements</li>
+  <li>Reza Zadeh &#8211; SVD implementation in MLlib and other MLlib contributions</li>
+  <li>Roman Pastukhov &#8211; clean-up of broadcast files</li>
+  <li>Rong Gu &#8211; Tachyon storage level for RDD’s</li>
+  <li>Sandeep Sing &#8211; several bug fixes, MLLib improvements and fixes to Spark
examples</li>
+  <li>Sandy Ryza &#8211; spark-submit script and several YARN improvements</li>
+  <li>Saurabh Rawat  &#8211; Java API improvements</li>
+  <li>Sean Owen &#8211; several build improvements, code clean-up, and MLlib fixes</li>
+  <li>Semih Salihoglu &#8211; GraphX improvements</li>
+  <li>Shaocun Tian &#8211; bug fix in MLlib</li>
+  <li>Shivaram Venkataraman &#8211; bug fixes</li>
+  <li>Shixiong Zhu &#8211; code style and correctness fixes</li>
+  <li>Shiyun Wxm &#8211; typo fix</li>
+  <li>Stevo Slavic &#8211; bug fix</li>
+  <li>Sumedh Mungee &#8211; documentation fix</li>
+  <li>Sundeep Narravula &#8211; “cancel” button in Spark UI</li>
+  <li>Takayu Ueshin &#8211; bug fixes and improvements to Spark SQL</li>
+  <li>Tathagata Das &#8211; web UI and other improvements to Spark Streaming (lead),
bug fixes, state clean-up, and release manager</li>
+  <li>Timothy Chen &#8211; Spark SQL improvements</li>
+  <li>Ted Malaska &#8211; improved Flume support</li>
+  <li>Tom Graves &#8211; Hadoop security integration (lead) and YARN support</li>
+  <li>Tianshuo Deng &#8211; Bug fix</li>
+  <li>Tor Myklebust &#8211; improvements to ALS</li>
+  <li>Wangfei &#8211; Spark SQL docs</li>
+  <li>Wang Tao &#8211; code clean-up</li>
+  <li>William Bendon &#8211; JSON support changes and bug fixes</li>
+  <li>Xiangrui Meng &#8211; several improvements to MLlib (lead)</li>
+  <li>Xuan Nguyen &#8211; build fix</li>
+  <li>Xusen Yin &#8211; MLlib contributions and bug fix</li>
+  <li>Ye Xianjin &#8211; test fixes</li>
+  <li>Yinan Li &#8211; addFile improvement</li>
+  <li>Yin Hua &#8211; Spark SQL improvements</li>
+  <li>Zheng Peng &#8211; bug fixes</li>
 </ul>
 
 <p><em>Thanks to everyone who contributed!</em></p>


