spark-commits mailing list archives

From sro...@apache.org
Subject [6/6] spark-website git commit: Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to third-party-projects.html, Powered By to powered-by.ht
Date Mon, 21 Nov 2016 21:01:26 GMT
Port wiki page Committers to committers.html, Contributing to Spark and Code Style Guide to contributing.html, Third Party Projects and Additional Language Bindings to third-party-projects.html, Powered By to powered-by.html


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/0744e8fd
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/0744e8fd
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/0744e8fd

Branch: refs/heads/asf-site
Commit: 0744e8fdd9f954a6552c968be50604241097dbbc
Parents: 46fb910
Author: Sean Owen <sowen@cloudera.com>
Authored: Sat Nov 19 12:35:02 2016 +0000
Committer: Sean Owen <sowen@cloudera.com>
Committed: Mon Nov 21 20:57:42 2016 +0000

----------------------------------------------------------------------
 _layouts/global.html                            |  13 +-
 committers.md                                   | 167 ++++
 community.md                                    |   2 +-
 contributing.md                                 | 523 +++++++++++++
 documentation.md                                |   2 +-
 faq.md                                          |   4 +-
 graphx/index.md                                 |   2 +-
 index.md                                        |   9 +-
 mllib/index.md                                  |   2 +-
 ...-05-spark-user-survey-and-powered-by-page.md |   2 +-
 powered-by.md                                   | 239 ++++++
 site/committers.html                            | 518 +++++++++++++
 site/community.html                             |  15 +-
 site/contributing.html                          | 771 +++++++++++++++++++
 site/documentation.html                         |  15 +-
 site/downloads.html                             |  13 +-
 site/examples.html                              |  13 +-
 site/faq.html                                   |  17 +-
 site/graphx/index.html                          |  15 +-
 site/index.html                                 |  22 +-
 site/mailing-lists.html                         |  13 +-
 site/mllib/index.html                           |  15 +-
 site/news/amp-camp-2013-registration-ope.html   |  13 +-
 .../news/announcing-the-first-spark-summit.html |  13 +-
 .../news/fourth-spark-screencast-published.html |  13 +-
 site/news/index.html                            |  13 +-
 site/news/nsdi-paper.html                       |  13 +-
 site/news/one-month-to-spark-summit-2015.html   |  13 +-
 .../proposals-open-for-spark-summit-east.html   |  13 +-
 ...registration-open-for-spark-summit-east.html |  13 +-
 .../news/run-spark-and-shark-on-amazon-emr.html |  13 +-
 site/news/spark-0-6-1-and-0-5-2-released.html   |  13 +-
 site/news/spark-0-6-2-released.html             |  13 +-
 site/news/spark-0-7-0-released.html             |  13 +-
 site/news/spark-0-7-2-released.html             |  13 +-
 site/news/spark-0-7-3-released.html             |  13 +-
 site/news/spark-0-8-0-released.html             |  13 +-
 site/news/spark-0-8-1-released.html             |  13 +-
 site/news/spark-0-9-0-released.html             |  13 +-
 site/news/spark-0-9-1-released.html             |  13 +-
 site/news/spark-0-9-2-released.html             |  13 +-
 site/news/spark-1-0-0-released.html             |  13 +-
 site/news/spark-1-0-1-released.html             |  13 +-
 site/news/spark-1-0-2-released.html             |  13 +-
 site/news/spark-1-1-0-released.html             |  13 +-
 site/news/spark-1-1-1-released.html             |  13 +-
 site/news/spark-1-2-0-released.html             |  13 +-
 site/news/spark-1-2-1-released.html             |  13 +-
 site/news/spark-1-2-2-released.html             |  13 +-
 site/news/spark-1-3-0-released.html             |  13 +-
 site/news/spark-1-4-0-released.html             |  13 +-
 site/news/spark-1-4-1-released.html             |  13 +-
 site/news/spark-1-5-0-released.html             |  13 +-
 site/news/spark-1-5-1-released.html             |  13 +-
 site/news/spark-1-5-2-released.html             |  13 +-
 site/news/spark-1-6-0-released.html             |  13 +-
 site/news/spark-1-6-1-released.html             |  13 +-
 site/news/spark-1-6-2-released.html             |  13 +-
 site/news/spark-1-6-3-released.html             |  13 +-
 site/news/spark-2-0-0-released.html             |  13 +-
 site/news/spark-2-0-1-released.html             |  13 +-
 site/news/spark-2-0-2-released.html             |  13 +-
 site/news/spark-2.0.0-preview.html              |  13 +-
 .../spark-accepted-into-apache-incubator.html   |  13 +-
 site/news/spark-and-shark-in-the-news.html      |  13 +-
 site/news/spark-becomes-tlp.html                |  13 +-
 site/news/spark-featured-in-wired.html          |  13 +-
 .../spark-mailing-lists-moving-to-apache.html   |  13 +-
 site/news/spark-meetups.html                    |  13 +-
 site/news/spark-screencasts-published.html      |  13 +-
 site/news/spark-summit-2013-is-a-wrap.html      |  13 +-
 site/news/spark-summit-2014-videos-posted.html  |  13 +-
 site/news/spark-summit-2015-videos-posted.html  |  13 +-
 site/news/spark-summit-agenda-posted.html       |  13 +-
 .../spark-summit-east-2015-videos-posted.html   |  13 +-
 .../spark-summit-east-2016-cfp-closing.html     |  13 +-
 site/news/spark-summit-east-agenda-posted.html  |  13 +-
 .../news/spark-summit-europe-agenda-posted.html |  13 +-
 site/news/spark-summit-europe.html              |  13 +-
 .../spark-summit-june-2016-agenda-posted.html   |  13 +-
 site/news/spark-tips-from-quantifind.html       |  13 +-
 .../spark-user-survey-and-powered-by-page.html  |  15 +-
 site/news/spark-version-0-6-0-released.html     |  13 +-
 .../spark-wins-cloudsort-100tb-benchmark.html   |  13 +-
 ...-wins-daytona-gray-sort-100tb-benchmark.html |  13 +-
 .../strata-exercises-now-available-online.html  |  13 +-
 .../news/submit-talks-to-spark-summit-2014.html |  13 +-
 .../news/submit-talks-to-spark-summit-2016.html |  13 +-
 .../submit-talks-to-spark-summit-east-2016.html |  13 +-
 .../submit-talks-to-spark-summit-eu-2016.html   |  13 +-
 site/news/two-weeks-to-spark-summit-2014.html   |  13 +-
 ...deo-from-first-spark-development-meetup.html |  13 +-
 site/powered-by.html                            | 563 ++++++++++++++
 site/releases/spark-release-0-3.html            |  13 +-
 site/releases/spark-release-0-5-0.html          |  13 +-
 site/releases/spark-release-0-5-1.html          |  13 +-
 site/releases/spark-release-0-5-2.html          |  13 +-
 site/releases/spark-release-0-6-0.html          |  13 +-
 site/releases/spark-release-0-6-1.html          |  13 +-
 site/releases/spark-release-0-6-2.html          |  13 +-
 site/releases/spark-release-0-7-0.html          |  13 +-
 site/releases/spark-release-0-7-2.html          |  13 +-
 site/releases/spark-release-0-7-3.html          |  13 +-
 site/releases/spark-release-0-8-0.html          |  13 +-
 site/releases/spark-release-0-8-1.html          |  13 +-
 site/releases/spark-release-0-9-0.html          |  13 +-
 site/releases/spark-release-0-9-1.html          |  13 +-
 site/releases/spark-release-0-9-2.html          |  13 +-
 site/releases/spark-release-1-0-0.html          |  13 +-
 site/releases/spark-release-1-0-1.html          |  13 +-
 site/releases/spark-release-1-0-2.html          |  13 +-
 site/releases/spark-release-1-1-0.html          |  13 +-
 site/releases/spark-release-1-1-1.html          |  13 +-
 site/releases/spark-release-1-2-0.html          |  13 +-
 site/releases/spark-release-1-2-1.html          |  13 +-
 site/releases/spark-release-1-2-2.html          |  13 +-
 site/releases/spark-release-1-3-0.html          |  13 +-
 site/releases/spark-release-1-3-1.html          |  13 +-
 site/releases/spark-release-1-4-0.html          |  13 +-
 site/releases/spark-release-1-4-1.html          |  13 +-
 site/releases/spark-release-1-5-0.html          |  13 +-
 site/releases/spark-release-1-5-1.html          |  13 +-
 site/releases/spark-release-1-5-2.html          |  13 +-
 site/releases/spark-release-1-6-0.html          |  13 +-
 site/releases/spark-release-1-6-1.html          |  13 +-
 site/releases/spark-release-1-6-2.html          |  13 +-
 site/releases/spark-release-1-6-3.html          |  13 +-
 site/releases/spark-release-2-0-0.html          |  13 +-
 site/releases/spark-release-2-0-1.html          |  13 +-
 site/releases/spark-release-2-0-2.html          |  13 +-
 site/research.html                              |  13 +-
 site/screencasts/1-first-steps-with-spark.html  |  13 +-
 .../2-spark-documentation-overview.html         |  13 +-
 .../3-transformations-and-caching.html          |  13 +-
 .../4-a-standalone-job-in-spark.html            |  13 +-
 site/screencasts/index.html                     |  13 +-
 site/sitemap.xml                                |  30 +-
 site/sql/index.html                             |  15 +-
 site/streaming/index.html                       |  15 +-
 site/third-party-projects.html                  | 287 +++++++
 site/trademarks.html                            |  13 +-
 sql/index.md                                    |   2 +-
 streaming/index.md                              |   2 +-
 third-party-projects.md                         |  84 ++
 144 files changed, 4081 insertions(+), 793 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/_layouts/global.html
----------------------------------------------------------------------
diff --git a/_layouts/global.html b/_layouts/global.html
index 6f02c16..662fb86 100644
--- a/_layouts/global.html
+++ b/_layouts/global.html
@@ -113,7 +113,7 @@
           <li><a href="{{site.baseurl}}/mllib/">MLlib (machine learning)</a></li>
           <li><a href="{{site.baseurl}}/graphx/">GraphX (graph)</a></li>
           <li class="divider"></li>
-          <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects">Third-Party Packages</a></li>
+          <li><a href="{{site.baseurl}}/third-party-projects.html">Third-Party Projects</a></li>
         </ul>
       </li>
       <li class="dropdown">
@@ -131,12 +131,13 @@
           Community <b class="caret"></b>
         </a>
         <ul class="dropdown-menu">
-          <li><a href="{{site.baseurl}}/community.html">Mailing Lists</a></li>
+          <li><a href="{{site.baseurl}}/community.html#mailing-lists">Mailing Lists</a></li>
+          <li><a href="{{site.baseurl}}/contributing.html">Contributing to Spark</a></li>
+          <li><a href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a></li>
           <li><a href="{{site.baseurl}}/community.html#events">Events and Meetups</a></li>
           <li><a href="{{site.baseurl}}/community.html#history">Project History</a></li>
-          <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By</a></li>
-          <li><a href="https://cwiki.apache.org/confluence/display/SPARK/Committers">Project Committers</a></li>
-          <li><a href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a></li>
+          <li><a href="{{site.baseurl}}/powered-by.html">Powered By</a></li>
+          <li><a href="{{site.baseurl}}/committers.html">Project Committers</a></li>
         </ul>
       </li>
       <li><a href="{{site.baseurl}}/faq.html">FAQ</a></li>
@@ -184,7 +185,7 @@
         <li><a href="{{site.baseurl}}/mllib/">MLlib (machine learning)</a></li>
         <li><a href="{{site.baseurl}}/graphx/">GraphX (graph)</a></li>
       </ul>
-      <a href="https://cwiki.apache.org/confluence/display/SPARK/Third+Party+Projects">Third-Party Packages</a>
+      <a href="{{site.baseurl}}/third-party-projects.html">Third-Party Projects</a>
     </div>
   </div>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/committers.md
----------------------------------------------------------------------
diff --git a/committers.md b/committers.md
new file mode 100644
index 0000000..03defa6
--- /dev/null
+++ b/committers.md
@@ -0,0 +1,167 @@
+---
+layout: global
+title: Committers
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+<h2>Current Committers</h2>
+
+|Name|Organization|
+|----|------------|
+|Michael Armbrust|Databricks|
+|Joseph Bradley|Databricks|
+|Felix Cheung|Automattic|
+|Mosharaf Chowdhury|University of Michigan, Ann Arbor|
+|Jason Dai|Intel|
+|Tathagata Das|Databricks|
+|Ankur Dave|UC Berkeley|
+|Aaron Davidson|Databricks|
+|Thomas Dudziak|Facebook|
+|Robert Evans|Yahoo!|
+|Wenchen Fan|Databricks|
+|Joseph Gonzalez|UC Berkeley|
+|Thomas Graves|Yahoo!|
+|Stephen Haberman|Bizo|
+|Mark Hamstra|ClearStory Data|
+|Herman van Hovell|QuestTec B.V.|
+|Yin Huai|Databricks|
+|Shane Huang|Intel|
+|Andy Konwinski|Databricks|
+|Ryan LeCompte|Quantifind|
+|Haoyuan Li|Alluxio, UC Berkeley|
+|Xiao Li|IBM|
+|Davies Liu|Databricks|
+|Cheng Lian|Databricks|
+|Yanbo Liang|Hortonworks|
+|Sean McNamara|Webtrends|
+|Xiangrui Meng|Databricks|
+|Mridul Muralidharam|Hortonworks|
+|Andrew Or|Princeton University|
+|Kay Ousterhout|UC Berkeley|
+|Sean Owen|Cloudera|
+|Nick Pentreath|IBM|
+|Imran Rashid|Cloudera|
+|Charles Reiss|UC Berkeley|
+|Josh Rosen|Databricks|
+|Sandy Ryza|Clover Health|
+|Kousuke Saruta|NTT Data|
+|Prashant Sharma|IBM|
+|Ram Sriharsha|Databricks|
+|DB Tsai|Netflix|
+|Marcelo Vanzin|Cloudera|
+|Shivaram Venkataraman|UC Berkeley|
+|Patrick Wendell|Databricks|
+|Andrew Xia|Alibaba|
+|Reynold Xin|Databricks|
+|Matei Zaharia|Databricks, Stanford|
+|Shixiong Zhu|Databricks|
+
+<h3>Becoming a Committer</h3>
+
+To get started contributing to Spark, learn 
+<a href="{{site.baseurl}}/contributing.html">how to contribute</a> – 
+anyone can submit patches, documentation and examples to the project.
+
+The PMC regularly adds new committers from the active contributors, based on their contributions 
+to Spark. The qualifications for new committers include:
+
+1. Sustained contributions to Spark: Committers should have a history of major contributions to 
+Spark. An ideal committer will have contributed broadly throughout the project, and have 
+contributed at least one major component where they have taken an "ownership" role. An ownership 
+role means that existing contributors feel that they should run patches for this component by 
+this person.
+2. Quality of contributions: Committers more than any other community member should submit simple, 
+well-tested, and well-designed patches. In addition, they should show sufficient expertise to be 
+able to review patches, including making sure they fit within Spark's engineering practices 
+(testability, documentation, API stability, code style, etc). The committership is collectively 
+responsible for the software quality and maintainability of Spark.
+3. Community involvement: Committers should have a constructive and friendly attitude in all 
+community interactions. They should also be active on the dev and user list and help mentor 
+newer contributors and users. In design discussions, committers should maintain a professional 
+and diplomatic approach, even in the face of disagreement.
+
+The type and level of contributions considered may vary by project area -- for example, we 
+greatly encourage contributors who want to work on mainly the documentation, or mainly on 
+platform support for specific OSes, storage systems, etc.
+
+<h3>Review Process</h3>
+
+All contributions should be reviewed before merging as described in 
+<a href="{{site.baseurl}}/contributing.html">Contributing to Spark</a>. 
+In particular, if you are working on an area of the codebase you are unfamiliar with, look at the 
+Git history for that code to see who reviewed patches before. You can do this using 
+`git log --format=full <filename>`, by examining the "Commit" field to see who committed each patch.
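The lookup described above can be sketched as follows; this uses a throwaway repository purely to illustrate the command, and the file name is hypothetical (in a real Spark checkout you would pass an actual source path):

```shell
# Sketch: inspect who committed (not just authored) each patch to a file.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.name "Jane Committer"
git config user.email "jane@example.com"
echo "x" > example.scala
git add example.scala
git commit -q -m "Add example"
# `--format=full` prints both "Author:" and "Commit:" lines; the "Commit:"
# lines name the committer who merged each patch.
git log --format=full -- example.scala | grep '^Commit:'
```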
+
+<h3>How to Merge a Pull Request</h3>
+
+Changes pushed to the master branch on Apache cannot be removed; that is, we can't force-push to 
+it. So please don't add any test commits or anything like that, only real patches.
+
+All merges should be done using the 
+[dev/merge_spark_pr.py](https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py) 
+script, which squashes the pull request's changes into one commit. To use this script, you 
+will need to add a git remote called "apache" at https://git-wip-us.apache.org/repos/asf/spark.git, 
+as well as one called "apache-github" at `git://github.com/apache/spark`. For the `apache` repo, 
+you can authenticate using your ASF username and password. Ask Patrick if you have trouble with 
+this or want help doing your first merge.
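The remote setup the script expects can be sketched like this (remote names and URLs are taken from the text above; the throwaway repository here only stands in for your real Spark clone):

```shell
# Sketch: add the two remotes the merge script looks for.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git remote add apache https://git-wip-us.apache.org/repos/asf/spark.git
git remote add apache-github git://github.com/apache/spark
git remote -v
```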
+
+The script is fairly self-explanatory and walks you through the steps and options interactively.
+
+If you want to amend a commit before merging – which should be used for trivial touch-ups – 
+then simply let the script wait at the point where it asks you if you want to push to Apache. 
+Then, in a separate window, modify the code and push a commit. Run `git rebase -i HEAD~2` and 
+"squash" your new commit. Just after that, edit the combined commit message to remove your 
+extra commit's message. 
+You can verify the result is one change with `git log`. Then resume the script in the other window.
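A non-interactive approximation of the squash step above, in a throwaway repository; the real flow uses your editor, and `GIT_SEQUENCE_EDITOR` here merely automates marking the newest commit as `squash`:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q .
git config user.name "Committer"
git config user.email "c@example.com"
echo base > f; git add f; git commit -q -m "base"
echo pr >> f;  git add f; git commit -q -m "PR change"
echo fix >> f; git add f; git commit -q -m "trivial touch-up"
# Equivalent of running `git rebase -i HEAD~2` and changing the second
# "pick" line to "squash"; GIT_EDITOR=true accepts the combined message.
GIT_SEQUENCE_EDITOR='sed -i -e "2s/^pick/squash/"' GIT_EDITOR=true \
  git rebase -i HEAD~2
git log --oneline
```

After this, `git log` shows two commits: the base, then the single squashed change.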
+
+Also, please remember to set Assignee on JIRAs where applicable when they are resolved. The script 
+can't do this automatically.
+
+<!--
+<h3>Minimize use of MINOR, BUILD, and HOTFIX with no JIRA</h3>
+
+From pwendell at https://www.mail-archive.com/dev@spark.apache.org/msg09565.html:
+It would be great if people could create JIRA's for any and all merged pull requests. The reason is 
+that when patches get reverted due to build breaks or other issues, it is very difficult to keep 
+track of what is going on if there is no JIRA. 
+Here is a list of 5 patches we had to revert recently that didn't include a JIRA:
+    Revert "[MINOR] [BUILD] Use custom temp directory during build."
+    Revert "[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior"
+    Revert "[BUILD] Always run SQL tests in master build."
+    Revert "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic allocation on."
+    Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing its files"
+
+The cost overhead of creating a JIRA relative to other aspects of development is very small. 
+If it's really a documentation change or something small, that's okay.
+
+But anything affecting the build, packaging, etc. These all need to have a JIRA to ensure that 
+follow-up can be well communicated to all Spark developers.
+-->
+
+<h3>Policy on Backporting Bug Fixes</h3>
+
+From <a href="https://www.mail-archive.com/dev@spark.apache.org/msg10284.html">`pwendell`</a>:
+
+The trade off when backporting is you get to deliver the fix to people running older versions 
+(great!), but you risk introducing new or even worse bugs in maintenance releases (bad!). 
+The decision point is when you have a bug fix and it's not clear whether it is worth backporting.
+
+I think the following facets are important to consider:
+- Backports are an extremely valuable service to the community and should be considered for 
+any bug fix.
+- Introducing a new bug in a maintenance release must be avoided at all costs. Over time it would 
+erode confidence in our release process.
+- Distributions or advanced users can always backport risky patches on their own, if they see fit.
+
+For me, the consequence of these is that we should backport in the following situations:
+- Both the bug and the fix are well understood and isolated. Code being modified is well tested.
+- The bug being addressed is high priority to the community.
+- The backported fix does not vary widely from the master branch fix.
+
+We tend to avoid backports in the converse situations:
+- The bug or fix is not well understood. For instance, it relates to interactions between complex 
+components or third party libraries (e.g. Hadoop libraries). The code is not well tested outside 
+of the immediate bug being fixed.
+- The bug is not clearly a high priority for the community.
+- The backported fix is widely different from the master branch fix.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/community.md
----------------------------------------------------------------------
diff --git a/community.md b/community.md
index c4f83a5..3bff6ad 100644
--- a/community.md
+++ b/community.md
@@ -191,7 +191,7 @@ Spark Meetups are grass-roots events organized and hosted by leaders and champio
 
 <h3>Powered By</h3>
 
-<p>Our wiki has a list of <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">projects and organizations powered by Spark</a>.</p>
+<p>Our wiki has a list of <a href="{{site.baseurl}}/powered-by.html">projects and organizations powered by Spark</a>.</p>
 
 <a name="history"></a>
 <h3>Project History</h3>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/contributing.md
----------------------------------------------------------------------
diff --git a/contributing.md b/contributing.md
new file mode 100644
index 0000000..5ee066f
--- /dev/null
+++ b/contributing.md
@@ -0,0 +1,523 @@
+---
+layout: global
+title: Contributing to Spark
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+
+This guide documents the best way to make various types of contribution to Apache Spark, 
+including what is required before submitting a code change.
+
+Contributing to Spark doesn't just mean writing code. Helping new users on the mailing list, 
+testing releases, and improving documentation are also welcome. In fact, proposing significant 
+code changes usually requires first gaining experience and credibility within the community by 
+helping in other ways. This is also a guide to becoming an effective contributor.
+
+So, this guide organizes contributions in the order in which new contributors who intend to get 
+involved long-term should probably consider them. Build some track record of helping others, 
+rather than just opening pull requests.
+
+<h2>Contributing by Helping Other Users</h2>
+
+A great way to contribute to Spark is to help answer user questions on the `user@spark.apache.org` 
+mailing list or on StackOverflow. There are always many new Spark users; taking a few minutes to 
+help answer a question is a very valuable community service.
+
+Contributors should subscribe to this list and follow it in order to keep up to date on what's 
+happening in Spark. Answering questions is an excellent and visible way to help the community, 
+which also demonstrates your expertise.
+
+See the <a href="{{site.baseurl}}/mailing-lists.html">Mailing Lists guide</a> for guidelines 
+about how to effectively participate in discussions on the mailing list, as well as forums 
+like StackOverflow.
+
+<h2>Contributing by Testing Releases</h2>
+
+Spark's release process is community-oriented, and members of the community can vote on new 
+releases on the `dev@spark.apache.org` mailing list. Spark users are invited to subscribe to 
+this list to receive announcements, test their workloads on newer releases, and provide 
+feedback on any performance or correctness issues they find.
+
+<h2>Contributing by Reviewing Changes</h2>
+
+Changes to Spark source code are proposed, reviewed and committed via 
+<a href="http://github.com/apache/spark/pulls">Github pull requests</a> (described later). 
+Anyone can view and comment on active changes here. 
+Reviewing others' changes is a good way to learn how the change process works and gain exposure 
+to activity in various parts of the code. You can help by reviewing the changes and asking 
+questions or pointing out issues -- as simple as typos or small issues of style.
+See also https://spark-prs.appspot.com/ for a convenient way to view and filter open PRs.
+
+<h2>Contributing Documentation Changes</h2>
+
+To propose a change to _release_ documentation (that is, docs that appear under 
+<a href="https://spark.apache.org/docs/">https://spark.apache.org/docs/</a>), 
+edit the Markdown source files in Spark's 
+<a href="https://github.com/apache/spark/tree/master/docs">`docs/`</a> directory, 
+whose `README` file shows how to build the documentation locally to test your changes.
+The process to propose a doc change is otherwise the same as the process for proposing code 
+changes below. 
+
+To propose a change to the rest of the documentation (that is, docs that do _not_ appear under 
+<a href="https://spark.apache.org/docs/">https://spark.apache.org/docs/</a>), similarly, edit the Markdown in the 
+<a href="https://github.com/apache/spark-website">spark-website repository</a> and open a pull request.
+
+<h2>Contributing User Libraries to Spark</h2>
+
+Just as Java and Scala applications can access a huge selection of libraries and utilities, 
+none of which are part of Java or Scala themselves, Spark aims to support a rich ecosystem of 
+libraries. Many new useful utilities or features belong outside of Spark rather than in the core. 
+For example: language support probably has to be a part of core Spark, but, useful machine 
+learning algorithms can happily exist outside of MLlib.
+
+To that end, large and independent new functionality is often rejected for inclusion in Spark 
+itself, but, can and should be hosted as a separate project and repository, and included in 
+the <a href="http://spark-packages.org/">spark-packages.org</a> collection.
+
+<h2>Contributing Bug Reports</h2>
+
+Ideally, bug reports are accompanied by a proposed code change to fix the bug. This isn't 
+always possible, as those who discover a bug may not have the experience to fix it. A bug 
+may be reported by creating a JIRA but without creating a pull request (see below).
+
+Bug reports are only useful, however, if they include enough information to understand, isolate 
+and ideally reproduce the bug. Simply encountering an error does not mean a bug should be 
+reported; as described below, search JIRA, and search or inquire on the Spark user / dev mailing 
+lists first. Unreproducible bugs, or simple error reports, may be closed.
+
+It is possible to propose new features as well. These are generally not helpful unless 
+accompanied by detail, such as a design document and/or code change. Large new contributions 
+should consider <a href="http://spark-packages.org/">spark-packages.org</a> first (see above), 
+or be discussed on the mailing 
+list first. Feature requests may be rejected, or closed after a long period of inactivity.
+
+<h2>Contributing to JIRA Maintenance</h2>
+
+Given the sheer volume of issues raised in the Apache Spark JIRA, inevitably some issues are 
+duplicates, or become obsolete and eventually fixed otherwise, or can't be reproduced, or could 
+benefit from more detail, and so on. It's useful to help identify these issues and resolve them, 
+either by advancing the discussion or even resolving the JIRA. Most contributors are able to 
+directly resolve JIRAs. Use judgment in determining whether you are quite confident the issue 
+should be resolved, although changes can be easily undone. If in doubt, just leave a comment 
+on the JIRA.
+
+When resolving JIRAs, observe a few useful conventions:
+
+- Resolve as **Fixed** if there's a change you can point to that resolved the issue
+  - Set Fix Version(s), if and only if the resolution is Fixed
+  - Set Assignee to the person who most contributed to the resolution, which is usually the person 
+  who opened the PR that resolved the issue.
+  - In case several people contributed, prefer to assign to the more 'junior', non-committer contributor
+- For issues that can't be reproduced against master as reported, resolve as **Cannot Reproduce**
+  - Fixed is reasonable too, if it's clear what other previous pull request resolved it. Link to it.
+- If the issue is the same as or a subset of another issue, resolve as **Duplicate**
+  - Make sure to link to the JIRA it duplicates
+  - Prefer to resolve the issue that has less activity or discussion as the duplicate
+- If the issue seems clearly obsolete and applies to issues or components that have changed 
+radically since it was opened, resolve as **Not a Problem**
+- If the issue doesn't make sense – for example, it's not actionable, or is a non-Spark issue – 
+resolve as **Invalid**
+- If it's a coherent issue, but there is a clear indication that there is not support or interest 
+in acting on it, then resolve as **Won't Fix**
+- Umbrellas are frequently marked **Done** if they are just container issues that don't correspond 
+to an actionable change of their own
+
+<h2>Preparing to Contribute Code Changes</h2>
+
+<h3>Choosing What to Contribute</h3>
+
+Spark is an exceptionally busy project, with a new JIRA or pull request every few hours on average. 
+Review can take hours or days of committer time. Everyone benefits if contributors focus on 
+changes that are useful, clear, easy to evaluate, and already pass basic checks.
+
+Sometimes, a contributor will already have a particular new change or bug in mind. If seeking 
+ideas, consult the list of starter tasks in JIRA, or ask the `user@spark.apache.org` mailing list.
+
+Before proceeding, contributors should evaluate if the proposed change is likely to be relevant, 
+new and actionable:
+
+- Is it clear that code must change? Proposing a JIRA and pull request is appropriate only when a 
+clear problem or change has been identified. If simply having trouble using Spark, use the mailing 
+lists first, rather than consider filing a JIRA or proposing a change. When in doubt, email 
+`user@spark.apache.org` first about the possible change
+- Search the `user@spark.apache.org` and `dev@spark.apache.org` mailing list 
+<a href="{{site.baseurl}}/community.html#mailing-lists">archives</a> for 
+related discussions. Use <a href="http://search-hadoop.com/?q=&fc_project=Spark">search-hadoop.com</a> 
+or similar search tools. 
+Often, the problem has been discussed before, with a resolution that doesn't require a code 
+change, or recording what kinds of changes will not be accepted as a resolution.
+- Search JIRA for existing issues: 
+<a href="https://issues.apache.org/jira/browse/SPARK">https://issues.apache.org/jira/browse/SPARK</a> 
+- Type `spark [search terms]` in the top-right search box. If a logically similar issue already 
+exists, then contribute to the discussion on the existing JIRA and pull request first, instead of 
+creating a new one.
+- Is the scope of the change matched to the contributor's level of experience? Anyone is qualified 
+to suggest a typo fix, but refactoring core scheduling logic requires much more understanding of 
+Spark. Some changes require building up experience first (see above).
+
+<h3>MLlib-specific Contribution Guidelines</h3>
+
+While a rich set of algorithms is an important goal for MLlib, scaling the project requires 
+that maintainability, consistency, and code quality come first. New algorithms should:
+
+- Be widely known
+- Be used and accepted (academic citations and concrete use cases can help justify this)
+- Be highly scalable
+- Be well documented
+- Have APIs consistent with other algorithms in MLlib that accomplish the same thing
+- Come with a reasonable expectation of developer support.
+- Have `@Since` annotation on public classes, methods, and variables.
+
+<h3>Code Review Criteria</h3>
+
+Before considering how to contribute code, it's useful to understand how code is reviewed, 
+and why changes may be rejected. Simply put, changes that have many or large positives, and 
+few negative effects or risks, are much more likely to be merged, and merged quickly. 
+Risky and less valuable changes are very unlikely to be merged, and may be rejected outright 
+rather than receive iterations of review.
+
+<h4>Positives</h4>
+
+- Fixes the root cause of a bug in existing functionality
+- Adds functionality or fixes a problem needed by a large number of users
+- Simple, targeted
+- Maintains or improves consistency across Python, Java, Scala
+- Easily tested; has tests
+- Reduces complexity and lines of code
+- Change has already been discussed and is known to committers
+
+<h4>Negatives, Risks</h4>
+
+- Band-aids a symptom of a bug only
+- Introduces complex new functionality, especially an API that needs to be supported
+- Adds complexity that only helps a niche use case
+- Adds user-space functionality that does not need to be maintained in Spark, but could be hosted 
+externally and indexed by <a href="http://spark-packages.org/">spark-packages.org</a> 
+- Changes a public API or semantics (rarely allowed)
+- Adds large dependencies
+- Changes versions of existing dependencies
+- Adds a large amount of code
+- Makes lots of modifications in one "big bang" change
+
+<h2>Contributing Code Changes</h2>
+
+Please review the preceding section before proposing a code change. This section documents how to do so.
+
+**When you contribute code, you affirm that the contribution is your original work and that you 
+license the work to the project under the project's open source license. Whether or not you state 
+this explicitly, by submitting any copyrighted material via pull request, email, or other means 
+you agree to license the material under the project's open source license and warrant that you 
+have the legal authority to do so.**
+
+<h3>JIRA</h3>
+
+Generally, Spark uses JIRA to track logical issues, including bugs and improvements, and uses 
+Github pull requests to manage the review and merge of specific code changes. That is, JIRAs are 
+used to describe _what_ should be fixed or changed, and high-level approaches, and pull requests 
+describe _how_ to implement that change in the project's source code. For example, major design 
+decisions are discussed in JIRA.
+
+1. Find the existing Spark JIRA that the change pertains to.
+    1. Do not create a new JIRA if your change addresses an existing issue in JIRA; add to 
+    the existing discussion and work instead
+    1. Look for existing pull requests that are linked from the JIRA, to understand if someone is 
+    already working on the JIRA
+1. If the change is new, then it usually needs a new JIRA. However, trivial changes, where _what_ 
+should change is virtually the same as _how_ it should change, do not require a JIRA. 
+Example: `Fix typos in Foo scaladoc`
+1. If required, create a new JIRA:
+    1. Provide a descriptive Title. "Update web UI" or "Problem in scheduler" is not sufficient.
+    "Kafka Streaming support fails to handle empty queue in YARN cluster mode" is good.
+    1. Write a detailed Description. For bug reports, this should ideally include a short 
+    reproduction of the problem. For new features, it may include a design document.
+    1. Set required fields:
+        1. **Issue Type**. Generally, Bug, Improvement and New Feature are the only types used in Spark.
+        1. **Priority**. Set to Major or below; higher priorities are generally reserved for 
+        committers to set. JIRA unfortunately tends to conflate "size" and "importance" in its 
+        Priority field values. Their meaning is roughly:
+             1. Blocker: pointless to release without this change as the release would be unusable 
+             to a large minority of users
+             1. Critical: a large minority of users are missing important functionality without 
+             this, and/or a workaround is difficult
+             1. Major: a small minority of users are missing important functionality without this, 
+             and there is a workaround
+             1. Minor: a niche use case is missing some support, but it does not affect usage or 
+             is easily worked around
+             1. Trivial: a nice-to-have change, but otherwise unlikely to be any problem in practice 
+        1. **Component**
+        1. **Affects Version**. For Bugs, assign at least one version that is known to exhibit the 
+        problem or need the change
+    1. Do not set the following fields:
+        1. **Fix Version**. This is assigned by committers only when resolved.
+        1. **Target Version**. This is assigned by committers to indicate a PR has been accepted for 
+        possible fix by the target version.
+    1. Do not include a patch file; pull requests are used to propose the actual change.
+1. If the change is a large change, consider inviting discussion on the issue at 
+`dev@spark.apache.org` first before proceeding to implement the change.
+
+<h3>Pull Request</h3>
+
+1. <a href="https://help.github.com/articles/fork-a-repo/">Fork</a> the Github repository at 
+<a href="http://github.com/apache/spark">http://github.com/apache/spark</a> if you haven't already
+1. Clone your fork, create a new branch, push commits to the branch.
+1. Consider whether documentation or tests need to be added or updated as part of the change, 
+and add them as needed.
+1. Run all tests with `./dev/run-tests` to verify that the code still compiles, passes tests, and 
+passes style checks. If style checks fail, review the Code Style Guide below.
+1. <a href="https://help.github.com/articles/using-pull-requests/">Open a pull request</a> against 
+the `master` branch of `apache/spark`. (Only in special cases would the PR be opened against other branches.)
+     1. The PR title should be of the form `[SPARK-xxxx][COMPONENT] Title`, where `SPARK-xxxx` is 
+     the relevant JIRA number, `COMPONENT` is one of the PR categories shown at 
+     <a href="https://spark-prs.appspot.com/">spark-prs.appspot.com</a>, and 
+     Title may be the JIRA's title or a more specific title describing the PR itself.
+     1. If the pull request is still a work in progress, and so is not ready to be merged, 
+     but needs to be pushed to Github to facilitate review, then add `[WIP]` after the component.
+     1. Consider identifying committers or other contributors who have worked on the code being 
+     changed. Find the file(s) in Github and click "Blame" to see a line-by-line annotation of 
+     who changed the code last. You can add `@username` in the PR description to ping them 
+     immediately.
+     1. Please state that the contribution is your original work and that you license the work 
+     to the project under the project's open source license.
+1. The related JIRA, if any, will be marked as "In Progress" and your pull request will 
+automatically be linked to it. There is no need to be the Assignee of the JIRA to work on it, 
+though you are welcome to comment that you have begun work.
+1. The Jenkins automatic pull request builder will test your changes
+     1. If it is your first contribution, Jenkins will wait for confirmation before building 
+     your code and post "Can one of the admins verify this patch?"
+     1. A committer can authorize testing with a comment like "ok to test"
+     1. A committer can automatically allow future pull requests from a contributor to be 
+     tested with a comment like "Jenkins, add to whitelist"
+1. After about 2 hours, Jenkins will post the results of the test to the pull request, along 
+with a link to the full results on Jenkins.
+1. Watch for the results, and investigate and fix failures promptly
+     1. Fixes can simply be pushed to the same branch from which you opened your pull request
+     1. Jenkins will automatically re-test when new commits are pushed
+     1. If the tests failed for reasons unrelated to the change (e.g. Jenkins outage), then a 
+     committer can request a re-test with "Jenkins, retest this please". 
+     Ask if you need a test restarted.
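+The branch-and-push workflow in the steps above can be sketched in shell. This is a minimal 
+offline illustration: a local bare repository stands in for your GitHub fork, and the branch 
+name `SPARK-12345` and commit title are placeholders, not a real JIRA.

```shell
# Sketch of the topic-branch workflow, run entirely offline:
# a local bare repo stands in for a GitHub fork of apache/spark.
set -e
workdir=$(mktemp -d)
cd "$workdir"

git init --bare fork.git          # stand-in for your fork on GitHub
git clone fork.git spark
cd spark
git config user.email "you@example.com"
git config user.name "Your Name"
git commit --allow-empty -m "seed the default branch"

# Create a topic branch and commit the change with a PR-style title
git checkout -b SPARK-12345
echo "fix" > fix.txt
git add fix.txt
git commit -m "[SPARK-12345][CORE] Describe the change here"

# Push the branch to the fork; the pull request would then be opened
# on GitHub from this branch against master of apache/spark.
git push origin SPARK-12345
git log --oneline -1 origin/SPARK-12345
```

+In the real workflow, the clone would point at your actual fork, and `./dev/run-tests` would be 
+run before pushing.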
+
+<h3>The Review Process</h3>
+
+- Other reviewers, including committers, may comment on the changes and suggest modifications. 
+Changes can be added by simply pushing more commits to the same branch.
+- Lively, polite, rapid technical debate is encouraged from everyone in the community. The outcome 
+may be a rejection of the entire change.
+- Reviewers can indicate that a change looks suitable for merging with a comment such as: "I think 
+this patch looks good". Spark uses the LGTM convention for indicating the strongest level of 
+technical sign-off on a patch: simply comment with the word "LGTM". It specifically means: "I've 
+looked at this thoroughly and take as much ownership as if I wrote the patch myself". If you 
+comment LGTM you will be expected to help with bugs or follow-up issues on the patch. Consistent, 
+judicious use of LGTMs is a great way to gain credibility as a reviewer with the broader community.
+- Sometimes, other changes will be merged which conflict with your pull request's changes. The 
+PR can't be merged until the conflict is resolved. This can be resolved with `git fetch origin` 
+followed by `git merge origin/master` and resolving the conflicts by hand, then pushing the result 
+to your branch.
+- Try to be responsive to the discussion rather than let days pass between replies
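+The conflict-resolution commands above can be sketched end to end. This is a self-contained 
+illustration using a throwaway local repository as a stand-in for the upstream remote; the 
+file name and commit messages are made up for the example.

```shell
# Reproduce and resolve a merge conflict with upstream master, offline.
set -e
workdir=$(mktemp -d)
cd "$workdir"

git init --bare origin.git        # stand-in for the upstream repository
git clone origin.git spark
cd spark
git config user.email "you@example.com"
git config user.name "Your Name"
echo "original" > app.txt
git add app.txt
git commit -m "initial"
git push origin HEAD
branch=$(git rev-parse --abbrev-ref HEAD)   # master or main, per git version

# The PR branch edits app.txt ...
git checkout -b my-pr
echo "pr change" > app.txt
git commit -am "my change"

# ... while a conflicting change lands on the upstream default branch
git checkout "$branch"
echo "upstream change" > app.txt
git commit -am "upstream change"
git push origin "$branch"

# Back on the PR branch: fetch, merge, resolve by hand, then push
git checkout my-pr
git fetch origin
git merge "origin/$branch" || true          # stops with a conflict in app.txt
echo "resolved change" > app.txt            # hand-resolved content
git add app.txt
git commit -m "Merge upstream and resolve conflict"
git push origin my-pr
```

+Note that `git merge` exits non-zero when it hits a conflict; the `|| true` lets the sketch 
+continue under `set -e` to the manual resolution step.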
+
+<h3>Closing Your Pull Request / JIRA</h3>
+
+- If a change is accepted, it will be merged and the pull request will automatically be closed, 
+along with the associated JIRA if any
+  - Note that in the rare case you are asked to open a pull request against a branch besides 
+  `master`, you will actually have to close the pull request manually
+  - The JIRA will be Assigned to the primary contributor to the change as a way of giving credit. 
+  If the JIRA isn't closed and/or Assigned promptly, comment on the JIRA.
+- If your pull request is ultimately rejected, please close it promptly
+  - ... because committers can't close PRs directly
+  - Pull requests will be automatically closed by an automated process at Apache after about a 
+  week if a committer has made a comment like "mind closing this PR?" This means that the 
+  committer is specifically requesting that it be closed.
+- If a pull request has gotten little or no attention, consider improving the description or 
+the change itself and ping likely reviewers again after a few days. Consider proposing a 
+change that's easier to include, like a smaller and/or less invasive change.
+- If it has been reviewed but not taken up after weeks, even after soliciting review from the 
+most relevant reviewers, or has met with neutral reactions, the outcome may be considered a 
+"soft no". It is helpful to withdraw and close the PR in this case.
+- If a pull request is closed because it is deemed not the right approach to resolve a JIRA, 
+then leave the JIRA open. However if the review makes it clear that the issue identified in 
+the JIRA is not going to be resolved by any pull request (not a problem, won't fix) then also 
+resolve the JIRA.
+
+<a name="code-style-guide"></a>
+<h2>Code Style Guide</h2>
+
+Please follow the style of the existing codebase.
+
+- For Python code, Apache Spark follows 
+<a href="http://legacy.python.org/dev/peps/pep-0008/">PEP 8</a> with one exception: 
+lines can be up to 100 characters in length, not 79.
+- For Java code, Apache Spark follows 
+<a href="http://www.oracle.com/technetwork/java/codeconvtoc-136057.html">Oracle's Java code conventions</a>. 
+Many Scala guidelines below also apply to Java.
+- For Scala code, Apache Spark follows the official 
+<a href="http://docs.scala-lang.org/style/">Scala style guide</a>, but with the changes described below.
+
+<h3>Line Length</h3>
+
+Limit lines to 100 characters. The only exceptions are import statements (although even for 
+those, try to keep them under 100 chars).
+
+<h3>Indentation</h3>
+
+Use 2-space indentation in general. For function declarations, use 4-space indentation for the 
+parameters when they don't fit on a single line. For example:
+
+```scala
+// Correct:
+if (true) {
+  println("Wow!")
+}
+ 
+// Wrong:
+if (true) {
+    println("Wow!")
+}
+ 
+// Correct:
+def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
+    path: String,
+    fClass: Class[F],
+    kClass: Class[K],
+    vClass: Class[V],
+    conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
+  // function body
+}
+ 
+// Wrong
+def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]](
+  path: String,
+  fClass: Class[F],
+  kClass: Class[K],
+  vClass: Class[V],
+  conf: Configuration = hadoopConfiguration): RDD[(K, V)] = {
+  // function body
+}
+```
+
+<h3>Code documentation style</h3>
+
+For Scaladoc / Javadoc comments before classes, objects, and methods, use Javadoc style 
+instead of Scaladoc style.
+
+```scala
+/** This is a correct one-liner, short description. */
+ 
+/**
+ * This is correct multi-line JavaDoc comment. And
+ * this is my second line, and if I keep typing, this would be
+ * my third line.
+ */
+ 
+/** In Spark, we don't use the ScalaDoc style so this
+  * is not correct.
+  */
+```
+ 
+For inline comments within the code, use `//` and not `/* .. */`.
+
+```scala
+// This is a short, single line comment
+ 
+// This is a multi line comment.
+// Bla bla bla
+ 
+/*
+ * Do not use this style for multi line comments. This
+ * style of comment interferes with commenting out
+ * blocks of code, and also makes code comments harder
+ * to distinguish from Scala doc / Java doc comments.
+ */
+ 
+/**
+ * Do not use scala doc style for inline comments.
+ */
+```
+
+<h3>Imports</h3>
+
+Always import packages using absolute paths (e.g. `scala.util.Random`) instead of relative ones 
+(e.g. `util.Random`). In addition, sort imports in the following order 
+(use alphabetical order within each group):
+- `java.*` and `javax.*`
+- `scala.*`
+- Third-party libraries (`org.*`, `com.*`, etc)
+- Project classes (`org.apache.spark.*`)
+
+The <a href="https://plugins.jetbrains.com/plugin/7350">IntelliJ import organizer plugin</a> 
+can organize imports for you. Use this configuration for the plugin (configured under 
+Preferences / Editor / Code Style / Scala Imports Organizer):
+
+```scala
+import java.*
+import javax.*
+ 
+import scala.*
+ 
+import *
+ 
+import org.apache.spark.*
+```
+
+<h3>Infix Methods</h3>
+
+Don't use infix notation for methods that aren't operators. For example, instead of 
+`list map func`, use `list.map(func)`, and instead of `string contains "foo"`, use 
+`string.contains("foo")`. This keeps the code familiar to developers coming from other languages.
+
+<h3>Curly Braces</h3>
+
+Put curly braces even around one-line `if`, `else` or loop statements. The only exception is if 
+you are using `if/else` as a one-line ternary operator.
+
+```scala
+// Correct:
+if (true) {
+  println("Wow!")
+}
+ 
+// Correct:
+if (true) statement1 else statement2
+ 
+// Wrong:
+if (true)
+  println("Wow!")
+```
+
+<h3>Return Types</h3>
+
+Always specify the return types of methods where possible. If a method returns no value, specify 
+`Unit` as the return type, in accordance with the Scala style guide. Type annotations for variables 
+are not required unless the definition involves large code blocks with potentially ambiguous values.
+
+```scala
+// Correct:
+def getSize(partitionId: String): Long = { ... }
+def compute(partitionId: String): Unit = { ... }
+ 
+// Wrong:
+def getSize(partitionId: String) = { ... }
+def compute(partitionId: String) = { ... }
+def compute(partitionId: String) { ... }
+ 
+// Correct:
+val name = "black-sheep"
+val path: Option[String] =
+  try {
+    Option(names)
+      .map { ns => ns.split(",") }
+      .flatMap { ns => ns.filter(_.nonEmpty).headOption }
+      .map { n => "prefix" + n + "suffix" }
+      .flatMap { n => if (n.hashCode % 3 == 0) Some(n + n) else None }
+  } catch {
+    case e: SomeSpecialException =>
+      computePath(names)
+  }
+```
+
+<h3>If in Doubt</h3>
+
+If you're not sure about the right style for something, try to follow the style of the existing 
+codebase. Look at whether there are other examples in the code that use your feature. Feel free 
+to ask on the `dev@spark.apache.org` list as well.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/documentation.md
----------------------------------------------------------------------
diff --git a/documentation.md b/documentation.md
index 0fa10c2..465f432 100644
--- a/documentation.md
+++ b/documentation.md
@@ -178,7 +178,7 @@ Slides, videos and EC2-based exercises from each of these are available online:
 
 <ul><li>
 The <a href="https://cwiki.apache.org/confluence/display/SPARK/Wiki+Homepage">Spark wiki</a> contains
-information for developers, such as architecture documents and how to <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">contribute</a> to Spark.
+information for developers, such as architecture documents and how to <a href="{{site.baseurl}}/contributing.html">contribute</a> to Spark.
 </li></ul>
 
 <h3>Research Papers</h3>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/faq.md
----------------------------------------------------------------------
diff --git a/faq.md b/faq.md
index 8d048aa..7b2fa15 100644
--- a/faq.md
+++ b/faq.md
@@ -15,7 +15,7 @@ Spark is a fast and general processing engine compatible with Hadoop data. It ca
 
 <p class="question">Who is using Spark in production?</p>
 
-<p class="answer">As of 2016, surveys show that more than 1000 organizations are using Spark in production. Some of them are listed on the <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By page</a> and at the <a href="http://spark-summit.org">Spark Summit</a>.</p>
+<p class="answer">As of 2016, surveys show that more than 1000 organizations are using Spark in production. Some of them are listed on the <a href="{{site.baseurl}}/powered-by.html">Powered By page</a> and at the <a href="http://spark-summit.org">Spark Summit</a>.</p>
 
 
 <p class="question">How large a cluster can Spark scale to?</p>
@@ -67,7 +67,7 @@ Please also refer to our
 
 <p class="question">How can I contribute to Spark?</p>
 
-<p class="answer">See the <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">Contributing to Spark wiki</a> for more information.</p>
+<p class="answer">See the <a href="{{site.baseurl}}/contributing.html">Contributing to Spark page</a> for more information.</p>
 
 <p class="question">Where can I get more help?</p>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/graphx/index.md
----------------------------------------------------------------------
diff --git a/graphx/index.md b/graphx/index.md
index a3aa8d2..dd283ef 100644
--- a/graphx/index.md
+++ b/graphx/index.md
@@ -87,7 +87,7 @@ subproject: GraphX
     </p>
     <p>
       GraphX is in the alpha stage and welcomes contributions. If you'd like to submit a change to GraphX,
-      read <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">how to
+      read <a href="{{site.baseurl}}/contributing.html">how to
       contribute to Spark</a> and send us a patch!
     </p>
   </div>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/index.md
----------------------------------------------------------------------
diff --git a/index.md b/index.md
index 14185d2..b20a4b0 100644
--- a/index.md
+++ b/index.md
@@ -130,9 +130,7 @@ navigation:
     <p>
       Spark is used at a wide range of organizations to process large datasets.
       You can find example use cases at the <a href="http://spark-summit.org/summit-2013/">Spark Summit</a>
-      conference, or on the
-      <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">Powered By</a>
-      page.
+      conference, or on the <a href="{{site.baseurl}}/powered-by.html">Powered By</a> page.
     </p>
 
     <p>
@@ -156,14 +154,13 @@ navigation:
 
     <p>
       The project's
-      <a href="https://cwiki.apache.org/confluence/display/SPARK/Committers">committers</a>
+      <a href="{{site.baseurl}}/committers.html">committers</a>
       come from 19 organizations.
     </p>
 
     <p>
       If you'd like to participate in Spark, or contribute to the libraries on top of it, learn
-      <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">how to
-        contribute</a>.
+      <a href="{{site.baseurl}}/contributing.html">how to contribute</a>.
     </p>
   </div>
 

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/mllib/index.md
----------------------------------------------------------------------
diff --git a/mllib/index.md b/mllib/index.md
index 61e65a8..9c43750 100644
--- a/mllib/index.md
+++ b/mllib/index.md
@@ -114,7 +114,7 @@ subproject: MLlib
     </p>
     <p>
       MLlib is still a rapidly growing project and welcomes contributions. If you'd like to submit an algorithm to MLlib,
-      read <a href="https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark">how to
+      read <a href="{{site.baseurl}}/contributing.html">how to
       contribute to Spark</a> and send us a patch!
     </p>
   </div>

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
----------------------------------------------------------------------
diff --git a/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md b/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
index 8a06597..542610a 100644
--- a/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
+++ b/news/_posts/2013-09-05-spark-user-survey-and-powered-by-page.md
@@ -13,6 +13,6 @@ meta:
 ---
 As we continue developing Spark, we would love to get feedback from users and hear what you'd like us to work on next. We've decided that a good way to do that is a survey -- we hope to run this at regular intervals. If you have a few minutes to participate, <a href="https://docs.google.com/forms/d/1eMXp4GjcIXglxJe5vYYBzXKVm-6AiYt1KThJwhCjJiY/viewform">fill in the survey here</a>. Your time is greatly appreciated.
 
-In parallel, we are starting a <a href="https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark">"powered by" page</a> on the Apache Spark wiki for organizations that are using, or contributing to, Spark. Sign up if you'd like to support the project! This is a great way to let the world know you're using Spark, and can also be helpful to generate leads for recruiting. You can also add yourself when you fill the survey.
+In parallel, we are starting a <a href="{{site.baseurl}}/powered-by.html">"powered by" page</a> for organizations that are using, or contributing to, Spark. Sign up if you'd like to support the project! This is a great way to let the world know you're using Spark, and can also be helpful to generate leads for recruiting. You can also add yourself when you fill the survey.
 
 Thanks for taking the time to give feedback.

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/powered-by.md
----------------------------------------------------------------------
diff --git a/powered-by.md b/powered-by.md
new file mode 100644
index 0000000..5ecfafb
--- /dev/null
+++ b/powered-by.md
@@ -0,0 +1,239 @@
+---
+layout: global
+title: Powered By Spark
+type: "page singular"
+navigation:
+  weight: 5
+  show: true
+---
+
+<h2>Project and Product names using "Spark"</h2>
+
+Organizations creating products and projects for use with Apache Spark, along with associated 
+marketing materials, should take care to respect the trademark in "Apache Spark" and its logo. 
+Please refer to <a href="http://www.apache.org/foundation/marks/">ASF Trademarks Guidance</a> and 
+associated <a href="http://www.apache.org/foundation/marks/faq/">FAQ</a> 
+for comprehensive and authoritative guidance on proper usage of ASF trademarks.
+
+Names that do not include "Spark" at all have no potential trademark issue with the Spark project. 
+This is recommended.
+
+Names like "Spark BigCoProduct" are not OK, as are names including "Spark" in general. 
+The above links, however, describe some exceptions, like for names such as "BigCoProduct, 
+powered by Apache Spark" or "BigCoProduct for Apache Spark".
+
+It is common practice to create software identifiers (Maven coordinates, module names, etc.) 
+like "spark-foo". These are permitted. Nominative use of trademarks in descriptions is also 
+always allowed, as in "BigCoProduct is a widget for Apache Spark".
+
+<h2>Companies and Organizations</h2>
+
+To add yourself to the list, please email `dev@spark.apache.org` with your organization name, URL, 
+a list of which Spark components you are using, and a short description of your use case.
+
+- <a href="http://amplab.cs.berkeley.edu">UC Berkeley AMPLab</a> - Big data research lab that 
+initially launched Spark
+  - We're building a variety of open source projects on Spark
+  - We have both graduate students and a team of professional software engineers working on the stack
+- <a href="http://4quant.com">4Quant</a>
+- <a href="http://www.actnowib.com">Act Now</a>
+  - Spark powers NOW APPS, a big data, real-time, predictive analytics platform. We use Spark SQL, 
+  MLlib and GraphX components for both batch ETL and analytics applied to telecommunication data, 
+  providing faster and more meaningful insights and actionable data to the operators.
+- <a href="http://adatao.com">Adatao, Inc.</a> - Data Intelligence for All
+  - Visual, Real-Time, Predictive Analytics on Spark+Hadoop, with built-in support for R, Python, 
+  SQL, and Natural Language.
+  - Team of ex-Googlers and Yahoos with large-scale infrastructure experience 
+  (including both flavors of MapReduce at Google and Yahoo) and PhDs in ML/Data Mining
+  - Determined that Spark, among the many alternatives, answered the right problem statements with 
+  the right design
+- <a href="http://www.agilelab.it">Agile Lab</a>
+  - enhancing big data. 360 customer view, log analysis, BI
+- <a href="http://www.taobao.com/">Alibaba Taobao</a>
+  - We built one of the world's first Spark on YARN production clusters.
+  - See our blog posts (in Chinese) about Spark at Taobao: 
+  <a href="http://rdc.taobao.org/?tag=spark">http://rdc.taobao.org/?tag=spark</a>
+- <a href="http://alpinenow.com/">Alpine Data Labs</a>
+- <a href="http://amazon.com">Amazon</a>
+- <a href="http://www.amrita.edu/cyber/">Amrita Center for Cyber Security Systems and Networks</a>
+- <a href="http://www.art.com/">Art.com</a>
+  - Trending analytics and personalization
+- <a href="http://www.asiainfo.com">AsiaInfo</a>
+  - We are using Spark Core, Streaming, MLlib and GraphX. We leverage Spark and the Hadoop 
+  ecosystem to build cost-effective data center solutions for our customers in the telco industry 
+  as well as other industrial sectors.
+- <a href="http://www.atigeo.com">Atigeo</a> – integrated Spark in xPatterns, our big data 
+analytics platform, as a replacement for Hadoop MR
+- <a href="https://atp.io">atp</a>
+  - Predictive models and learning algorithms to improve the relevance of programmatic marketing.
+  - Components used: Spark SQL, MLLib.
+- <a href="http://www.autodesk.com">Autodesk</a>
+- <a href="http://www.baidu.com">Baidu</a>
+- <a href="http://www.bakdata.com/">Bakdata</a> – using Spark (and Shark) to perform interactive 
+exploration of large datasets
+- <a href="http://www.bigindustries.be/">Big Industries</a> - using Spark Streaming: The 
+Big Content Platform is a business-to-business content asset management service providing a 
+searchable, aggregated source of live news feeds, public domain media and archives of content.
+- <a href="http://www.bizo.com">Bizo</a>
+  - Check out our talk on <a href="http://www.meetup.com/spark-users/events/139804022/">Spark at Bizo</a> 
+  at Spark user meetup
+- <a href="http://www.celtra.com">Celtra</a>
+- <a href="http://www.clearstorydata.com">ClearStory Data</a> – ClearStory's platform and 
+integrated Data Intelligence application leverages Spark to speed analysis across internal 
+and external data sources, driving holistic and actionable insights.
+- <a href="https://www.concur.com">Concur</a>
+  - Spark SQL, MLlib
+  - Using Spark for travel and expenses analytics and personalization
+- <a href="http://www.contentsquare.com">Content Square</a>
+  - We use Spark to regularly read raw data, convert them into Parquet, and process them to 
+  create advanced analytics dashboards: aggregation, sampling, statistics computations, 
+  anomaly detection, machine learning.
+- <a href="http://www.conviva.com">Conviva</a> – Experience Live
+  - See our talk at <a href="http://ampcamp.berkeley.edu/3/">AmpCamp</a> on how we are 
+  <a href="http://www.youtube.com/watch?feature=player_detailpage&v=YaayAatdRNs">using Spark to 
+  provide real time video optimization</a>
+- <a href="https://www.creditkarma.com/">Credit Karma</a>
+  - We create personalized experiences using Spark.
+- <a href="http://databricks.com">Databricks</a>
+  - Formed by the creators of Apache Spark and Shark, Databricks is working to greatly expand these 
+  open source projects and transform big data analysis in the process. We're deeply committed to 
+  keeping all work on these systems open source.
+  - We provide a hosted service to run Spark, 
+  <a href="http://www.databricks.com/cloud">Databricks Cloud</a>, and partner to 
+  <a href="http://databricks.com/support/">support Apache Spark</a> with other Hadoop and big 
+  data companies.
+- <a href="http://dianping.com">Dianping.com</a>
+- <a href="http://www.digby.com">Digby</a>
+- <a href="http://www.drawbrid.ge/">Drawbridge</a>
+- <a href="http://www.ebay.com/">eBay Inc.</a>
+  - Using Spark core for log transaction aggregation and analytics
+- <a href="http://labs.elsevier.com">Elsevier Labs</a>
+  - Use Case: Building Machine Reading Pipeline, Knowledge Graphs, Content as a Service, Content 
+  and Event Analytics, Content/Event based Predictive Models and Big Data Processing.
+  - We use Scala and Python over Databricks Notebooks for most of our work.
+- <a href="http://www.eurecom.fr/en">EURECOM</a>
+- <a href="http://www.exabeam.com">Exabeam</a>
+- <a href="http://www.faimdata.com/">Faimdata</a>
+  - Build eCommerce and data intelligence solutions to the retail industry on top of 
+  Spark/Shark/Spark Streaming
+- <a href="http://falkonry.com">Falkonry</a>
+- <a href="http://www.flytxt.com">Flytxt</a>
+  - Big Data analytics for subscriber profiling and personalization in telecommunications domain. 
+  We are using Spark Core and MLlib.
+- <a href="http://www.jeremyfreeman.net">Freeman Lab at HHMI</a>
+  - We are using Spark for analyzing and visualizing patterns in large-scale recordings of brain 
+  activity in real time
+- <a href="http://www.fundacionctic.org">Fundacion CTIC</a>
+- <a href="http://graphflow.com">GraphFlow, Inc.</a>
+- <a href="http://www.groupon.com/app/subscriptions/new_zip?division_p=san-francisco">Groupon</a>
+- <a href="http://www.guavus.com/">Guavus</a>
+  - Stream processing of network machine data
+- <a href="http://www.hitachi-solutions.com/">Hitachi Solutions</a>
+- <a href="http://hivedata.com/">The Hive</a>
+- <a href="http://www.research.ibm.com/labs/almaden/index.shtml">IBM Almaden</a>
+- <a href="http://www.infoobjects.com">InfoObjects</a>
+  - Award winning Big Data consulting company with focus on Spark and Hadoop
+- <a href="http://en.inspur.com">Inspur</a>
+- <a href="http://www.sehir.edu.tr/en/">Istanbul Sehir University</a>
+- <a href="http://www.kenshoo.com/">Kenshoo</a>
+  - Digital marketing solutions and predictive media optimization
+- <a href="http://www.kelkoo.co.uk">Kelkoo</a>
+  - Using Spark Core, SQL, and Streaming. Product recommendations, BI and analytics, 
+  real-time malicious activity filtering, and data mining.
+- <a href="http://www.knoldus.com">Knoldus Software LLC</a>
+- <a href="http://eng.localytics.com">Localytics</a>
+  - Batch, real-time, and predictive analytics driving our mobile app analytics and marketing 
+  automation product.
+  - Components used: Spark, Spark Streaming, MLLib.
+- <a href="http://magine.com">Magine TV</a>
+- <a href="http://mediacrossing.com">MediaCrossing</a> – Digital Media Trading Experts in the 
+New York and Boston areas
+  - We are using Spark as a drop-in replacement for Hadoop Map/Reduce to get the right answer 
+  to our queries in a much shorter amount of time.
+- <a href="http://www.myfitnesspal.com/">MyFitnessPal</a>
+  - Using Spark to clean up user-entered food data using both explicit and implicit user signals 
+  with the final goal of identifying high-quality food items.
+  - Using Spark to build different recommendation systems for recipes and foods.
+- <a href="http://deepspace.jpl.nasa.gov/">NASA JPL - Deep Space Network</a>
+- <a href="http://www.163.com/">Netease</a>
+- <a href="http://www.nflabs.com">NFLabs</a>
+- <a href="http://nsn.com">Nokia Solutions and Networks</a>
+- <a href="http://www.nttdata.com/global/en/">NTT DATA</a>
+- <a href="http://www.nubetech.co">Nube Technologies</a>
+  - Nube provides solutions for data curation at scale, supporting customer targeting, accurate 
+  inventory, and efficient analysis.
+- <a href="http://ooyala.com">Ooyala, Inc.</a> – Powering personalized video experiences 
+across all screens
+  - See our blog post on how we use 
+  <a href="http://engineering.ooyala.com/blog/fast-spark-queries-memory-datasets">Spark for 
+  Fast Queries</a>
+  - See our presentation on 
+  <a href="http://www.slideshare.net/EvanChan2/cassandra2013-spark-talk-final">Cassandra, Spark, 
+  and Shark</a>
+- <a href="http://www.opentable.com/">Opentable</a>
+  - Using Apache Spark for log processing and ETL. The data obtained feeds the recommender 
+  system powered by Spark MLLIB Matrix Factorization. We are evaluating the use of Spark 
+  Streaming for real-time analytics.
+- <a href="http://pantera.io">PanTera</a>
+  - PanTera is a tool for exploring large datasets. It uses Spark to create XY and geographic 
+  scatterplots from millions to billions of datapoints.
+  - Components we are using: Spark Core (Scala API), Spark SQL, and GraphX
+- <a href="http://www.peerialism.com">Peerialism</a>
+- <a href="http://www.planbmedia.com">PlanBMedia</a>
+- <a href="http://prediction.io/">PredictionIO</a> - PredictionIO currently offers two engine 
+templates for Apache Spark MLlib for recommendation (MLlib ALS) and classification (MLlib Naive 
+Bayes). With these templates, you can create a custom predictive engine for production deployment 
+efficiently.
+- <a href="http://premise.com">Premise</a>
+- <a href="http://www.quantifind.com">Quantifind</a>
+- <a href="http://radius.com">Radius Intelligence</a>
+  - Using Scala, Spark and MLLib for Radius Marketing and Sales intelligence platform including 
+  data aggregation, data processing, data clustering, data analysis and predictive modeling of all 
+  US businesses.
+- <a href="http://www.realimpactanalytics.com/">Real Impact Analytics</a>
+  - Building large scale analytics platforms for telecoms operators
+- <a href="http://rocketfuel.com/">RocketFuel</a>
+- <a href="http://www.rondhuit.com/">RONDHUIT</a>
+  - Machine Learning with Apache Mahout and Spark 
+  <a href="http://www.rondhuit.com/services/training/mahout-ML.html">http://www.rondhuit.com/services/training/mahout-ML.html</a>
+- <a href="http://www.sailthru.com/">Sailthru</a>
+  - Uses Spark to build predictive models and recommendation systems for marketing automation 
+  and personalization.
+- <a href="http://www.sisa.samsung.com/">Samsung Research America</a>
+- <a href="http://www.shopify.com/">Shopify</a>
+- <a href="http://www.simba.com/">Simba Technologies</a>
+  - BI/reporting/ETL for Spark and beyond
+- <a href="http://www.sinnia.com">Sinnia</a>
+- <a href="http://www.sktelecom.com/en/main/index.do">SK Telecom</a>
+  - SK Telecom analyses customers' mobile usage patterns with Spark and Shark.
+- <a href="http://socialmetrix.com/">Socialmetrix</a>
+- <a href="http://www.sohu.com">Sohu</a>
+- <a href="http://www.stratio.com/">Stratio</a>
+  - Offers an open-source Big Data platform centered around Apache Spark.
+- <a href="https://www.taboola.com/">Taboola</a> – Powering 'Content You May Like' around the web
+- <a href="http://www.techbase.com.tr">Techbase</a>
+- <a href="http://tencent.com/">Tencent</a>
+- <a href="http://www.tetraconcepts.com/">Tetra Concepts</a>
+- <a href="http://www.trendmicro.com/us/index.html">TrendMicro</a>
+- <a href="http://engineering.tripadvisor.com/using-apache-spark-for-massively-parallel-nlp/">TripAdvisor</a>
+- <a href="http://truedash.io">truedash</a>
+  - Automatic pulling of all your data into Spark for enterprise visualisation, predictive 
+  analytics and data exploration at a low cost.
+- <a href="http://www.trueffect.com">TruEffect Inc</a>
+- <a href="http://www.tuplejump.com">Tuplejump</a>
+  - Software development partners for Apache Spark and Cassandra projects
+- <a href="http://www.ucsc.edu">UC Santa Cruz</a>
+- <a href="http://missouri.edu/">University of Missouri Data Analytics and Discovery Lab</a>
+- <a href="http://videoamp.com/">VideoAmp</a>
+  - Intelligent video ads for online and television viewing audiences.
+- <a href="http://www.vistarmedia.com">Vistar Media</a>
+  - Location technology company enabling brands to reach on-the-go consumers
+- <a href="http://www.yahoo.com">Yahoo!</a>
+- <a href="http://www.yandex.com">Yandex</a>
+  - Using Spark in 
+  <a href="http://www.searchenginejournal.com/yandex-islands-markup-issues-implementation/71891/">Yandex Islands</a>, 
+  to process islands identified by the search robot
+- <a href="http://www.zaloni.com/products/">Zaloni</a>
+  - Zaloni's data lake management platform (Bedrock) and self-service data preparation solution 
+  (Mica) leverage Spark for fast execution of transformations and data exploration.
+  
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/spark-website/blob/0744e8fd/site/committers.html
----------------------------------------------------------------------
diff --git a/site/committers.html b/site/committers.html
new file mode 100644
index 0000000..bad4414
--- /dev/null
+++ b/site/committers.html
@@ -0,0 +1,518 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+  <meta charset="utf-8">
+  <meta http-equiv="X-UA-Compatible" content="IE=edge">
+  <meta name="viewport" content="width=device-width, initial-scale=1.0">
+
+  <title>
+     Committers | Apache Spark
+    
+  </title>
+
+  
+
+  
+
+  <!-- Bootstrap core CSS -->
+  <link href="/css/cerulean.min.css" rel="stylesheet">
+  <link href="/css/custom.css" rel="stylesheet">
+
+  <!-- Code highlighter CSS -->
+  <link href="/css/pygments-default.css" rel="stylesheet">
+
+  <script type="text/javascript">
+  <!-- Google Analytics initialization -->
+  var _gaq = _gaq || [];
+  _gaq.push(['_setAccount', 'UA-32518208-2']);
+  _gaq.push(['_trackPageview']);
+  (function() {
+    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
+    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
+    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
+  })();
+
+  <!-- Adds slight delay to links to allow async reporting -->
+  function trackOutboundLink(link, category, action) {
+    try {
+      _gaq.push(['_trackEvent', category , action]);
+    } catch(err){}
+
+    setTimeout(function() {
+      document.location.href = link.href;
+    }, 100);
+  }
+  </script>
+
+  <!-- HTML5 shim and Respond.js IE8 support of HTML5 elements and media queries -->
+  <!--[if lt IE 9]>
+  <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
+  <script src="https://oss.maxcdn.com/libs/respond.js/1.3.0/respond.min.js"></script>
+  <![endif]-->
+</head>
+
+<body>
+
+<script src="https://code.jquery.com/jquery.js"></script>
+<script src="https://netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
+<script src="/js/lang-tabs.js"></script>
+<script src="/js/downloads.js"></script>
+
+<div class="container" style="max-width: 1200px;">
+
+<div class="masthead">
+  
+    <p class="lead">
+      <a href="/">
+      <img src="/images/spark-logo-trademark.png"
+        style="height:100px; width:auto; vertical-align: bottom; margin-top: 20px;"></a><span class="tagline">
+          Lightning-fast cluster computing
+      </span>
+    </p>
+  
+</div>
+
+<nav class="navbar navbar-default" role="navigation">
+  <!-- Brand and toggle get grouped for better mobile display -->
+  <div class="navbar-header">
+    <button type="button" class="navbar-toggle" data-toggle="collapse"
+            data-target="#navbar-collapse-1">
+      <span class="sr-only">Toggle navigation</span>
+      <span class="icon-bar"></span>
+      <span class="icon-bar"></span>
+      <span class="icon-bar"></span>
+    </button>
+  </div>
+
+  <!-- Collect the nav links, forms, and other content for toggling -->
+  <div class="collapse navbar-collapse" id="navbar-collapse-1">
+    <ul class="nav navbar-nav">
+      <li><a href="/downloads.html">Download</a></li>
+      <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+          Libraries <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/sql/">SQL and DataFrames</a></li>
+          <li><a href="/streaming/">Spark Streaming</a></li>
+          <li><a href="/mllib/">MLlib (machine learning)</a></li>
+          <li><a href="/graphx/">GraphX (graph)</a></li>
+          <li class="divider"></li>
+          <li><a href="/third-party-projects.html">Third-Party Projects</a></li>
+        </ul>
+      </li>
+      <li class="dropdown">
+        <a href="#" class="dropdown-toggle" data-toggle="dropdown">
+          Documentation <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/docs/latest/">Latest Release (Spark 2.0.2)</a></li>
+          <li><a href="/documentation.html">Older Versions and Other Resources</a></li>
+        </ul>
+      </li>
+      <li><a href="/examples.html">Examples</a></li>
+      <li class="dropdown">
+        <a href="/community.html" class="dropdown-toggle" data-toggle="dropdown">
+          Community <b class="caret"></b>
+        </a>
+        <ul class="dropdown-menu">
+          <li><a href="/community.html#mailing-lists">Mailing Lists</a></li>
+          <li><a href="/contributing.html">Contributing to Spark</a></li>
+          <li><a href="https://issues.apache.org/jira/browse/SPARK">Issue Tracker</a></li>
+          <li><a href="/community.html#events">Events and Meetups</a></li>
+          <li><a href="/community.html#history">Project History</a></li>
+          <li><a href="/powered-by.html">Powered By</a></li>
+          <li><a href="/committers.html">Project Committers</a></li>
+        </ul>
+      </li>
+      <li><a href="/faq.html">FAQ</a></li>
+    </ul>
+    <ul class="nav navbar-nav navbar-right">
+      <li class="dropdown">
+        <a href="http://www.apache.org/" class="dropdown-toggle" data-toggle="dropdown">
+          Apache Software Foundation <b class="caret"></b></a>
+        <ul class="dropdown-menu">
+          <li><a href="http://www.apache.org/">Apache Homepage</a></li>
+          <li><a href="http://www.apache.org/licenses/">License</a></li>
+          <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li>
+          <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+          <li><a href="http://www.apache.org/security/">Security</a></li>
+        </ul>
+      </li>
+    </ul>
+  </div>
+  <!-- /.navbar-collapse -->
+</nav>
+
+
+<div class="row">
+  <div class="col-md-3 col-md-push-9">
+    <div class="news" style="margin-bottom: 20px;">
+      <h5>Latest News</h5>
+      <ul class="list-unstyled">
+        
+          <li><a href="/news/spark-wins-cloudsort-100tb-benchmark.html">Spark wins CloudSort Benchmark as the most efficient engine</a>
+          <span class="small">(Nov 15, 2016)</span></li>
+        
+          <li><a href="/news/spark-2-0-2-released.html">Spark 2.0.2 released</a>
+          <span class="small">(Nov 14, 2016)</span></li>
+        
+          <li><a href="/news/spark-1-6-3-released.html">Spark 1.6.3 released</a>
+          <span class="small">(Nov 07, 2016)</span></li>
+        
+          <li><a href="/news/spark-2-0-1-released.html">Spark 2.0.1 released</a>
+          <span class="small">(Oct 03, 2016)</span></li>
+        
+      </ul>
+      <p class="small" style="text-align: right;"><a href="/news/index.html">Archive</a></p>
+    </div>
+    <div class="hidden-xs hidden-sm">
+      <a href="/downloads.html" class="btn btn-success btn-lg btn-block" style="margin-bottom: 30px;">
+        Download Spark
+      </a>
+      <p style="font-size: 16px; font-weight: 500; color: #555;">
+        Built-in Libraries:
+      </p>
+      <ul class="list-none">
+        <li><a href="/sql/">SQL and DataFrames</a></li>
+        <li><a href="/streaming/">Spark Streaming</a></li>
+        <li><a href="/mllib/">MLlib (machine learning)</a></li>
+        <li><a href="/graphx/">GraphX (graph)</a></li>
+      </ul>
+      <a href="/third-party-projects.html">Third-Party Projects</a>
+    </div>
+  </div>
+
+  <div class="col-md-9 col-md-pull-3">
+    <h2>Current Committers</h2>
+
+<table>
+  <thead>
+    <tr>
+      <th>Name</th>
+      <th>Organization</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>Michael Armbrust</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Joseph Bradley</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Felix Cheung</td>
+      <td>Automattic</td>
+    </tr>
+    <tr>
+      <td>Mosharaf Chowdhury</td>
+      <td>University of Michigan, Ann Arbor</td>
+    </tr>
+    <tr>
+      <td>Jason Dai</td>
+      <td>Intel</td>
+    </tr>
+    <tr>
+      <td>Tathagata Das</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Ankur Dave</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Aaron Davidson</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Thomas Dudziak</td>
+      <td>Facebook</td>
+    </tr>
+    <tr>
+      <td>Robert Evans</td>
+      <td>Yahoo!</td>
+    </tr>
+    <tr>
+      <td>Wenchen Fan</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Joseph Gonzalez</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Thomas Graves</td>
+      <td>Yahoo!</td>
+    </tr>
+    <tr>
+      <td>Stephen Haberman</td>
+      <td>Bizo</td>
+    </tr>
+    <tr>
+      <td>Mark Hamstra</td>
+      <td>ClearStory Data</td>
+    </tr>
+    <tr>
+      <td>Herman van Hovell</td>
+      <td>QuestTec B.V.</td>
+    </tr>
+    <tr>
+      <td>Yin Huai</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Shane Huang</td>
+      <td>Intel</td>
+    </tr>
+    <tr>
+      <td>Andy Konwinski</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Ryan LeCompte</td>
+      <td>Quantifind</td>
+    </tr>
+    <tr>
+      <td>Haoyuan Li</td>
+      <td>Alluxio, UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Xiao Li</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Davies Liu</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Cheng Lian</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Yanbo Liang</td>
+      <td>Hortonworks</td>
+    </tr>
+    <tr>
+      <td>Sean McNamara</td>
+      <td>Webtrends</td>
+    </tr>
+    <tr>
+      <td>Xiangrui Meng</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Mridul Muralidharam</td>
+      <td>Hortonworks</td>
+    </tr>
+    <tr>
+      <td>Andrew Or</td>
+      <td>Princeton University</td>
+    </tr>
+    <tr>
+      <td>Kay Ousterhout</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Sean Owen</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Nick Pentreath</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Imran Rashid</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Charles Reiss</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Josh Rosen</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Sandy Ryza</td>
+      <td>Clover Health</td>
+    </tr>
+    <tr>
+      <td>Kousuke Saruta</td>
+      <td>NTT Data</td>
+    </tr>
+    <tr>
+      <td>Prashant Sharma</td>
+      <td>IBM</td>
+    </tr>
+    <tr>
+      <td>Ram Sriharsha</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>DB Tsai</td>
+      <td>Netflix</td>
+    </tr>
+    <tr>
+      <td>Marcelo Vanzin</td>
+      <td>Cloudera</td>
+    </tr>
+    <tr>
+      <td>Shivaram Venkataraman</td>
+      <td>UC Berkeley</td>
+    </tr>
+    <tr>
+      <td>Patrick Wendell</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Andrew Xia</td>
+      <td>Alibaba</td>
+    </tr>
+    <tr>
+      <td>Reynold Xin</td>
+      <td>Databricks</td>
+    </tr>
+    <tr>
+      <td>Matei Zaharia</td>
+      <td>Databricks, Stanford</td>
+    </tr>
+    <tr>
+      <td>Shixiong Zhu</td>
+      <td>Databricks</td>
+    </tr>
+  </tbody>
+</table>
+
+<h3>Becoming a Committer</h3>
+
+<p>To get started contributing to Spark, learn 
+<a href="/contributing.html">how to contribute</a> – 
+anyone can submit patches, documentation and examples to the project.</p>
+
+<p>The PMC regularly adds new committers from the active contributors, based on their contributions 
+to Spark. The qualifications for new committers include:</p>
+
+<ol>
+  <li>Sustained contributions to Spark: Committers should have a history of major contributions to 
+Spark. An ideal committer will have contributed broadly throughout the project, and have 
+contributed at least one major component where they have taken an &#8220;ownership&#8221; role. An ownership 
+role means that existing contributors feel that they should run patches for this component by 
+this person.</li>
+  <li>Quality of contributions: Committers more than any other community member should submit simple, 
+well-tested, and well-designed patches. In addition, they should show sufficient expertise to be 
+able to review patches, including making sure they fit within Spark&#8217;s engineering practices 
+(testability, documentation, API stability, code style, etc). The committership is collectively 
+responsible for the software quality and maintainability of Spark.</li>
+  <li>Community involvement: Committers should have a constructive and friendly attitude in all 
+community interactions. They should also be active on the dev and user list and help mentor 
+newer contributors and users. In design discussions, committers should maintain a professional 
+and diplomatic approach, even in the face of disagreement.</li>
+</ol>
+
+<p>The type and level of contributions considered may vary by project area &#8211; for example, we 
+greatly encourage contributors who want to work mainly on documentation, or mainly on 
+platform support for specific OSes, storage systems, etc.</p>
+
+<h3>Review Process</h3>
+
+<p>All contributions should be reviewed before merging as described in 
+<a href="/contributing.html">Contributing to Spark</a>. 
+In particular, if you are working on an area of the codebase you are unfamiliar with, look at the 
+Git history for that code to see who reviewed patches before. You can do this using 
+<code>git log --format=full &lt;filename&gt;</code>, by examining the &#8220;Commit&#8221; field to see who committed each patch.</p>
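<p>A minimal sketch of this lookup in a throwaway repository (the file name and identity below are made up for illustration; in practice you would run only the <code>git log</code> line against the real file in your Spark clone):</p>

```shell
# Scratch repo demonstrating the full log format; the "Commit:" field
# records who committed each patch, which can differ from "Author:".
cd "$(mktemp -d)"
git init -q .
git config user.email reviewer@example.com
git config user.name "Example Committer"
echo "object Example" > Example.scala
git add Example.scala
git commit -q -m "Add Example.scala"
git log --format=full -- Example.scala
```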
+
+<h3>How to Merge a Pull Request</h3>
+
+<p>Changes pushed to the master branch on Apache cannot be removed; that is, we can&#8217;t force-push to 
+it. So please don&#8217;t add any test commits or anything like that, only real patches.</p>
+
+<p>All merges should be done using the 
+<a href="https://github.com/apache/spark/blob/master/dev/merge_spark_pr.py">dev/merge_spark_pr.py</a> 
+script, which squashes the pull request&#8217;s changes into one commit. To use this script, you 
+will need to add a git remote called &#8220;apache&#8221; at https://git-wip-us.apache.org/repos/asf/spark.git, 
+as well as one called &#8220;apache-github&#8221; at <code>git://github.com/apache/spark</code>. For the <code>apache</code> repo, 
+you can authenticate using your ASF username and password. Ask Patrick if you have trouble with 
+this or want help doing your first merge.</p>
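<p>Concretely, the remote setup might look like the following sketch, shown in a scratch repository so it is safe to copy; run the two <code>git remote add</code> commands in your actual Spark clone:</p>

```shell
# Add the two remotes the merge script expects.
cd "$(mktemp -d)"
git init -q .
git remote add apache https://git-wip-us.apache.org/repos/asf/spark.git
git remote add apache-github git://github.com/apache/spark
git remote -v   # both remotes should now be listed
```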
+
+<p>The script is fairly self-explanatory and walks you through the steps and options interactively.</p>
+
+<p>If you want to amend a commit before merging – which should be used for trivial touch-ups – 
+then simply let the script wait at the point where it asks you if you want to push to Apache. 
+Then, in a separate window, modify the code and push a commit. Run <code>git rebase -i HEAD~2</code> and 
+&#8220;squash&#8221; your new commit. When the editor opens just after, remove your new commit&#8217;s message so only the original remains. 
+You can verify the result is one change with <code>git log</code>. Then resume the script in the other window.</p>
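<p>For illustration, here is a non-interactive sketch of the same squash step in a scratch repository; it uses <code>fixup</code>, which discards the touch-up&#8217;s message automatically instead of requiring a manual edit. The commit contents and names are invented, and <code>sed -i</code> is assumed to be GNU sed:</p>

```shell
# Build a tiny history: base, the real patch, then a trivial touch-up.
cd "$(mktemp -d)"
git init -q .
git config user.email dev@example.com
git config user.name "Example Dev"
echo one > work.txt && git add work.txt && git commit -q -m "base"
echo two >> work.txt && git commit -qam "original patch"
echo three >> work.txt && git commit -qam "trivial touch-up"
# Rewrite the rebase todo so the touch-up is folded into the patch,
# keeping only the original commit message.
GIT_SEQUENCE_EDITOR='sed -i -e "2s/^pick/fixup/"' git rebase -i HEAD~2
git log --oneline   # the touch-up commit is gone
```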
+
+<p>Also, please remember to set Assignee on JIRAs where applicable when they are resolved. The script 
+can&#8217;t do this automatically.</p>
+
+<!--
+<h3>Minimize use of MINOR, BUILD, and HOTFIX with no JIRA</h3>
+
+From pwendell at https://www.mail-archive.com/dev@spark.apache.org/msg09565.html:
+It would be great if people could create JIRA's for any and all merged pull requests. The reason is 
+that when patches get reverted due to build breaks or other issues, it is very difficult to keep 
+track of what is going on if there is no JIRA. 
+Here is a list of 5 patches we had to revert recently that didn't include a JIRA:
+    Revert "[MINOR] [BUILD] Use custom temp directory during build."
+    Revert "[SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior"
+    Revert "[BUILD] Always run SQL tests in master build."
+    Revert "[MINOR] [CORE] Warn users who try to cache RDDs with dynamic allocation on."
+    Revert "[HOT FIX] [YARN] Check whether `/lib` exists before listing its files"
+
+The cost overhead of creating a JIRA relative to other aspects of development is very small. 
+If it's really a documentation change or something small, that's okay.
+
+But anything affecting the build, packaging, etc. These all need to have a JIRA to ensure that 
+follow-up can be well communicated to all Spark developers.
+-->
+
+<h3>Policy on Backporting Bug Fixes</h3>
+
+<p>From <a href="https://www.mail-archive.com/dev@spark.apache.org/msg10284.html"><code>pwendell</code></a>:</p>
+
+<p>The trade off when backporting is you get to deliver the fix to people running older versions 
+(great!), but you risk introducing new or even worse bugs in maintenance releases (bad!). 
+The decision point is when you have a bug fix and it&#8217;s not clear whether it is worth backporting.</p>
+
+<p>I think the following facets are important to consider:</p>
+<ul>
+  <li>Backports are an extremely valuable service to the community and should be considered for 
+any bug fix.</li>
+  <li>Introducing a new bug in a maintenance release must be avoided at all costs. Over time it would 
+erode confidence in our release process.</li>
+  <li>Distributions or advanced users can always backport risky patches on their own, if they see fit.</li>
+</ul>
+
+<p>For me, the consequence of these is that we should backport in the following situations:</p>
+<ul>
+  <li>Both the bug and the fix are well understood and isolated. Code being modified is well tested.</li>
+  <li>The bug being addressed is high priority to the community.</li>
+  <li>The backported fix does not vary widely from the master branch fix.</li>
+</ul>
+
+<p>We tend to avoid backports in the converse situations:</p>
+<ul>
+  <li>The bug or fix is not well understood. For instance, it relates to interactions between complex 
+components or third party libraries (e.g. Hadoop libraries). The code is not well tested outside 
+of the immediate bug being fixed.</li>
+  <li>The bug is not clearly a high priority for the community.</li>
+  <li>The backported fix is widely different from the master branch fix.</li>
+</ul>
+
+  </div>
+</div>
+
+
+
+<footer class="small">
+  <hr>
+  Apache Spark, Spark, Apache, and the Spark logo are <a href="/trademarks.html">trademarks</a> of
+  <a href="http://www.apache.org">The Apache Software Foundation</a>.
+</footer>
+
+</div>
+
+</body>
+</html>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org

