spark-commits mailing list archives

From sro...@apache.org
Subject spark-website git commit: replace with valid url to rdd paper
Date Sat, 17 Sep 2016 23:23:44 GMT
Repository: spark-website
Updated Branches:
  refs/heads/asf-site a78faf582 -> eee58685c


replace with valid url to rdd paper


Project: http://git-wip-us.apache.org/repos/asf/spark-website/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark-website/commit/eee58685
Tree: http://git-wip-us.apache.org/repos/asf/spark-website/tree/eee58685
Diff: http://git-wip-us.apache.org/repos/asf/spark-website/diff/eee58685

Branch: refs/heads/asf-site
Commit: eee58685c39269c191a921c39f1520c747a42318
Parents: a78faf5
Author: Xin Ren <iamshrek@126.com>
Authored: Fri Sep 16 16:31:23 2016 -0700
Committer: Xin Ren <iamshrek@126.com>
Committed: Fri Sep 16 16:31:23 2016 -0700

----------------------------------------------------------------------
 research.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark-website/blob/eee58685/research.md
----------------------------------------------------------------------
diff --git a/research.md b/research.md
index 41841a1..ec7dd54 100644
--- a/research.md
+++ b/research.md
@@ -27,7 +27,7 @@ Traditional MapReduce and DAG engines are suboptimal for these applications
beca
 </p>
 
 <p>
-Spark offers an abstraction called <a href="http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf"><em>resilient
distributed datasets (RDDs)</em></a> to support these applications efficiently.
RDDs can be stored in memory between queries <em>without</em> requiring replication.
 Instead, they rebuild lost data on failure using <em>lineage</em>: each RDD remembers
how it was built from other datasets (by transformations like <code>map</code>,
<code>join</code> or <code>groupBy</code>) to rebuild itself.  RDDs
allow Spark to outperform existing models by up to 100x in multi-pass analytics. We showed
that RDDs can support a wide variety of iterative algorithms, as well as interactive data
mining and a highly efficient SQL engine (<a href="http://shark.cs.berkeley.edu">Shark</a>).
+Spark offers an abstraction called <a href="http://people.csail.mit.edu/matei/papers/2012/nsdi_spark.pdf"><em>resilient
distributed datasets (RDDs)</em></a> to support these applications efficiently.
RDDs can be stored in memory between queries <em>without</em> requiring replication.
 Instead, they rebuild lost data on failure using <em>lineage</em>: each RDD remembers
how it was built from other datasets (by transformations like <code>map</code>,
<code>join</code> or <code>groupBy</code>) to rebuild itself.  RDDs
allow Spark to outperform existing models by up to 100x in multi-pass analytics. We showed
that RDDs can support a wide variety of iterative algorithms, as well as interactive data
mining and a highly efficient SQL engine (<a href="http://shark.cs.berkeley.edu">Shark</a>).
 </p>
 
 <p class="noskip">You can find more about the research behind Spark in the following
papers:</p>



