spark-commits mailing list archives

From shiva...@apache.org
Subject spark git commit: [SPARK-18643][SPARKR] SparkR hangs at session start when installed as a package without Spark
Date Mon, 05 Dec 2016 04:25:17 GMT
Repository: spark
Updated Branches:
  refs/heads/master d9eb4c721 -> b019b3a8a


[SPARK-18643][SPARKR] SparkR hangs at session start when installed as a package without Spark

## What changes were proposed in this pull request?

If SparkR is running as a package and it has previously downloaded the Spark JAR, it should be
able to run as before without having to set SPARK_HOME. With this bug, the auto-install of
Spark only works in the first session.

This appears to be a regression from the earlier behavior.

The fix is to always check for, and if necessary install, the cached Spark when running in an
interactive session. As discussed before, we should probably only auto-install Spark when
running in an interactive session (R shell, RStudio, etc.).
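
The change can be sketched as follows. This is a simplified, hypothetical outline of the patched `sparkCheckInstall` logic, not the exact function body from `R/pkg/R/sparkR.R`; the surrounding validation and messaging are condensed:

```r
# Sketch of the patched check: auto-install Spark when running interactively
# OR when the master is local, so a download cached by a previous session is
# found and reused instead of hanging at session start.
sparkCheckInstall <- function(sparkHome, master, deployMode) {
  if (!is.null(sparkHome) && dir.exists(sparkHome)) {
    sparkHome                        # an existing SPARK_HOME wins
  } else if (interactive() || isMasterLocal(master)) {
    message("Spark not found in SPARK_HOME: ", sparkHome)
    install.spark()                  # downloads Spark, or returns the cached copy
  } else {
    NULL                             # non-interactive cluster mode: leave unset
  }
}
```

Before this patch the guard was `isMasterLocal(master)` alone, so an interactive session without SPARK_HOME never reached `install.spark()` and could not pick up the cached installation.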

## How was this patch tested?

Manually

Author: Felix Cheung <felixcheung_m@hotmail.com>

Closes #16077 from felixcheung/rsessioninteractive.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b019b3a8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b019b3a8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b019b3a8

Branch: refs/heads/master
Commit: b019b3a8ac49336e657f5e093fa2fba77f8d12d2
Parents: d9eb4c7
Author: Felix Cheung <felixcheung_m@hotmail.com>
Authored: Sun Dec 4 20:25:11 2016 -0800
Committer: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Committed: Sun Dec 4 20:25:11 2016 -0800

----------------------------------------------------------------------
 R/pkg/R/sparkR.R                     | 5 ++++-
 R/pkg/vignettes/sparkr-vignettes.Rmd | 4 ++--
 docs/sparkr.md                       | 4 +++-
 3 files changed, 9 insertions(+), 4 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/b019b3a8/R/pkg/R/sparkR.R
----------------------------------------------------------------------
diff --git a/R/pkg/R/sparkR.R b/R/pkg/R/sparkR.R
index a7152b4..43bff97 100644
--- a/R/pkg/R/sparkR.R
+++ b/R/pkg/R/sparkR.R
@@ -322,6 +322,9 @@ sparkRHive.init <- function(jsc = NULL) {
 #' SparkSession or initializes a new SparkSession.
 #' Additional Spark properties can be set in \code{...}, and these named parameters take priority
 #' over values in \code{master}, \code{appName}, named lists of \code{sparkConfig}.
+#' When called in an interactive session, this checks for the Spark installation, and, if not
+#' found, it will be downloaded and cached automatically. Alternatively, \code{install.spark} can
+#' be called manually.
 #'
 #' For details on how to initialize and use SparkR, refer to SparkR programming guide at
 #' \url{http://spark.apache.org/docs/latest/sparkr.html#starting-up-sparksession}.
@@ -565,7 +568,7 @@ sparkCheckInstall <- function(sparkHome, master, deployMode) {
       message(msg)
       NULL
     } else {
-      if (isMasterLocal(master)) {
+      if (interactive() || isMasterLocal(master)) {
         msg <- paste0("Spark not found in SPARK_HOME: ", sparkHome)
         message(msg)
         packageLocalDir <- install.spark()

http://git-wip-us.apache.org/repos/asf/spark/blob/b019b3a8/R/pkg/vignettes/sparkr-vignettes.Rmd
----------------------------------------------------------------------
diff --git a/R/pkg/vignettes/sparkr-vignettes.Rmd b/R/pkg/vignettes/sparkr-vignettes.Rmd
index 73a5e26..a36f8fc 100644
--- a/R/pkg/vignettes/sparkr-vignettes.Rmd
+++ b/R/pkg/vignettes/sparkr-vignettes.Rmd
@@ -94,13 +94,13 @@ sparkR.session.stop()
 
 Different from many other R packages, to use SparkR, you need an additional installation of Apache Spark. The Spark installation will be used to run a backend process that will compile and execute SparkR programs.
 
-If you don't have Spark installed on the computer, you may download it from [Apache Spark Website](http://spark.apache.org/downloads.html). Alternatively, we provide an easy-to-use function `install.spark` to complete this process. You don't have to call it explicitly. We will check the installation when `sparkR.session` is called and `install.spark` function will be triggered automatically if no installation is found.
+After installing the SparkR package, you can call `sparkR.session` as explained in the previous section to start and it will check for the Spark installation. If you are working with SparkR from an interactive shell (eg. R, RStudio) then Spark is downloaded and cached automatically if it is not found. Alternatively, we provide an easy-to-use function `install.spark` for running this manually. If you don't have Spark installed on the computer, you may download it from [Apache Spark Website](http://spark.apache.org/downloads.html).
 
 ```{r, eval=FALSE}
 install.spark()
 ```
 
-If you already have Spark installed, you don't have to install again and can pass the `sparkHome` argument to `sparkR.session` to let SparkR know where the Spark installation is.
+If you already have Spark installed, you don't have to install again and can pass the `sparkHome` argument to `sparkR.session` to let SparkR know where the existing Spark installation is.
 
 ```{r, eval=FALSE}
 sparkR.session(sparkHome = "/HOME/spark")

http://git-wip-us.apache.org/repos/asf/spark/blob/b019b3a8/docs/sparkr.md
----------------------------------------------------------------------
diff --git a/docs/sparkr.md b/docs/sparkr.md
index d269492..60cd01a 100644
--- a/docs/sparkr.md
+++ b/docs/sparkr.md
@@ -40,7 +40,9 @@ sparkR.session()
 You can also start SparkR from RStudio. You can connect your R program to a Spark cluster from
 RStudio, R shell, Rscript or other R IDEs. To start, make sure SPARK_HOME is set in environment
 (you can check [Sys.getenv](https://stat.ethz.ch/R-manual/R-devel/library/base/html/Sys.getenv.html)),
-load the SparkR package, and call `sparkR.session` as below. In addition to calling `sparkR.session`,
+load the SparkR package, and call `sparkR.session` as below. It will check for the Spark installation, and, if not found, it will be downloaded and cached automatically. Alternatively, you can also run `install.spark` manually.
+
+In addition to calling `sparkR.session`,
  you could also specify certain Spark driver properties. Normally these
 [Application properties](configuration.html#application-properties) and
 [Runtime Environment](configuration.html#runtime-environment) cannot be set programmatically, as the


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org

