Mailing-List: contact commits-help@ambari.apache.org; run by ezmlm
Reply-To: ambari-dev@ambari.apache.org
From: jluniya@apache.org
To: commits@ambari.apache.org
Message-Id: <56cf6cf7f464404fb4464cf1e2f1538a@git.apache.org>
X-Mailer: ASF-Git Admin Mailer
Subject: ambari git commit: AMBARI-16753: Spark2 service definition for Ambari (Saisai Shao via jluniya)
Date: Wed, 25 May 2016 08:55:15 +0000 (UTC)

Repository: ambari
Updated Branches:
  refs/heads/trunk f19904109 -> 27168a13f


AMBARI-16753: Spark2 service definition for Ambari (Saisai Shao via jluniya)


Project: http://git-wip-us.apache.org/repos/asf/ambari/repo
Commit: http://git-wip-us.apache.org/repos/asf/ambari/commit/27168a13
Tree: http://git-wip-us.apache.org/repos/asf/ambari/tree/27168a13
Diff: http://git-wip-us.apache.org/repos/asf/ambari/diff/27168a13

Branch: refs/heads/trunk
Commit: 27168a13f0aff9faa37d789ee5efb323c98e6856
Parents: f199041
Author: Jayush Luniya
Authored: Wed May 25 01:55:06 2016 -0700
Committer: Jayush Luniya
Committed: Wed May 25 01:55:06 2016 -0700

----------------------------------------------------------------------
 .../common-services/SPARK2/2.0.0/alerts.json     |  32 +++
 .../2.0.0/configuration/spark2-defaults.xml      |  94 ++++++++
 .../SPARK2/2.0.0/configuration/spark2-env.xml    | 121 ++++++++++
 .../configuration/spark2-hive-site-override.xml  |  55 +++++
 .../configuration/spark2-log4j-properties.xml    |  46 ++++
 .../configuration/spark2-metrics-properties.xml  | 164 ++++++++++++++
 .../spark2-thrift-fairscheduler.xml              |  37 ++++
 .../configuration/spark2-thrift-sparkconf.xml    | 140 ++++++++++++
 .../common-services/SPARK2/2.0.0/kerberos.json   |  64 ++++++
 .../common-services/SPARK2/2.0.0/metainfo.xml    | 218 +++++++++++++++++++
 .../2.0.0/package/scripts/job_history_server.py  | 103 +++++++++
 .../SPARK2/2.0.0/package/scripts/params.py       | 197 +++++++++++++++++
 .../2.0.0/package/scripts/service_check.py       |  43 ++++
 .../SPARK2/2.0.0/package/scripts/setup_spark.py  | 108 +++++++++
 .../2.0.0/package/scripts/spark_client.py        |  61 ++++++
 .../2.0.0/package/scripts/spark_service.py       | 141 ++++++++++++
 .../package/scripts/spark_thrift_server.py       |  87 ++++++++
 .../2.0.0/package/scripts/status_params.py       |  39 ++++
.../SPARK2/2.0.0/quicklinks/quicklinks.json | 27 +++ .../stacks/HDP/2.5/services/SPARK2/metainfo.xml | 30 +++ 20 files changed, 1807 insertions(+) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/alerts.json ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/alerts.json b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/alerts.json new file mode 100755 index 0000000..593ebb7 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/alerts.json @@ -0,0 +1,32 @@ +{ + "SPARK": { + "service": [], + "SPARK2_JOBHISTORYSERVER": [ + { + "name": "SPARK2_JOBHISTORYSERVER_PROCESS", + "label": "Spark2 History Server", + "description": "This host-level alert is triggered if the Spark2 History Server cannot be determined to be up.", + "interval": 1, + "scope": "HOST", + "source": { + "type": "PORT", + "uri": "{{spark2-defaults/spark.history.ui.port}}", + "default_port": 18081, + "reporting": { + "ok": { + "text": "TCP OK - {0:.3f}s response on port {1}" + }, + "warning": { + "text": "TCP OK - {0:.3f}s response on port {1}", + "value": 1.5 + }, + "critical": { + "text": "Connection failed: {0} to {1}:{2}", + "value": 5 + } + } + } + } + ] + } +} http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-defaults.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-defaults.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-defaults.xml new file mode 100755 index 0000000..5d6c781 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-defaults.xml @@ -0,0 +1,94 @@ + + + + + + + spark.yarn.queue + default + + The name of the YARN queue to which the application is submitted. + + + + + spark.history.provider + org.apache.spark.deploy.history.FsHistoryProvider + + Name of history provider + + + + + spark.history.ui.port + 18081 + + The port to which the web interface of the History Server binds. + + + + + spark.history.fs.logDirectory + hdfs:///spark2-history/ + + Base directory for history spark application log. + + + + + spark.history.kerberos.principal + none + + Kerberos principal name for the Spark History Server. + + + + + spark.history.kerberos.keytab + none + + Location of the kerberos keytab file for the Spark History Server. + + + + + spark.eventLog.enabled + true + + Whether to log Spark events, useful for reconstructing the Web UI after the application has finished. + + + + + spark.eventLog.dir + hdfs:///spark2-history/ + + Base directory in which Spark events are logged, if spark.eventLog.enabled is true. + + + + + spark.yarn.historyServer.address + {{spark_history_server_host}}:{{spark_history_ui_port}} + The address of the Spark history server (i.e. host.com:18081). The address should not contain a scheme (http://). Defaults to not being set since the history server is an optional service. This address is given to the YARN ResourceManager when the Spark application finishes to link the application from the ResourceManager UI to the Spark history server UI. 
+ + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-env.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-env.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-env.xml new file mode 100755 index 0000000..fbde5c5 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-env.xml @@ -0,0 +1,121 @@ + + + + + + + spark_user + Spark User + spark + USER + + user + false + + + + + spark_group + Spark Group + spark + GROUP + spark group + + user + + + + + spark_log_dir + /var/log/spark2 + Spark Log Dir + + directory + + + + + spark_pid_dir + /var/run/spark2 + + directory + + + + + + content + This is the jinja template for spark-env.sh file + +#!/usr/bin/env bash + +# This file is sourced when running various Spark programs. +# Copy it as spark-env.sh and edit that to configure Spark for your site. + +# Options read in YARN client mode +#SPARK_EXECUTOR_INSTANCES="2" #Number of workers to start (Default: 2) +#SPARK_EXECUTOR_CORES="1" #Number of cores for the workers (Default: 1). +#SPARK_EXECUTOR_MEMORY="1G" #Memory per Worker (e.g. 1000M, 2G) (Default: 1G) +#SPARK_DRIVER_MEMORY="512Mb" #Memory for Master (e.g. 1000M, 2G) (Default: 512 Mb) +#SPARK_YARN_APP_NAME="spark" #The name of your application (Default: Spark) +#SPARK_YARN_QUEUE="default" #The hadoop queue to use for allocation requests (Default: default) +#SPARK_YARN_DIST_FILES="" #Comma separated list of files to be distributed with the job. +#SPARK_YARN_DIST_ARCHIVES="" #Comma separated list of archives to be distributed with the job. + +# Generic options for the daemons used in the standalone deploy mode + +# Alternate conf dir. (Default: ${SPARK_HOME}/conf) +export SPARK_CONF_DIR=${SPARK_CONF_DIR:-{{spark_home}}/conf} + +# Where log files are stored.(Default:${SPARK_HOME}/logs) +#export SPARK_LOG_DIR=${SPARK_HOME:-{{spark_home}}}/logs +export SPARK_LOG_DIR={{spark_log_dir}} + +# Where the pid file is stored. (Default: /tmp) +export SPARK_PID_DIR={{spark_pid_dir}} + +# A string representing this instance of spark.(Default: $USER) +SPARK_IDENT_STRING=$USER + +# The scheduling priority for daemons. (Default: 0) +SPARK_NICENESS=0 + +export HADOOP_HOME=${HADOOP_HOME:-{{hadoop_home}}} +export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-{{hadoop_conf_dir}}} + +# The java implementation to use. +export JAVA_HOME={{java_home}} + + + + content + + + + + spark_thrift_cmd_opts + additional spark thrift server commandline options + + + true + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-hive-site-override.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-hive-site-override.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-hive-site-override.xml new file mode 100755 index 0000000..e5cc377 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-hive-site-override.xml @@ -0,0 +1,55 @@ + + + + + + + hive.server2.enable.doAs + false + + Disable impersonation in Hive Server 2. 
+ + + + hive.metastore.client.socket.timeout + 1800 + MetaStore Client socket timeout in seconds + + + hive.metastore.client.connect.retry.delay + 5 + + Expects a time value - number of seconds for the client to wait between consecutive connection attempts + + + + hive.server2.thrift.port + 10016 + + TCP port number to listen on, default 10015. + + + + hive.server2.transport.mode + binary + + Expects one of [binary, http]. + Transport mode of HiveServer2. + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-log4j-properties.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-log4j-properties.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-log4j-properties.xml new file mode 100755 index 0000000..3f385b4 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-log4j-properties.xml @@ -0,0 +1,46 @@ + + + + + + content + Spark2-log4j-Properties + +# Set everything to be logged to the console +log4j.rootCategory=INFO, console +log4j.appender.console=org.apache.log4j.ConsoleAppender +log4j.appender.console.target=System.err +log4j.appender.console.layout=org.apache.log4j.PatternLayout +log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n + +# Settings to quiet third party logs that are too verbose +log4j.logger.org.eclipse.jetty=WARN +log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR +log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO +log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO + + + + content + false + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-metrics-properties.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-metrics-properties.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-metrics-properties.xml new file mode 100755 index 0000000..c215548 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-metrics-properties.xml @@ -0,0 +1,164 @@ + + + + + content + Spark2-metrics-properties + +# syntax: [instance].sink|source.[name].[options]=[value] + +# This file configures Spark's internal metrics system. The metrics system is +# divided into instances which correspond to internal components. +# Each instance can be configured to report its metrics to one or more sinks. +# Accepted values for [instance] are "master", "worker", "executor", "driver", +# and "applications". A wild card "*" can be used as an instance name, in +# which case all instances will inherit the supplied property. +# +# Within an instance, a "source" specifies a particular set of grouped metrics. +# there are two kinds of sources: +# 1. Spark internal sources, like MasterSource, WorkerSource, etc, which will +# collect a Spark component's internal state. Each instance is paired with a +# Spark source that is added automatically. +# 2. Common sources, like JvmSource, which will collect low level state. +# These can be added through configuration options and are then loaded +# using reflection. 
+# +# A "sink" specifies where metrics are delivered to. Each instance can be +# assigned one or more sinks. +# +# The sink|source field specifies whether the property relates to a sink or +# source. +# +# The [name] field specifies the name of source or sink. +# +# The [options] field is the specific property of this source or sink. The +# source or sink is responsible for parsing this property. +# +# Notes: +# 1. To add a new sink, set the "class" option to a fully qualified class +# name (see examples below). +# 2. Some sinks involve a polling period. The minimum allowed polling period +# is 1 second. +# 3. Wild card properties can be overridden by more specific properties. +# For example, master.sink.console.period takes precedence over +# *.sink.console.period. +# 4. A metrics specific configuration +# "spark.metrics.conf=${SPARK_HOME}/conf/metrics.properties" should be +# added to Java properties using -Dspark.metrics.conf=xxx if you want to +# customize metrics system. You can also put the file in ${SPARK_HOME}/conf +# and it will be loaded automatically. +# 5. MetricsServlet is added by default as a sink in master, worker and client +# driver, you can send http request "/metrics/json" to get a snapshot of all the +# registered metrics in json format. For master, requests "/metrics/master/json" and +# "/metrics/applications/json" can be sent seperately to get metrics snapshot of +# instance master and applications. MetricsServlet may not be configured by self. +# + +## List of available sinks and their properties. + +# org.apache.spark.metrics.sink.ConsoleSink +# Name: Default: Description: +# period 10 Poll period +# unit seconds Units of poll period + +# org.apache.spark.metrics.sink.CSVSink +# Name: Default: Description: +# period 10 Poll period +# unit seconds Units of poll period +# directory /tmp Where to store CSV files + +# org.apache.spark.metrics.sink.GangliaSink +# Name: Default: Description: +# host NONE Hostname or multicast group of Ganglia server +# port NONE Port of Ganglia server(s) +# period 10 Poll period +# unit seconds Units of poll period +# ttl 1 TTL of messages sent by Ganglia +# mode multicast Ganglia network mode ('unicast' or 'multicast') + +# org.apache.spark.metrics.sink.JmxSink + +# org.apache.spark.metrics.sink.MetricsServlet +# Name: Default: Description: +# path VARIES* Path prefix from the web server root +# sample false Whether to show entire set of samples for histograms ('false' or 'true') +# +# * Default path is /metrics/json for all instances except the master. 
The master has two paths: +# /metrics/aplications/json # App information +# /metrics/master/json # Master information + +# org.apache.spark.metrics.sink.GraphiteSink +# Name: Default: Description: +# host NONE Hostname of Graphite server +# port NONE Port of Graphite server +# period 10 Poll period +# unit seconds Units of poll period +# prefix EMPTY STRING Prefix to prepend to metric name + +## Examples +# Enable JmxSink for all instances by class name +#*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink + +# Enable ConsoleSink for all instances by class name +#*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink + +# Polling period for ConsoleSink +#*.sink.console.period=10 + +#*.sink.console.unit=seconds + +# Master instance overlap polling period +#master.sink.console.period=15 + +#master.sink.console.unit=seconds + +# Enable CsvSink for all instances +#*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink + +# Polling period for CsvSink +#*.sink.csv.period=1 + +#*.sink.csv.unit=minutes + +# Polling directory for CsvSink +#*.sink.csv.directory=/tmp/ + +# Worker instance overlap polling period +#worker.sink.csv.period=10 + +#worker.sink.csv.unit=minutes + +# Enable jvm source for instance master, worker, driver and executor +#master.source.jvm.class=org.apache.spark.metrics.source.JvmSource + +#worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource + +#driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource + +#executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource + + + + content + false + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-fairscheduler.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-fairscheduler.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-fairscheduler.xml new file mode 100755 index 0000000..2dda4bb --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-fairscheduler.xml @@ -0,0 +1,37 @@ + + + + + + + fairscheduler_content + This is the jinja template for spark-thrift-fairscheduler.xml file. + <?xml version="1.0"?> + <allocations> + <pool name="default"> + <schedulingMode>FAIR</schedulingMode> + <weight>1</weight> + <minShare>2</minShare> + </pool> + </allocations> + + + content + + + \ No newline at end of file http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-sparkconf.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-sparkconf.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-sparkconf.xml new file mode 100755 index 0000000..ce1d159 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/configuration/spark2-thrift-sparkconf.xml @@ -0,0 +1,140 @@ + + + + + + + spark.yarn.queue + default + + The name of the YARN queue to which the application is submitted. 
+ + + + + spark.history.provider + org.apache.spark.deploy.history.FsHistoryProvider + Name of history provider class + + + + spark.history.fs.logDirectory + {{spark_history_dir}} + true + + Base directory for history spark application log. It is the same value + as in spark-defaults.xml. + + + + + spark.eventLog.enabled + true + true + + Whether to log Spark events, useful for reconstructing the Web UI after the application has finished. + + + + + spark.eventLog.dir + {{spark_history_dir}} + true + + Base directory in which Spark events are logged, if spark.eventLog.enabled is true. It is the same value + as in spark-defaults.xml. + + + + + spark.master + {{spark_thrift_master}} + + The deploying mode of spark application, by default it is yarn-client for thrift-server but local mode for there's + only one nodemanager. + + + + + spark.scheduler.allocation.file + {{spark_conf}}/spark-thrift-fairscheduler.xml + + Scheduler configuration file for thriftserver. + + + + + spark.scheduler.mode + FAIR + + The scheduling mode between jobs submitted to the same SparkContext. + + + + + spark.shuffle.service.enabled + true + + Enables the external shuffle service. + + + + + spark.hadoop.cacheConf + false + + Specifies whether HadoopRDD caches the Hadoop configuration object + + + + + spark.dynamicAllocation.enabled + true + + Whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload. + + + + + spark.dynamicAllocation.initialExecutors + 0 + + Initial number of executors to run if dynamic allocation is enabled. + + + + + spark.dynamicAllocation.maxExecutors + 10 + + Upper bound for the number of executors if dynamic allocation is enabled. + + + + + spark.dynamicAllocation.minExecutors + 0 + + Lower bound for the number of executors if dynamic allocation is enabled. 
+ + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/kerberos.json ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/kerberos.json b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/kerberos.json new file mode 100755 index 0000000..967adb0 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/kerberos.json @@ -0,0 +1,64 @@ +{ + "services": [ + { + "name": "SPARK2", + "identities": [ + { + "name": "/smokeuser" + }, + { + "name": "sparkuser", + "principal": { + "value": "${spark2-env/spark_user}-${cluster_name|toLower()}@${realm}", + "type" : "user", + "configuration": "spark2-defaults/spark.history.kerberos.principal", + "local_username" : "${spark-env/spark_user}" + }, + "keytab": { + "file": "${keytab_dir}/spark.headless.keytab", + "owner": { + "name": "${spark2-env/spark_user}", + "access": "r" + }, + "group": { + "name": "${cluster-env/user_group}", + "access": "" + }, + "configuration": "spark2-defaults/spark.history.kerberos.keytab" + } + } + ], + "configurations": [ + { + "spark2-defaults": { + "spark.history.kerberos.enabled": "true" + } + } + ], + "components": [ + { + "name": "SPARK2_JOBHISTORYSERVER", + "identities": [ + { + "name": "/HDFS/NAMENODE/hdfs" + } + ] + }, + { + "name": "SPARK2_CLIENT" + }, + { + "name": "SPARK2_THRIFTSERVER", + "identities": [ + { + "name": "/HDFS/NAMENODE/hdfs" + }, + { + "name": "/HIVE/HIVE_SERVER/hive_server_hive" + } + ] + } + ] + } + ] +} http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/metainfo.xml ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/metainfo.xml b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/metainfo.xml new file mode 100755 index 0000000..98f1696 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/metainfo.xml @@ -0,0 +1,218 @@ + + + + 2.0 + + + SPARK2 + Spark2 + Apache Spark is a fast and general engine for large-scale data processing. 
+ 2.0.0 + + + SPARK2_JOBHISTORYSERVER + Spark2 History Server + MASTER + 1 + true + + + HDFS/HDFS_CLIENT + host + + true + + + + MAPREDUCE2/MAPREDUCE2_CLIENT + host + + true + + + + YARN/YARN_CLIENT + host + + true + + + + + + PYTHON + 600 + + + + SPARK2_THRIFTSERVER + Spark2 Thrift Server + SLAVE + 0+ + true + + + HDFS/HDFS_CLIENT + host + + true + + + + MAPREDUCE2/MAPREDUCE2_CLIENT + host + + true + + + + YARN/YARN_CLIENT + host + + true + + + + HIVE/HIVE_METASTORE + cluster + + true + + + + + + PYTHON + 600 + + + + SPARK2_CLIENT + Spark2 Client + CLIENT + 1+ + true + + + HDFS/HDFS_CLIENT + host + + true + + + + MAPREDUCE2/MAPREDUCE2_CLIENT + host + + true + + + + YARN/YARN_CLIENT + host + + true + + + + + + PYTHON + 600 + + + + env + spark-log4j.properties + spark2-log4j-properties + + + env + spark-env.sh + spark2-env + + + env + spark-metrics.properties + spark2-metrics-properties + + + properties + spark-defaults.conf + spark2-defaults + + + + + + + spark2-defaults + spark2-env + spark2-log4j-properties + spark2-metrics-properties + spark2-thrift-sparkconf + spark2-hive-site-override + spark2-thrift-fairscheduler + + + + + PYTHON + 300 + + + + HDFS + YARN + HIVE + + + + + redhat7,amazon2015,redhat6,suse11,suse12 + + + spark2_${stack_version} + + + spark2_${stack_version}-python + + + + + debian7,ubuntu12,ubuntu14 + + + spark2-${stack_version} + + + spark2-${stack_version}-python + + + + + + + quicklinks.json + true + + + + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py new file mode 100755 index 0000000..b3720f0 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/job_history_server.py @@ -0,0 +1,103 @@ +#!/usr/bin/python +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ +""" + +import sys +import os + +from resource_management.libraries.script.script import Script +from resource_management.libraries.functions import conf_select +from resource_management.libraries.functions import stack_select +from resource_management.libraries.functions.copy_tarball import copy_to_hdfs +from resource_management.libraries.functions.check_process_status import check_process_status +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature +from resource_management.core.logger import Logger +from resource_management.core import shell +from setup_spark import * +from spark_service import spark_service + + +class JobHistoryServer(Script): + + def install(self, env): + import params + env.set_params(params) + + self.install_packages(env) + + def configure(self, env, upgrade_type=None): + import params + env.set_params(params) + + setup_spark(env, 'server', upgrade_type=upgrade_type, action = 'config') + + def start(self, env, upgrade_type=None): + import params + env.set_params(params) + + self.configure(env) + spark_service('jobhistoryserver', upgrade_type=upgrade_type, action='start') + + def stop(self, env, upgrade_type=None): + import params + env.set_params(params) + + spark_service('jobhistoryserver', upgrade_type=upgrade_type, action='stop') + + def status(self, env): + import status_params + env.set_params(status_params) + + check_process_status(status_params.spark_history_server_pid_file) + + + def get_component_name(self): + return "spark2-historyserver" + + def pre_upgrade_restart(self, env, upgrade_type=None): + import params + + env.set_params(params) + if params.version and check_stack_feature(StackFeature.ROLLING_UPGRADE, params.version): + Logger.info("Executing Spark2 Job History Server Stack Upgrade pre-restart") + conf_select.select(params.stack_name, "spark2", params.version) + stack_select.select("spark2-historyserver", params.version) + + # Spark 1.3.1.2.3, and higher, which was included in HDP 2.3, does not have a dependency on Tez, so it does not + # need to copy the tarball, otherwise, copy it. + if params.version and check_stack_feature(StackFeature.TEZ_FOR_SPARK, params.version): + resource_created = copy_to_hdfs( + "tez", + params.user_group, + params.hdfs_user, + host_sys_prepped=params.host_sys_prepped) + if resource_created: + params.HdfsResource(None, action="execute") + + def get_log_folder(self): + import params + return params.spark_log_dir + + def get_user(self): + import params + return params.spark_user + +if __name__ == "__main__": + JobHistoryServer().execute() http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/params.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/params.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/params.py new file mode 100755 index 0000000..ded9959 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/params.py @@ -0,0 +1,197 @@ +#!/usr/bin/python +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. 
The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +""" + + +import status_params +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature +from setup_spark import * + +import resource_management.libraries.functions +from resource_management.libraries.functions import conf_select +from resource_management.libraries.functions import stack_select +from resource_management.libraries.functions import format +from resource_management.libraries.functions.get_stack_version import get_stack_version +from resource_management.libraries.functions.version import format_stack_version +from resource_management.libraries.functions.default import default +from resource_management.libraries.functions import get_kinit_path +from resource_management.libraries.functions.get_not_managed_resources import get_not_managed_resources + +from resource_management.libraries.script.script import Script + +# a map of the Ambari role to the component name +# for use with /current/ +SERVER_ROLE_DIRECTORY_MAP = { + 'SPARK2_JOBHISTORYSERVER' : 'spark2-historyserver', + 'SPARK2_CLIENT' : 'spark2-client', + 'SPARK2_THRIFTSERVER' : 'spark2-thriftserver' +} + +component_directory = Script.get_component_from_role(SERVER_ROLE_DIRECTORY_MAP, "SPARK2_CLIENT") + +config = Script.get_config() +tmp_dir = Script.get_tmp_dir() + +stack_name = status_params.stack_name +stack_root = Script.get_stack_root() +stack_version_unformatted = config['hostLevelParams']['stack_version'] +stack_version_formatted = format_stack_version(stack_version_unformatted) +host_sys_prepped = default("/hostLevelParams/host_sys_prepped", False) + +# New Cluster Stack Version that is defined during the RESTART of a Stack Upgrade +version = default("/commandParams/version", None) + +spark_conf = '/etc/spark2/conf' +hadoop_conf_dir = conf_select.get_hadoop_conf_dir() +hadoop_bin_dir = stack_select.get_hadoop_dir("bin") + +if stack_version_formatted and check_stack_feature(StackFeature.ROLLING_UPGRADE, stack_version_formatted): + hadoop_home = stack_select.get_hadoop_dir("home") + spark_conf = format("{stack_root}/current/{component_directory}/conf") + spark_log_dir = config['configurations']['spark2-env']['spark_log_dir'] + spark_pid_dir = status_params.spark_pid_dir + spark_home = format("{stack_root}/current/{component_directory}") + +spark_thrift_server_conf_file = spark_conf + "/spark-thrift-sparkconf.conf" +java_home = config['hostLevelParams']['java_home'] + +hdfs_user = config['configurations']['hadoop-env']['hdfs_user'] +hdfs_principal_name = config['configurations']['hadoop-env']['hdfs_principal_name'] +hdfs_user_keytab = config['configurations']['hadoop-env']['hdfs_user_keytab'] +user_group = config['configurations']['cluster-env']['user_group'] + +spark_user = status_params.spark_user +hive_user = status_params.hive_user +spark_group = status_params.spark_group +user_group = status_params.user_group +spark_hdfs_user_dir = format("/user/{spark_user}") 
+spark_history_dir = default('/configurations/spark2-defaults/spark.history.fs.logDirectory', "hdfs:///spark2-history") + +spark_history_server_pid_file = status_params.spark_history_server_pid_file +spark_thrift_server_pid_file = status_params.spark_thrift_server_pid_file + +spark_history_server_start = format("{spark_home}/sbin/start-history-server.sh") +spark_history_server_stop = format("{spark_home}/sbin/stop-history-server.sh") + +spark_thrift_server_start = format("{spark_home}/sbin/start-thriftserver.sh") +spark_thrift_server_stop = format("{spark_home}/sbin/stop-thriftserver.sh") + +run_example_cmd = format("{spark_home}/bin/run-example") +spark_smoke_example = "SparkPi" +spark_service_check_cmd = format( + "{run_example_cmd} --master yarn --deploy-mode cluster --num-executors 1 --driver-memory 256m --executor-memory 256m --executor-cores 1 {spark_smoke_example} 1") + +spark_jobhistoryserver_hosts = default("/clusterHostInfo/spark2_jobhistoryserver_hosts", []) + +if len(spark_jobhistoryserver_hosts) > 0: + spark_history_server_host = spark_jobhistoryserver_hosts[0] +else: + spark_history_server_host = "localhost" + +# spark-defaults params +spark_yarn_historyServer_address = default(spark_history_server_host, "localhost") + +spark_history_ui_port = config['configurations']['spark2-defaults']['spark.history.ui.port'] + +spark_env_sh = config['configurations']['spark2-env']['content'] +spark_log4j_properties = config['configurations']['spark2-log4j-properties']['content'] +spark_metrics_properties = config['configurations']['spark2-metrics-properties']['content'] + +hive_server_host = default("/clusterHostInfo/hive_server_host", []) +is_hive_installed = not len(hive_server_host) == 0 + +security_enabled = config['configurations']['cluster-env']['security_enabled'] +kinit_path_local = get_kinit_path(default('/configurations/kerberos-env/executable_search_paths', None)) +spark_kerberos_keytab = config['configurations']['spark2-defaults']['spark.history.kerberos.keytab'] +spark_kerberos_principal = config['configurations']['spark2-defaults']['spark.history.kerberos.principal'] + +spark_thriftserver_hosts = default("/clusterHostInfo/spark2_thriftserver_hosts", []) +has_spark_thriftserver = not len(spark_thriftserver_hosts) == 0 + +# hive-site params +spark_hive_properties = { + 'hive.metastore.uris': config['configurations']['hive-site']['hive.metastore.uris'] +} + +# security settings +if security_enabled: + spark_principal = spark_kerberos_principal.replace('_HOST',spark_history_server_host.lower()) + + if is_hive_installed: + spark_hive_properties.update({ + 'hive.metastore.sasl.enabled': str(config['configurations']['hive-site']['hive.metastore.sasl.enabled']).lower(), + 'hive.metastore.kerberos.keytab.file': config['configurations']['hive-site']['hive.metastore.kerberos.keytab.file'], + 'hive.server2.authentication.spnego.principal': config['configurations']['hive-site']['hive.server2.authentication.spnego.principal'], + 'hive.server2.authentication.spnego.keytab': config['configurations']['hive-site']['hive.server2.authentication.spnego.keytab'], + 'hive.metastore.kerberos.principal': config['configurations']['hive-site']['hive.metastore.kerberos.principal'], + 'hive.server2.authentication.kerberos.principal': config['configurations']['hive-site']['hive.server2.authentication.kerberos.principal'], + 'hive.server2.authentication.kerberos.keytab': config['configurations']['hive-site']['hive.server2.authentication.kerberos.keytab'], + 'hive.server2.authentication': 
config['configurations']['hive-site']['hive.server2.authentication'], + }) + + hive_kerberos_keytab = config['configurations']['hive-site']['hive.server2.authentication.kerberos.keytab'] + hive_kerberos_principal = config['configurations']['hive-site']['hive.server2.authentication.kerberos.principal'] + +# thrift server support - available on HDP 2.3 or higher +spark_thrift_sparkconf = None +spark_thrift_cmd_opts_properties = '' +spark_thrift_fairscheduler_content = None +spark_thrift_master = "yarn-client" +if 'nm_hosts' in config['clusterHostInfo'] and len(config['clusterHostInfo']['nm_hosts']) == 1: + # use local mode when there's only one nodemanager + spark_thrift_master = "local[4]" + +if has_spark_thriftserver and 'spark-thrift-sparkconf' in config['configurations']: + spark_thrift_sparkconf = config['configurations']['spark2-thrift-sparkconf'] + spark_thrift_cmd_opts_properties = config['configurations']['spark2-env']['spark_thrift_cmd_opts'] + if is_hive_installed: + # update default metastore client properties (async wait for metastore component) it is useful in case of + # blueprint provisioning when hive-metastore and spark-thriftserver is not on the same host. + spark_hive_properties.update({ + 'hive.metastore.client.socket.timeout' : config['configurations']['hive-site']['hive.metastore.client.socket.timeout'] + }) + spark_hive_properties.update(config['configurations']['spark2-hive-site-override']) + + if 'spark-thrift-fairscheduler' in config['configurations'] and 'fairscheduler_content' in config['configurations']['spark2-thrift-fairscheduler']: + spark_thrift_fairscheduler_content = config['configurations']['spark2-thrift-fairscheduler']['fairscheduler_content'] + +default_fs = config['configurations']['core-site']['fs.defaultFS'] +hdfs_site = config['configurations']['hdfs-site'] + +dfs_type = default("/commandParams/dfs_type", "") + +import functools +#create partial functions with common arguments for every HdfsResource call +#to create/delete hdfs directory/file/copyfromlocal we need to call params.HdfsResource in code +HdfsResource = functools.partial( + HdfsResource, + user=hdfs_user, + hdfs_resource_ignore_file = "/var/lib/ambari-agent/data/.hdfs_resource_ignore", + security_enabled = security_enabled, + keytab = hdfs_user_keytab, + kinit_path_local = kinit_path_local, + hadoop_bin_dir = hadoop_bin_dir, + hadoop_conf_dir = hadoop_conf_dir, + principal_name = hdfs_principal_name, + hdfs_site = hdfs_site, + default_fs = default_fs, + immutable_paths = get_not_managed_resources(), + dfs_type = dfs_type + ) http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/service_check.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/service_check.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/service_check.py new file mode 100755 index 0000000..694f046 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/service_check.py @@ -0,0 +1,43 @@ +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. 
You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agree in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. +""" +import subprocess +import time + +from resource_management import * +from resource_management.libraries.script.script import Script +from resource_management.libraries.functions.format import format +from resource_management.core.resources.system import Execute +from resource_management.core.logger import Logger + +class SparkServiceCheck(Script): + def service_check(self, env): + import params + env.set_params(params) + + if params.security_enabled: + spark_kinit_cmd = format("{kinit_path_local} -kt {spark_kerberos_keytab} {spark_principal}; ") + Execute(spark_kinit_cmd, user=params.spark_user) + + Execute(format("curl -s -o /dev/null -w'%{{http_code}}' --negotiate -u: -k http://{spark_history_server_host}:{spark_history_ui_port} | grep 200"), + tries = 10, + try_sleep=3, + logoutput=True + ) + +if __name__ == "__main__": + SparkServiceCheck().execute() http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/setup_spark.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/setup_spark.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/setup_spark.py new file mode 100755 index 0000000..9316ba9 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/setup_spark.py @@ -0,0 +1,108 @@ +#!/usr/bin/python +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+ +""" + +import sys +import fileinput +import shutil +import os +from resource_management import * +from resource_management.core.exceptions import ComponentIsNotRunning +from resource_management.core.logger import Logger +from resource_management.core import shell +from resource_management.libraries.functions.version import format_stack_version +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature + +def setup_spark(env, type, upgrade_type = None, action = None): + import params + + Directory([params.spark_pid_dir, params.spark_log_dir], + owner=params.spark_user, + group=params.user_group, + mode=0775, + create_parents = True + ) + if type == 'server' and action == 'config': + params.HdfsResource(params.spark_hdfs_user_dir, + type="directory", + action="create_on_execute", + owner=params.spark_user, + mode=0775 + ) + params.HdfsResource(None, action="execute") + + PropertiesFile(format("{spark_conf}/spark-defaults.conf"), + properties = params.config['configurations']['spark2-defaults'], + key_value_delimiter = " ", + owner=params.spark_user, + group=params.spark_group, + ) + + # create spark-env.sh in etc/conf dir + File(os.path.join(params.spark_conf, 'spark-env.sh'), + owner=params.spark_user, + group=params.spark_group, + content=InlineTemplate(params.spark_env_sh), + mode=0644, + ) + + #create log4j.properties in etc/conf dir + File(os.path.join(params.spark_conf, 'log4j.properties'), + owner=params.spark_user, + group=params.spark_group, + content=params.spark_log4j_properties, + mode=0644, + ) + + #create metrics.properties in etc/conf dir + File(os.path.join(params.spark_conf, 'metrics.properties'), + owner=params.spark_user, + group=params.spark_group, + content=InlineTemplate(params.spark_metrics_properties) + ) + + if params.is_hive_installed: + XmlConfig("hive-site.xml", + conf_dir=params.spark_conf, + configurations=params.spark_hive_properties, + owner=params.spark_user, + group=params.spark_group, + mode=0644) + + if params.has_spark_thriftserver: + PropertiesFile(params.spark_thrift_server_conf_file, + properties = params.config['configurations']['spark2-thrift-sparkconf'], + owner = params.hive_user, + group = params.user_group, + key_value_delimiter = " ", + ) + + effective_version = params.version if upgrade_type is not None else params.stack_version_formatted + if effective_version: + effective_version = format_stack_version(effective_version) + + if params.spark_thrift_fairscheduler_content and effective_version and check_stack_feature(StackFeature.SPARK_16PLUS, effective_version): + # create spark-thrift-fairscheduler.xml + File(os.path.join(params.spark_conf,"spark-thrift-fairscheduler.xml"), + owner=params.spark_user, + group=params.spark_group, + mode=0755, + content=InlineTemplate(params.spark_thrift_fairscheduler_content) + ) http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_client.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_client.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_client.py new file mode 100755 index 0000000..434b4b1 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_client.py @@ -0,0 +1,61 @@ +#!/usr/bin/python +""" +Licensed to the 
Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. + +""" + +import sys +from resource_management import * +from resource_management.libraries.functions import conf_select +from resource_management.libraries.functions import stack_select +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature +from resource_management.core.exceptions import ClientComponentHasNoStatus +from resource_management.core.logger import Logger +from resource_management.core import shell +from setup_spark import setup_spark + + +class SparkClient(Script): + def install(self, env): + self.install_packages(env) + self.configure(env) + + def configure(self, env, upgrade_type=None): + import params + env.set_params(params) + + setup_spark(env, 'client', upgrade_type=upgrade_type, action = 'config') + + def status(self, env): + raise ClientComponentHasNoStatus() + + def get_component_name(self): + return "spark2-client" + + def pre_upgrade_restart(self, env, upgrade_type=None): + import params + + env.set_params(params) + if params.version and check_stack_feature(StackFeature.ROLLING_UPGRADE, params.version): + Logger.info("Executing Spark2 Client Stack Upgrade pre-restart") + conf_select.select(params.stack_name, "spark", params.version) + stack_select.select("spark2-client", params.version) + +if __name__ == "__main__": + SparkClient().execute() + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_service.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_service.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_service.py new file mode 100755 index 0000000..2eae3e7 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_service.py @@ -0,0 +1,141 @@ +#!/usr/bin/env python + +''' +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +See the License for the specific language governing permissions and +limitations under the License. 
+''' +import socket +import tarfile +import os +from contextlib import closing + +from resource_management.libraries.script.script import Script +from resource_management.libraries.resources.hdfs_resource import HdfsResource +from resource_management.libraries.functions.copy_tarball import copy_to_hdfs +from resource_management.libraries.functions import format +from resource_management.core.resources.system import File, Execute +from resource_management.libraries.functions.version import format_stack_version +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature +from resource_management.libraries.functions.show_logs import show_logs + + +def make_tarfile(output_filename, source_dir): + try: + os.remove(output_filename) + except OSError: + pass + parent_dir=os.path.dirname(output_filename) + if not os.path.exists(parent_dir): + os.makedirs(parent_dir) + with closing(tarfile.open(output_filename, "w:gz")) as tar: + tar.add(source_dir, arcname=os.path.basename(source_dir)) + + +def spark_service(name, upgrade_type=None, action=None): + import params + + if action == 'start': + + effective_version = params.version if upgrade_type is not None else params.stack_version_formatted + if effective_version: + effective_version = format_stack_version(effective_version) + + if effective_version and check_stack_feature(StackFeature.SPARK_16PLUS, effective_version): + # create & copy spark2-hdp-yarn-archive.tar.gz to hdfs + source_dir=params.spark_home+"/jars" + tmp_archive_file="/tmp/spark2/spark2-hdp-yarn-archive.tar.gz" + make_tarfile(tmp_archive_file, source_dir) + copy_to_hdfs("spark2", params.user_group, params.hdfs_user, host_sys_prepped=params.host_sys_prepped) + # create spark history directory + params.HdfsResource(params.spark_history_dir, + type="directory", + action="create_on_execute", + owner=params.spark_user, + group=params.user_group, + mode=0777, + recursive_chmod=True + ) + params.HdfsResource(None, action="execute") + + if params.security_enabled: + spark_kinit_cmd = format("{kinit_path_local} -kt {spark_kerberos_keytab} {spark_principal}; ") + Execute(spark_kinit_cmd, user=params.spark_user) + + # Spark 1.3.1.2.3, and higher, which was included in HDP 2.3, does not have a dependency on Tez, so it does not + # need to copy the tarball, otherwise, copy it. 
+ if params.stack_version_formatted and check_stack_feature(StackFeature.TEZ_FOR_SPARK, params.stack_version_formatted): + resource_created = copy_to_hdfs("tez", params.user_group, params.hdfs_user, host_sys_prepped=params.host_sys_prepped) + if resource_created: + params.HdfsResource(None, action="execute") + + if name == 'jobhistoryserver': + historyserver_no_op_test = format( + 'ls {spark_history_server_pid_file} >/dev/null 2>&1 && ps -p `cat {spark_history_server_pid_file}` >/dev/null 2>&1') + try: + Execute(format('{spark_history_server_start}'), + user=params.spark_user, + environment={'JAVA_HOME': params.java_home}, + not_if=historyserver_no_op_test) + except: + show_logs(params.spark_log_dir, user=params.spark_user) + raise + + elif name == 'sparkthriftserver': + if params.security_enabled: + hive_principal = params.hive_kerberos_principal.replace('_HOST', socket.getfqdn().lower()) + hive_kinit_cmd = format("{kinit_path_local} -kt {hive_kerberos_keytab} {hive_principal}; ") + Execute(hive_kinit_cmd, user=params.hive_user) + + thriftserver_no_op_test = format( + 'ls {spark_thrift_server_pid_file} >/dev/null 2>&1 && ps -p `cat {spark_thrift_server_pid_file}` >/dev/null 2>&1') + try: + Execute(format('{spark_thrift_server_start} --properties-file {spark_thrift_server_conf_file} {spark_thrift_cmd_opts_properties}'), + user=params.hive_user, + environment={'JAVA_HOME': params.java_home}, + not_if=thriftserver_no_op_test + ) + except: + show_logs(params.spark_log_dir, user=params.hive_user) + raise + elif action == 'stop': + if name == 'jobhistoryserver': + try: + Execute(format('{spark_history_server_stop}'), + user=params.spark_user, + environment={'JAVA_HOME': params.java_home} + ) + except: + show_logs(params.spark_log_dir, user=params.spark_user) + raise + File(params.spark_history_server_pid_file, + action="delete" + ) + + elif name == 'sparkthriftserver': + try: + Execute(format('{spark_thrift_server_stop}'), + user=params.hive_user, + environment={'JAVA_HOME': params.java_home} + ) + except: + show_logs(params.spark_log_dir, user=params.hive_user) + raise + File(params.spark_thrift_server_pid_file, + action="delete" + ) + + http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_thrift_server.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_thrift_server.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_thrift_server.py new file mode 100755 index 0000000..be65986 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/spark_thrift_server.py @@ -0,0 +1,87 @@ +#!/usr/bin/python +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and +limitations under the License. + +""" + +import sys +import os + +from resource_management.libraries.script.script import Script +from resource_management.libraries.functions import conf_select +from resource_management.libraries.functions import stack_select +from resource_management.libraries.functions.stack_features import check_stack_feature +from resource_management.libraries.functions import StackFeature +from resource_management.libraries.functions.check_process_status import check_process_status +from resource_management.core.logger import Logger +from resource_management.core import shell +from setup_spark import setup_spark +from spark_service import spark_service + + +class SparkThriftServer(Script): + + def install(self, env): + import params + env.set_params(params) + + self.install_packages(env) + + def configure(self, env ,upgrade_type=None): + import params + env.set_params(params) + setup_spark(env, 'server', upgrade_type = upgrade_type, action = 'config') + + def start(self, env, upgrade_type=None): + import params + env.set_params(params) + + self.configure(env) + spark_service('sparkthriftserver', upgrade_type=upgrade_type, action='start') + + def stop(self, env, upgrade_type=None): + import params + env.set_params(params) + spark_service('sparkthriftserver', upgrade_type=upgrade_type, action='stop') + + def status(self, env): + import status_params + env.set_params(status_params) + check_process_status(status_params.spark_thrift_server_pid_file) + + def get_component_name(self): + return "spark2-thriftserver" + + def pre_upgrade_restart(self, env, upgrade_type=None): + import params + + env.set_params(params) + if params.version and check_stack_feature(StackFeature.SPARK2_THRIFTSERVER, params.version): + Logger.info("Executing Spark2 Thrift Server Stack Upgrade pre-restart") + conf_select.select(params.stack_name, "spark2", params.version) + stack_select.select("spark2-thriftserver", params.version) + + def get_log_folder(self): + import params + return params.spark_log_dir + + def get_user(self): + import params + return params.hive_user + +if __name__ == "__main__": + SparkThriftServer().execute() http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/status_params.py ---------------------------------------------------------------------- diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/status_params.py b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/status_params.py new file mode 100755 index 0000000..4196638 --- /dev/null +++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/package/scripts/status_params.py @@ -0,0 +1,39 @@ +#!/usr/bin/env python +""" +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, software +distributed under the License is distributed on an "AS IS" BASIS, +WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+See the License for the specific language governing permissions and
+limitations under the License.
+
+"""
+
+from resource_management.libraries.functions import format
+from resource_management.libraries.script.script import Script
+from resource_management.libraries.functions.default import default
+
+config = Script.get_config()
+
+spark_user = config['configurations']['spark2-env']['spark_user']
+spark_group = config['configurations']['spark2-env']['spark_group']
+user_group = config['configurations']['cluster-env']['user_group']
+
+if 'hive-env' in config['configurations']:
+  hive_user = config['configurations']['hive-env']['hive_user']
+else:
+  hive_user = "hive"
+
+spark_pid_dir = config['configurations']['spark2-env']['spark_pid_dir']
+spark_history_server_pid_file = format("{spark_pid_dir}/spark-{spark_user}-org.apache.spark.deploy.history.HistoryServer-1.pid")
+spark_thrift_server_pid_file = format("{spark_pid_dir}/spark-{hive_user}-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1.pid")
+stack_name = default("/hostLevelParams/stack_name", None)

http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/quicklinks/quicklinks.json
----------------------------------------------------------------------
diff --git a/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/quicklinks/quicklinks.json b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/quicklinks/quicklinks.json
new file mode 100755
index 0000000..076755c
--- /dev/null
+++ b/ambari-server/src/main/resources/common-services/SPARK2/2.0.0/quicklinks/quicklinks.json
@@ -0,0 +1,27 @@
+{
+  "name": "default",
+  "description": "default quick links configuration",
+  "configuration": {
+    "protocol":
+    {
+      "type":"HTTP_ONLY"
+    },
+
+    "links": [
+      {
+        "name": "spark2_history_server_ui",
+        "label": "Spark2 History Server UI",
+        "requires_user_name": "false",
+        "url": "%@://%@:%@",
+        "port":{
+          "http_property": "spark.history.ui.port",
+          "http_default_port": "18081",
+          "https_property": "spark.history.ui.port",
+          "https_default_port": "18081",
+          "regex": "^(\\d+)$",
+          "site": "spark-defaults"
+        }
+      }
+    ]
+  }
+}

http://git-wip-us.apache.org/repos/asf/ambari/blob/27168a13/ambari-server/src/main/resources/stacks/HDP/2.5/services/SPARK2/metainfo.xml
----------------------------------------------------------------------
diff --git a/ambari-server/src/main/resources/stacks/HDP/2.5/services/SPARK2/metainfo.xml b/ambari-server/src/main/resources/stacks/HDP/2.5/services/SPARK2/metainfo.xml
new file mode 100755
index 0000000..118e8c0
--- /dev/null
+++ b/ambari-server/src/main/resources/stacks/HDP/2.5/services/SPARK2/metainfo.xml
@@ -0,0 +1,30 @@
+<?xml version="1.0"?>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one
+   or more contributor license agreements.  See the NOTICE file
+   distributed with this work for additional information
+   regarding copyright ownership.  The ASF licenses this file
+   to you under the Apache License, Version 2.0 (the
+   "License"); you may not use this file except in compliance
+   with the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<metainfo>
+  <schemaVersion>2.0</schemaVersion>
+  <services>
+    <service>
+      <name>SPARK2</name>
+      <version>2.0.x.2.5</version>
+      <extends>common-services/SPARK2/2.0.0</extends>
+    </service>
+  </services>
+</metainfo>
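
For readers who want to see what the PORT-type alert declared in alerts.json above describes, the following is a minimal, self-contained sketch. It is not part of this commit and is not how Ambari's alert framework is implemented; it simply probes spark.history.ui.port (default 18081, per the alert definition) and grades the response time against the same 1.5 s warning and 5 s critical thresholds. The host name and the check_history_server helper are illustrative assumptions.

#!/usr/bin/env python
# Illustrative sketch only -- not part of this commit and not Ambari's alert
# framework. It mimics the SPARK2_JOBHISTORYSERVER_PROCESS alert defined in
# alerts.json: open a TCP connection to spark.history.ui.port, time it, and
# map the elapsed time to OK / WARNING / CRITICAL using the same thresholds.
import socket
import time
from contextlib import closing


def check_history_server(host="localhost", port=18081,
                         warning_secs=1.5, critical_secs=5.0):
  start = time.time()
  try:
    # create_connection() performs the TCP handshake; closing() lets the
    # socket be used as a context manager on Python 2 as well.
    with closing(socket.create_connection((host, port), timeout=critical_secs)):
      elapsed = time.time() - start
  except (socket.timeout, socket.error) as err:
    return "CRITICAL", "Connection failed: {0} to {1}:{2}".format(err, host, port)
  if elapsed >= warning_secs:
    return "WARNING", "TCP OK - {0:.3f}s response on port {1}".format(elapsed, port)
  return "OK", "TCP OK - {0:.3f}s response on port {1}".format(elapsed, port)


if __name__ == "__main__":
  state, text = check_history_server()
  print("{0}: {1}".format(state, text))

The message templates mirror the reporting text in the alert definition, so the output maps one-to-one onto the OK, WARNING, and CRITICAL strings that the alert would report.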