zeppelin-commits mailing list archives

From b..@apache.org
Subject svn commit: r1762399 [14/16] - in /zeppelin/site/docs/0.7.0-SNAPSHOT: ./ assets/themes/zeppelin/img/docs-img/ development/ displaysystem/ install/ interpreter/ manual/ quickstart/ rest-api/ security/ storage/
Date Tue, 27 Sep 2016 03:33:45 GMT
Modified: zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json?rev=1762399&r1=1762398&r2=1762399&view=diff
==============================================================================
--- zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json (original)
+++ zeppelin/site/docs/0.7.0-SNAPSHOT/search_data.json Tue Sep 27 03:33:44 2016
@@ -4,7 +4,7 @@
 
     "/atom.xml": {
       "title": "Atom Feed",
-      "content"  : " Apache Zeppelin   2016-09-12T06:27:44+02:00 http://zeppelin.apache.org    The Apache Software Foundation   dev@zeppelin.apache.org  ",
+      "content"  : " Apache Zeppelin   2016-09-26T20:27:17-07:00 http://zeppelin.apache.org    The Apache Software Foundation   dev@zeppelin.apache.org  ",
       "url": " /atom.xml",
       "group": "",
       "excerpt": ""
@@ -48,7 +48,7 @@
 
     "/development/writingzeppelininterpreter.html": {
       "title": "Writing a New Interpreter",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Writing a New InterpreterWhat is Apache Zeppelin InterpreterApache Zeppelin Interpreter is a language backend. For example to use scala code in Zeppelin, you need a scala interpreter.Every Interpreters belongs to an InterpreterGroup.Interpreters in the same InterpreterGroup can reference each other. For example, SparkSqlInterpreter can reference SparkInterpreter to get SparkContext from it while they're in the same group.In
 terpreterSetting is configuration of a given InterpreterGroup and a unit of start/stop interpreter.All Interpreters in the same InterpreterSetting are launched in a single, separate JVM process. The Interpreter communicates with Zeppelin engine via Thrift.In 'Separate Interpreter(scoped / isolated) for each note' mode which you can see at the Interpreter Setting menu when you create a new interpreter, new interpreter instance will be created per notebook. But it still runs on the same JVM while they're in the same InterpreterSettings.Make your own InterpreterCreating a new interpreter is quite simple. Just extend org.apache.zeppelin.interpreter abstract class and implement some methods.You can include org.apache.zeppelin:zeppelin-interpreter:[VERSION] artifact in your build system. And you should put your jars under your interpreter directory with a specific directory name. Zeppelin server reads interpreter directories recursively and initializes interpreters
  including your own interpreter.There are three locations where you can store your interpreter group, name and other information. Zeppelin server tries to find the location below. Next, Zeppelin tries to find interpreter-setting.json in your interpreter jar.{ZEPPELIN_INTERPRETER_DIR}/{YOUR_OWN_INTERPRETER_DIR}/interpreter-setting.jsonHere is an example of interpreter-setting.json on your own interpreter.[  {    "group": "your-group",    "name": "your-name",    "className": "your.own.interpreter.class",    "properties": {      "properties1": {        "envName": null,        "propertyName": "property.1.name",        "defaultValue": "propertyDefaultValue",        "description": "Property description"      },
       "properties2": {        "envName": PROPERTIES_2,        "propertyName": null,        "defaultValue": "property2DefaultValue",        "description": "Property 2 description"      }, ...    }  },  {    ...  }]Finally, Zeppelin uses static initialization with the following:static {    Interpreter.register("MyInterpreterName", MyClassName.class.getName());  }Static initialization is deprecated and will be supported until 0.6.0.The name will appear later in the interpreter name option box during the interpreter configuration process.The name of the interpreter is what you later write to identify a paragraph which should be interpreted using this interpreter.%MyInterpreterNamesome interpreter specific code...Programming Languages for InterpreterIf the interpreter uses a specific programming language ( like Scala, Python, SQL ), it is generally recommende
 d to add a syntax highlighting supported for that to the notebook paragraph editor.  To check out the list of languages supported, see the mode-*.js files under zeppelin-web/bower_components/ace-builds/src-noconflict or from github.com/ajaxorg/ace-builds.  If you want to add a new set of syntax highlighting,  Add the mode-*.js file to zeppelin-web/bower.json ( when built, zeppelin-web/src/index.html will be changed automatically. ).Add to the list of editorMode in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js - it follows the pattern 'ace/mode/x' where x is the name.Add to the code that checks for % prefix and calls session.setMode(editorMode.x) in setParagraphMode located in zeppelin-web/src/app/notebook/paragraph/paragraph.controller.js.Install your interpreter binaryOnce you have built your interpreter, you can place it under the interpreter directory with all its dependencies.[ZEPPELIN_HOME]/interpreter/[INTERPRETER_NAME]/Configure your interpre
 terTo configure your interpreter you need to follow these steps:Add your interpreter class name to the zeppelin.interpreters property in conf/zeppelin-site.xml.Property value is comma separated [INTERPRETER_CLASS_NAME].For example,<property><name>zeppelin.interpreters</name><value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value></property>Add your interpreter to the default configuration which is used when there is no zeppelin-site.xml.Start Zeppelin by running ./bin/zeppelin-daemon.sh start.In the interpreter page, click the +Create button and configure your interpreter properties.Now you are done and ready to use your inter
 preter.Note : Interpreters released with zeppelin have a default configuration which is used when there is no conf/zeppelin-site.xml.Use your interpreter0.5.0Inside of a notebook, %[INTERPRETER_NAME] directive will call your interpreter.Note that the first interpreter configuration in zeppelin.interpreters will be the default one.For example,%myintpval a = "My interpreter"println(a)0.6.0 and laterInside of a notebook, %[INTERPRETER_GROUP].[INTERPRETER_NAME] directive will call your interpreter.You can omit either [INTERPRETER_GROUP] or [INTERPRETER_NAME]. If you omit [INTERPRETER_NAME], then first available interpreter will be selected in the [INTERPRETER_GROUP].Likewise, if you skip [INTERPRETER_GROUP], then [INTERPRETER_NAME] will be chosen from default interpreter group.For example, if you have two interpreter myintp1 and myintp2 in group mygrp, you can call myintp1 like%mygrp.myintp1codes for myintp1and you can call myintp2 like%mygrp.myintp2codes for myintp2If
  you omit your interpreter name, it'll select first available interpreter in the group ( myintp1 ).%mygrpcodes for myintp1You can only omit your interpreter group when your interpreter group is selected as a default group.%myintp2codes for myintp2ExamplesCheckout some interpreters released with Zeppelin by default.sparkmarkdownshelljdbcContributing a new Interpreter to Zeppelin releasesWe welcome contribution to a new interpreter. Please follow these few steps:First, check out the general contribution guide here.Follow the steps in Make your own Interpreter section above.Add your interpreter as in the Configure your interpreter section above; also add it to the example template zeppelin-site.xml.template.Add tests! They are run by Travis for all changes and it is important that they are self-contained.Include your interpreter as a module in pom.xml.Add documentation on how to use your interpreter under docs/interpreter/. Follow the Markdown style as this example. Make sure y
 ou list config settings and provide working examples on using your interpreter in code boxes in Markdown. Link to images as appropriate (images should go to docs/assets/themes/zeppelin/img/docs-img/). And add a link to your documentation in the navigation menu (docs/_includes/themes/zeppelin/_navigation.html).Most importantly, ensure licenses of the transitive closure of all dependencies are list in license file.Commit your changes and open a Pull Request on the project Mirror on GitHub; check to make sure Travis CI build is passing.",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Writing a New InterpreterWhat is Apache Zeppelin InterpreterApache Zeppelin Interpreter is a language backend. For example to use scala code in Zeppelin, you need a scala interpreter.Every Interpreters belongs to an InterpreterGroup.Interpreters in the same InterpreterGroup can reference each other. For example, SparkSqlInterpreter can reference SparkInterpreter to get SparkContext from it while they're in the same group.In
 terpreterSetting is configuration of a given InterpreterGroup and a unit of start/stop interpreter.All Interpreters in the same InterpreterSetting are launched in a single, separate JVM process. The Interpreter communicates with Zeppelin engine via Thrift.In 'Separate Interpreter(scoped / isolated) for each note' mode which you can see at the Interpreter Setting menu when you create a new interpreter, new interpreter instance will be created per notebook. But it still runs on the same JVM while they're in the same InterpreterSettings.Make your own InterpreterCreating a new interpreter is quite simple. Just extend org.apache.zeppelin.interpreter abstract class and implement some methods.You can include org.apache.zeppelin:zeppelin-interpreter:[VERSION] artifact in your build system. And you should put your jars under your interpreter directory with a specific directory name. Zeppelin server reads interpreter directories recursively and initializes interpreters
  including your own interpreter.There are three locations where you can store your interpreter group, name and other information. Zeppelin server tries to find the location below. Next, Zeppelin tries to find interpreter-setting.json in your interpreter jar.{ZEPPELIN_INTERPRETER_DIR}/{YOUR_OWN_INTERPRETER_DIR}/interpreter-setting.jsonHere is an example of interpreter-setting.json on your own interpreter. Note that if you don't specify editor object, your interpreter will use plain text mode for syntax highlighting.[  {    "group": "your-group",    "name": "your-name",    "className": "your.own.interpreter.class",    "properties": {      "properties1": {        "envName": null,        "propertyName": "property.1.name",        "defaultValue": "propertyDefa
 ultValue",        "description": "Property description"      },      "properties2": {        "envName": PROPERTIES_2,        "propertyName": null,        "defaultValue": "property2DefaultValue",        "description": "Property 2 description"      }, ...    },    "editor": {      "language": "your-syntax-highlight-language"    }  },  {    ...  }]Finally, Zeppelin uses static initialization with the following:static {  Interpreter.register("MyInterpreterName", MyClassName.class.getName());}Static initialization is deprecated and will be supported until 0.6.0.The name will appear later in the interpreter name option box during the interpreter configuration process.The name of the interpreter is what you later write to identify a paragraph which sh
 ould be interpreted using this interpreter.%MyInterpreterNamesome interpreter specific code...Programming Languages for InterpreterIf the interpreter uses a specific programming language (like Scala, Python, SQL), it is generally recommended to add a syntax highlighting supported for that to the notebook paragraph editor.  To check out the list of languages supported, see the mode-*.js files under zeppelin-web/bower_components/ace-builds/src-noconflict or from github.com/ajaxorg/ace-builds.  If you want to add a new set of syntax highlighting,  Add the mode-*.js file to zeppelin-web/bower.json ( when built, zeppelin-web/src/index.html will be changed automatically. ).Add editor object to interpreter-setting.json file. If you want to set your language to java for example, add:"editor": {  "language": "java"}Install your interpreter binaryOnce you have built your interpreter, you can place it under the interpreter directory with al
 l its dependencies.[ZEPPELIN_HOME]/interpreter/[INTERPRETER_NAME]/Configure your interpreterTo configure your interpreter you need to follow these steps:Add your interpreter class name to the zeppelin.interpreters property in conf/zeppelin-site.xml.Property value is comma separated [INTERPRETER_CLASS_NAME].For example,<property><name>zeppelin.interpreters</name><value>org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,org.apache.zeppelin.hive.HiveInterpreter,com.me.MyNewInterpreter</value></property>Add your interpreter to the default configuration which is used when there is no zeppelin-site.xml.Start Zeppelin by running ./bin/zeppelin-daemon.sh start.In the interpreter page, click the +Create butt
 on and configure your interpreter properties.Now you are done and ready to use your interpreter.Note : Interpreters released with zeppelin have a default configuration which is used when there is no conf/zeppelin-site.xml.Use your interpreter0.5.0Inside of a notebook, %[INTERPRETER_NAME] directive will call your interpreter.Note that the first interpreter configuration in zeppelin.interpreters will be the default one.For example,%myintpval a = "My interpreter"println(a)0.6.0 and laterInside of a notebook, %[INTERPRETER_GROUP].[INTERPRETER_NAME] directive will call your interpreter.You can omit either [INTERPRETER_GROUP] or [INTERPRETER_NAME]. If you omit [INTERPRETER_NAME], then first available interpreter will be selected in the [INTERPRETER_GROUP].Likewise, if you skip [INTERPRETER_GROUP], then [INTERPRETER_NAME] will be chosen from default interpreter group.For example, if you have two interpreter myintp1 and myintp2 in group mygrp, you can call myintp1 like%myg
 rp.myintp1codes for myintp1and you can call myintp2 like%mygrp.myintp2codes for myintp2If you omit your interpreter name, it'll select first available interpreter in the group ( myintp1 ).%mygrpcodes for myintp1You can only omit your interpreter group when your interpreter group is selected as a default group.%myintp2codes for myintp2ExamplesCheckout some interpreters released with Zeppelin by default.sparkmarkdownshelljdbcContributing a new Interpreter to Zeppelin releasesWe welcome contribution to a new interpreter. Please follow these few steps:First, check out the general contribution guide here.Follow the steps in Make your own Interpreter section above.Add your interpreter as in the Configure your interpreter section above; also add it to the example template zeppelin-site.xml.template.Add tests! They are run by Travis for all changes and it is important that they are self-contained.Include your interpreter as a module in pom.xml.Add documentation on how to use your in
 terpreter under docs/interpreter/. Follow the Markdown style as this example. Make sure you list config settings and provide working examples on using your interpreter in code boxes in Markdown. Link to images as appropriate (images should go to docs/assets/themes/zeppelin/img/docs-img/). And add a link to your documentation in the navigation menu (docs/_includes/themes/zeppelin/_navigation.html).Most importantly, ensure licenses of the transitive closure of all dependencies are list in license file.Commit your changes and open a Pull Request on the project Mirror on GitHub; check to make sure Travis CI build is passing.",
       "url": " /development/writingzeppelininterpreter.html",
       "group": "development",
       "excerpt": "Apache Zeppelin Interpreter is a language backend. Every Interpreters belongs to an InterpreterGroup. Interpreters in the same InterpreterGroup can reference each other."
@@ -93,10 +93,10 @@
 
     "/install/install.html": {
       "title": "Quick Start",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Quick StartWelcome to your first trial to explore Apache Zeppelin! This page will help you to get started and here is the list of topics covered.InstallationApache Zeppelin officially supports and is tested on next environments.      Name    Value        Oracle JDK    1.7  (set JAVA_HOME)        OS    Mac OSX  Ubuntu 14.X  CentOS 6.X  Windows 7 Pro SP1  There are two options to install Apache Zeppelin on your machine. One is downlo
 ading pre-built binary package from the archive. You can download not only the latest stable version but also the older one if you need. The other option is building from the source.Although it can be unstable somehow since it is on development status, you can explore newly added feature and change it as you want.Downloading Binary PackageIf you want to install Apache Zeppelin with a stable binary package, please visit Apache Zeppelin download Page. If you have downloaded netinst binary, install additional interpreters before you start Zeppelin. Or simply run ./bin/install-interpreter.sh --all.After unpacking, jump to Starting Apache Zeppelin with Command Line section.Building from SourceIf you want to build from the source, the software below needs to be installed on your system.      Name    Value        Git            Maven    3.1.x or higher  If you don't have it installed yet, please check Before Build section and follow step by step instructions from there.1. Clone Apa
 che Zeppelin repositorygit clone https://github.com/apache/zeppelin.git2. Build source with optionsEach interpreters requires different build options. For the further information about options, please see Build section.mvn clean package -DskipTests [Options]Here are some examples with several options# build with spark-2.0, scala-2.11./dev/change_scala_version.sh 2.11mvn clean package -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pscala-2.11# build with spark-1.6, scala-2.10mvn clean package -Pspark-1.6 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr# spark-cassandra integrationmvn clean package -Pcassandra-spark-1.5 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests# with CDHmvn clean package -Pspark-1.5 -Dhadoop.version=2.6.0-cdh5.5.0 -Phadoop-2.6 -Pvendor-repo -DskipTests# with MapRmvn clean package -Pspark-1.5 -Pmapr50 -DskipTestsFor the further information about building with source, please see README.md in Zeppelin repository.Starting Apache Zeppelin with Command LineStart Zeppelinbi
 n/zeppelin-daemon.sh startIf you are using Windows binzeppelin.cmdAfter successful start, visit http://localhost:8080 with your web browser.Stop Zeppelinbin/zeppelin-daemon.sh stop(Optional) Start Apache Zeppelin with a service managerNote : The below description was written based on Ubuntu Linux.Apache Zeppelin can be auto started as a service with an init script, such as services managed by upstart.The following is an example of upstart script to be saved as /etc/init/zeppelin.confThis also allows the service to be managed with commands such assudo service zeppelin start  sudo service zeppelin stop  sudo service zeppelin restartOther service managers could use a similar approach with the upstart argument passed to the zeppelin-daemon.sh script.bin/zeppelin-daemon.sh upstartzeppelin.confdescription "zeppelin"start on (local-filesystems and net-device-up IFACE!=lo)stop on shutdown# Respawn the process on unexpected terminationrespawn# respawn the job up to 7 times 
 within a 5 second period.# If the job exceeds these values, it will be stopped and marked as failed.respawn limit 7 5# zeppelin was installed in /usr/share/zeppelin in this examplechdir /usr/share/zeppelinexec bin/zeppelin-daemon.sh upstartWhat is the next?Congratulation on your successful Apache Zeppelin installation! Here are two next steps you might need.If you are new to Apache ZeppelinFor an in-depth overview of Apache Zeppelin UI, head to Explore Apache Zeppelin UI.After getting familiar with Apache Zeppelin UI, have fun with a short walk-through Tutorial that uses Apache Spark backend.If you need more configuration setting for Apache Zeppelin, jump to the next section: Apache Zeppelin Configuration.If you need more information about Spark or JDBC interpreter settingApache Zeppelin provides deep integration with Apache Spark. For the further informtation, see Spark Interpreter for Apache Zeppelin. Also, you can use generic JDBC connections in Apache Zeppelin. Go to Generic JDB
 C Interpreter for Apache Zeppelin.If you are in multi-user environmentYou can set permissions for your notebooks and secure data resource in multi-user environment. Go to More -> Security section.Apache Zeppelin ConfigurationYou can configure Apache Zeppelin with both environment variables in conf/zeppelin-env.sh (confzeppelin-env.cmd for Windows) and Java properties in conf/zeppelin-site.xml. If both are defined, then the environment variables will take priority.      zeppelin-env.sh    zeppelin-site.xml    Default value    Description        ZEPPELIN_PORT    zeppelin.server.port    8080    Zeppelin server port        ZEPPELIN_MEM    N/A    -Xmx1024m -XX:MaxPermSize=512m    JVM mem options        ZEPPELIN_INTP_MEM    N/A    ZEPPELIN_MEM    JVM mem options for interpreter process        ZEPPELIN_JAVA_OPTS    N/A        JVM options        ZEPPELIN_ALLOWED_ORIGINS    zeppelin.server.allowed.origins    *    Enables a way to specify a ',' separated list of allowed origins
  for rest and websockets.  i.e. http://localhost:8080           N/A    zeppelin.anonymous.allowed    true    Anonymous user is allowed by default.        ZEPPELIN_SERVER_CONTEXT_PATH    zeppelin.server.context.path    /    A context path of the web application        ZEPPELIN_SSL    zeppelin.ssl    false            ZEPPELIN_SSL_CLIENT_AUTH    zeppelin.ssl.client.auth    false            ZEPPELIN_SSL_KEYSTORE_PATH    zeppelin.ssl.keystore.path    keystore            ZEPPELIN_SSL_KEYSTORE_TYPE    zeppelin.ssl.keystore.type    JKS            ZEPPELIN_SSL_KEYSTORE_PASSWORD    zeppelin.ssl.keystore.password                ZEPPELIN_SSL_KEY_MANAGER_PASSWORD    zeppelin.ssl.key.manager.password                ZEPPELIN_SSL_TRUSTSTORE_PATH    zeppelin.ssl.truststore.path                ZEPPELIN_SSL_TRUSTSTORE_TYPE    zeppelin.ssl.truststore.type                ZEPPELIN_SSL_TRUSTSTORE_PASSWORD    zeppelin.ssl.truststore.password                ZEPPELIN_NOTEBOOK_HOMESCREEN    zeppelin.notebook.
 homescreen        A notebook id displayed in Apache Zeppelin homescreen i.e. 2A94M5J1Z        ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE    zeppelin.notebook.homescreen.hide    false    This value can be "true" when to hide the notebook id set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For the further information, please read Customize your Zeppelin homepage.        ZEPPELIN_WAR_TEMPDIR    zeppelin.war.tempdir    webapps    A location of jetty temporary directory        ZEPPELIN_NOTEBOOK_DIR    zeppelin.notebook.dir    notebook    The root directory where notebook directories are saved        ZEPPELIN_NOTEBOOK_S3_BUCKET    zeppelin.notebook.s3.bucket    zeppelin    S3 Bucket where notebook files will be saved        ZEPPELIN_NOTEBOOK_S3_USER    zeppelin.notebook.s3.user    user    A user name of S3 bucketi.e. bucket/user/notebook/2A94M5J1Z/note.json        ZEPPELIN_NOTEBOOK_S3_ENDPOINT    zeppelin.notebook.s3.endpoint    s3.amazonaws.com    Endpoint for the 
 bucket        ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID    zeppelin.notebook.s3.kmsKeyID        AWS KMS Key ID to use for encrypting data in S3 (optional)        ZEPPELIN_NOTEBOOK_S3_EMP    zeppelin.notebook.s3.encryptionMaterialsProvider        Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional)        ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING    zeppelin.notebook.azure.connectionString        The Azure storage account connection stringi.e. DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey>        ZEPPELIN_NOTEBOOK_AZURE_SHARE    zeppelin.notebook.azure.share    zeppelin    Share where the notebook files will be saved        ZEPPELIN_NOTEBOOK_AZURE_USER    zeppelin.notebook.azure.user    user    An optional user name of Azure file sharei.e. share/user/notebook/2A94M5J1Z/note.json        ZEPPELIN_NOTEBOOK_STORAGE    zeppelin.notebook.storage    org.apache.zeppelin.notebook.
 repo.VFSNotebookRepo    Comma separated list of notebook storage        ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC    zeppelin.notebook.one.way.sync    false    If there are multiple notebook storages, should we treat the first one as the only source of truth?        ZEPPELIN_INTERPRETERS    zeppelin.interpreters      org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,    ...              Comma separated interpreter configurations [Class]       NOTE: This property is deprecated since Zeppelin-0.6.0 and will not be supported from Zeppelin-0.7.0            ZEPPELIN_INTERPRETER_DIR    zeppelin.interpreter.dir    interpreter    Interpreter directory        ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE    zeppelin.websocket.max.text.message.size    1024000    Size in characters of the maximum text messa
 ge to be received by websocket.  ",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Quick StartWelcome to Apache Zeppelin! On this page are instructions to help you get started.InstallationApache Zeppelin officially supports and is tested on the following environments:      Name    Value        Oracle JDK    1.7  (set JAVA_HOME)        OS    Mac OSX  Ubuntu 14.X  CentOS 6.X  Windows 7 Pro SP1  To install Apache Zeppelin, you have two options:You can download pre-built binary packages from the archive. This is usua
 lly easier than building from source, and you can download the latest stable version (or older versions, if necessary).You can also build from source. This gives you a development version of Zeppelin, which is more unstable but has new features.Downloading Binary PackageStable binary packages are available on the Apache Zeppelin Download Page. You can download a default package with all interpreters, or you can download the net-install package, which lets you choose which interpreters to install.If you downloaded the default package, just unpack it in a directory of your choice and you're ready to go. If you downloaded the net-install package, you should manually install additional interpreters first. You can also install everything by running ./bin/install-interpreter.sh --all.After unpacking, jump to the Starting Apache Zeppelin with Command Line.Building from SourceIf you want to build from source, you must first install the following dependencies:      Name    Value     
    Git    (Any Version)        Maven    3.1.x or higher  If you haven't installed Git and Maven yet, check the Before Build section and follow the step by step instructions from there.1. Clone the Apache Zeppelin repositorygit clone https://github.com/apache/zeppelin.git2. Build source with optionsEach interpreter requires different build options. For more information about build options, please see the Build section.mvn clean package -DskipTests [Options]Here are some examples with several options:# build with spark-2.0, scala-2.11./dev/change_scala_version.sh 2.11mvn clean package -Pspark-2.0 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr -Pscala-2.11# build with spark-1.6, scala-2.10mvn clean package -Pspark-1.6 -Phadoop-2.4 -Pyarn -Ppyspark -Psparkr# spark-cassandra integrationmvn clean package -Pcassandra-spark-1.5 -Dhadoop.version=2.6.0 -Phadoop-2.6 -DskipTests# with CDHmvn clean package -Pspark-1.5 -Dhadoop.version=2.6.0-cdh5.5.0 -Phadoop-2.6 -Pvendor-repo -DskipTests# with M
 apRmvn clean package -Pspark-1.5 -Pmapr50 -DskipTestsFor further information about building from source, please see README.md in the Zeppelin repository.Starting Apache Zeppelin from the Command LineStarting Apache ZeppelinOn all platforms except for Windows:bin/zeppelin-daemon.sh startIf you are using Windows:binzeppelin.cmdAfter Zeppelin has started successfully, go to http://localhost:8080 with your web browser.Stopping Zeppelinbin/zeppelin-daemon.sh stop(Optional) Start Apache Zeppelin with a service managerNote : The below description was written based on Ubuntu Linux.Apache Zeppelin can be auto-started as a service with an init script, using a service manager like upstart.This is an example upstart script saved as /etc/init/zeppelin.confThis allows the service to be managed with commands such assudo service zeppelin start  sudo service zeppelin stop  sudo service zeppelin restartOther service managers could use a similar approach with the upstart argument passed to the zeppeli
 n-daemon.sh script.bin/zeppelin-daemon.sh upstartzeppelin.confdescription "zeppelin"start on (local-filesystems and net-device-up IFACE!=lo)stop on shutdown# Respawn the process on unexpected terminationrespawn# respawn the job up to 7 times within a 5 second period.# If the job exceeds these values, it will be stopped and marked as failed.respawn limit 7 5# zeppelin was installed in /usr/share/zeppelin in this examplechdir /usr/share/zeppelinexec bin/zeppelin-daemon.sh upstartNext Steps:Congratulations, you have successfully installed Apache Zeppelin! Here are two next steps you might find useful:If you are new to Apache Zeppelin...For an in-depth overview of the Apache Zeppelin UI, head to Explore Apache Zeppelin UI.After getting familiar with the Apache Zeppelin UI, have fun with a short walk-through Tutorial that uses the Apache Spark backend.If you need more configuration for Apache Zeppelin, jump to the next section: Apache Zeppelin Configuration.If you need 
 more information about Spark or JDBC interpreter settings...Apache Zeppelin provides deep integration with Apache Spark. For more information, see Spark Interpreter for Apache Zeppelin. You can also use generic JDBC connections in Apache Zeppelin. Go to Generic JDBC Interpreter for Apache Zeppelin.If you are in a multi-user environment...You can set permissions for your notebooks and secure data resources in a multi-user environment. Go to More -> Security section.Apache Zeppelin ConfigurationYou can configure Apache Zeppelin with either environment variables in conf/zeppelin-env.sh (confzeppelin-env.cmd for Windows) or Java properties in conf/zeppelin-site.xml. If both are defined, then the environment variables will take priority.      zeppelin-env.sh    zeppelin-site.xml    Default value    Description        ZEPPELIN_PORT    zeppelin.server.port    8080    Zeppelin server port        ZEPPELIN_MEM    N/A    -Xmx1024m -XX:MaxPermSize=512m    JVM mem options        ZEPPELIN_
 INTP_MEM    N/A    ZEPPELIN_MEM    JVM mem options for interpreter process        ZEPPELIN_JAVA_OPTS    N/A        JVM options        ZEPPELIN_ALLOWED_ORIGINS    zeppelin.server.allowed.origins    *    Enables a way to specify a ',' separated list of allowed origins for REST and websockets.  i.e. http://localhost:8080           N/A    zeppelin.anonymous.allowed    true    The anonymous user is allowed by default.        ZEPPELIN_SERVER_CONTEXT_PATH    zeppelin.server.context.path    /    Context path of the web application        ZEPPELIN_SSL    zeppelin.ssl    false            ZEPPELIN_SSL_CLIENT_AUTH    zeppelin.ssl.client.auth    false            ZEPPELIN_SSL_KEYSTORE_PATH    zeppelin.ssl.keystore.path    keystore            ZEPPELIN_SSL_KEYSTORE_TYPE    zeppelin.ssl.keystore.type    JKS            ZEPPELIN_SSL_KEYSTORE_PASSWORD    zeppelin.ssl.keystore.password                ZEPPELIN_SSL_KEY_MANAGER_PASSWORD    zeppelin.ssl.key.manager.password                ZEPPELIN_S
 SL_TRUSTSTORE_PATH    zeppelin.ssl.truststore.path                ZEPPELIN_SSL_TRUSTSTORE_TYPE    zeppelin.ssl.truststore.type                ZEPPELIN_SSL_TRUSTSTORE_PASSWORD    zeppelin.ssl.truststore.password                ZEPPELIN_NOTEBOOK_HOMESCREEN    zeppelin.notebook.homescreen        Display notebook IDs on the Apache Zeppelin homescreen i.e. 2A94M5J1Z        ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE    zeppelin.notebook.homescreen.hide    false    Hide the notebook ID set by ZEPPELIN_NOTEBOOK_HOMESCREEN on the Apache Zeppelin homescreen. For the further information, please read Customize your Zeppelin homepage.        ZEPPELIN_WAR_TEMPDIR    zeppelin.war.tempdir    webapps    Location of the jetty temporary directory        ZEPPELIN_NOTEBOOK_DIR    zeppelin.notebook.dir    notebook    The root directory where notebook directories are saved        ZEPPELIN_NOTEBOOK_S3_BUCKET    zeppelin.notebook.s3.bucket    zeppelin    S3 Bucket where notebook files will be saved        ZEPPELIN_N
 OTEBOOK_S3_USER    zeppelin.notebook.s3.user    user    User name of an S3 bucketi.e. bucket/user/notebook/2A94M5J1Z/note.json        ZEPPELIN_NOTEBOOK_S3_ENDPOINT    zeppelin.notebook.s3.endpoint    s3.amazonaws.com    Endpoint for the bucket        ZEPPELIN_NOTEBOOK_S3_KMS_KEY_ID    zeppelin.notebook.s3.kmsKeyID        AWS KMS Key ID to use for encrypting data in S3 (optional)        ZEPPELIN_NOTEBOOK_S3_EMP    zeppelin.notebook.s3.encryptionMaterialsProvider        Class name of a custom S3 encryption materials provider implementation to use for encrypting data in S3 (optional)        ZEPPELIN_NOTEBOOK_AZURE_CONNECTION_STRING    zeppelin.notebook.azure.connectionString        The Azure storage account connection stringi.e. DefaultEndpointsProtocol=https;AccountName=<accountName>;AccountKey=<accountKey>        ZEPPELIN_NOTEBOOK_AZURE_SHARE    zeppelin.notebook.azure.share    zeppelin    Azure Share where the notebook files will be saved        ZEPPELIN_
 NOTEBOOK_AZURE_USER    zeppelin.notebook.azure.user    user    Optional user name of an Azure file sharei.e. share/user/notebook/2A94M5J1Z/note.json        ZEPPELIN_NOTEBOOK_STORAGE    zeppelin.notebook.storage    org.apache.zeppelin.notebook.repo.VFSNotebookRepo    Comma separated list of notebook storage locations        ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC    zeppelin.notebook.one.way.sync    false    If there are multiple notebook storage locations, should we treat the first one as the only source of truth?        ZEPPELIN_INTERPRETERS    zeppelin.interpreters      org.apache.zeppelin.spark.SparkInterpreter,org.apache.zeppelin.spark.PySparkInterpreter,org.apache.zeppelin.spark.SparkSqlInterpreter,org.apache.zeppelin.spark.DepInterpreter,org.apache.zeppelin.markdown.Markdown,org.apache.zeppelin.shell.ShellInterpreter,    ...              Comma separated interpreter configurations [Class]       NOTE: This property is deprecated since Zeppelin-0.6.0 and will not be supported from Zeppeli
 n-0.7.0 on.            ZEPPELIN_INTERPRETER_DIR    zeppelin.interpreter.dir    interpreter    Interpreter directory        ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE    zeppelin.websocket.max.text.message.size    1024000    Size (in characters) of the maximum text message that can be received by websocket.  ",
       "url": " /install/install.html",
       "group": "install",
-      "excerpt": "This page will help you to get started and guide you through installation of Apache Zeppelin, running it in the command line and basic configuration options."
+      "excerpt": "This page will help you get started and will guide you through installing Apache Zeppelin, running it in the command line and configuring options."
     }
     ,
     
@@ -157,6 +157,17 @@
     
   
 
+    "/interpreter/beam.html": {
+      "title": "Beam interpreter in Apache Zeppelin",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Beam interpreter for Apache ZeppelinOverviewApache Beam is an open source unified platform for data processing pipelines. A pipeline can be built using one of the Beam SDKs.The execution of the pipeline is done by different Runners. Currently, Beam supports Apache Flink Runner, Apache Spark Runner, and Google Dataflow Runner.How to useBasically, you can write normal Beam java code where you can determine the Runner. You should writ
 e the main method inside a class because the interpreter invokes this main to execute the pipeline. Unlike Zeppelin's normal pattern, each paragraph is considered a separate job; there isn't any relation to any other paragraph.The following is a demonstration of a word count example with data represented in array of stringsBut it can read data from files by replacing Create.of(SENTENCES).withCoder(StringUtf8Coder.of()) with TextIO.Read.from("path/to/filename.txt")%beam// most used importsimport org.apache.beam.sdk.coders.StringUtf8Coder;import org.apache.beam.sdk.transforms.Create;import java.io.Serializable;import java.util.Arrays;import java.util.List;import java.util.ArrayList;import org.apache.spark.api.java.*;import org.apache.spark.api.java.function.Function;import org.apache.spark.SparkConf;import org.apache.spark.streaming.*;import org.apache.spark.SparkContext;import org.apache.beam.runners.direct.*;import org.apache.beam.sdk.runners.*;import org.a
 pache.beam.sdk.options.*;import org.apache.beam.runners.spark.*;import org.apache.beam.runners.spark.io.ConsoleIO;import org.apache.beam.runners.flink.*;import org.apache.beam.runners.flink.examples.WordCount.Options;import org.apache.beam.sdk.Pipeline;import org.apache.beam.sdk.io.TextIO;import org.apache.beam.sdk.options.PipelineOptionsFactory;import org.apache.beam.sdk.transforms.Count;import org.apache.beam.sdk.transforms.DoFn;import org.apache.beam.sdk.transforms.MapElements;import org.apache.beam.sdk.transforms.ParDo;import org.apache.beam.sdk.transforms.SimpleFunction;import org.apache.beam.sdk.values.KV;import org.apache.beam.sdk.options.PipelineOptions;public class MinimalWordCount {  static List<String> s = new ArrayList<>();  static final String[] SENTENCES_ARRAY = new String[] {    "Hadoop is the Elephant King!",    "A yellow and elegant thing.",    "He never forgets",    "Useful d
 ata, or lets",    "An extraneous element cling!",    "A wonderful king is Hadoop.",    "The elephant plays well with Sqoop.",    "But what helps him to thrive",    "Are Impala, and Hive,",    "And HDFS in the group.",    "Hadoop is an elegant fellow.",    "An elephant gentle and mellow.",    "He never gets mad,",    "Or does anything bad,",    "Because, at his core, he is yellow",    };    static final List<String> SENTENCES = Arrays.asList(SENTENCES_ARRAY);  public static void main(String[] args) {    Options options = PipelineOptionsFactory.create().as(Options.class);    options.setRunner(FlinkRunner.class);    Pipeline p = Pipeline.create(options);    p.apply(Create.of(SENTENCES).withCoder(StringUtf8Coder.of()))         .apply("ExtractWords", Pa
 rDo.of(new DoFn<String, String>() {           @Override           public void processElement(ProcessContext c) {             for (String word : c.element().split("[^a-zA-Z']+")) {               if (!word.isEmpty()) {                 c.output(word);               }             }           }         }))        .apply(Count.<String> perElement())        .apply("FormatResults", ParDo.of(new DoFn<KV<String, Long>, String>() {          @Override          public void processElement(DoFn<KV<String, Long>, String>.ProcessContext arg0)            throws Exception {            s.add("n" + arg0.element().getKey() + "t" + arg0.element().getValue());            }        }));    p.run();    System.out.println("%table wordtcount");    for (int i = 0; i < s.size(); i++) {      System.out.print(s.get(i));    }  }}"
 ,
+      "url": " /interpreter/beam.html",
+      "group": "interpreter",
+      "excerpt": "Apache Beam is an open source, unified programming model that you can use to create a data processing pipeline."
+    }
+    ,
+    
+  
+
     "/interpreter/bigquery.html": {
       "title": "BigQuery Interpreter for Apache Zeppelin",
       "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->BigQuery Interpreter for Apache ZeppelinOverviewBigQuery is a highly scalable no-ops data warehouse in the Google Cloud Platform. Querying massive datasets can be time consuming and expensive without the right hardware and infrastructure. Google BigQuery solves this problem by enabling super-fast SQL queries against append-only tables using the processing power of Google's infrastructure. Simply move your data into BigQuery
  and let us handle the hard work. You can control access to both the project and your data based on your business needs, such as giving others the ability to view or query your data.  Configuration      Name    Default Value    Description        zeppelin.bigquery.project_id          Google Project Id        zeppelin.bigquery.wait_time    5000    Query Timeout in Milliseconds        zeppelin.bigquery.max_no_of_rows    100000    Max result set size  BigQuery APIZeppelin is built against BigQuery API version v2-rev265-1.21.0 - API JavadocsEnabling the BigQuery InterpreterIn a notebook, to enable the BigQuery interpreter, click the Gear icon and select bigquery.Setup service account credentialsIn order to run BigQuery interpreter outside of Google Cloud Engine you need to provide authentication credentials, by following these instructions:Go to the API Console Credentials pageFrom the project drop-down, select your project.On the Credentials page, select the Create credentials drop-down,
  then select Service account key.From the Service account drop-down, select an existing service account or create a new one.For Key type, select the JSON key option, then select Create. The file automatically downloads to your computer.Put the *.json file you just downloaded in a directory of your choosing. This directory must be private (you can't let anyone get access to this), but accessible to your Zeppelin instance.Set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file downloaded.either through the GUI: in interpreter configuration page property names in CAPITAL_CASE set up env varsor through zeppelin-env.sh: just add it to the end of the file.Using the BigQuery InterpreterIn a paragraph, use %bigquery.sql to select the BigQuery interpreter and then input SQL statements against your datasets stored in BigQuery.You can use BigQuery SQL Reference to build your own SQL.For Example, SQL to query for top 10 departure delays across airports using t
 he flights public dataset%bigquery.sqlSELECT departure_airport,count(case when departure_delay>0 then 1 else 0 end) as no_of_delays FROM [bigquery-samples:airline_ontime_data.flights] group by departure_airport order by 2 desc limit 10Another Example, SQL to query for most commonly used java packages from the github data hosted in BigQuery %bigquery.sqlSELECT  package,  COUNT(*) countFROM (  SELECT    REGEXP_EXTRACT(line, r' ([a-z0-9._]*).') package,    id  FROM (    SELECT      SPLIT(content, 'n') line,      id    FROM      [bigquery-public-data:github_repos.sample_contents]    WHERE      content CONTAINS 'import'      AND sample_path LIKE '%.java'    HAVING      LEFT(line, 6)='import' )  GROUP BY    package,    id )GROUP BY  1ORDER BY  count DESCLIMIT  40Technical descriptionFor in-depth technical details on current implementation please refer to bigquery/README.md.",
@@ -291,7 +302,7 @@
 
     "/interpreter/markdown.html": {
       "title": "Markdown Interpreter for Apache Zeppelin",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Markdown Interpreter for Apache ZeppelinOverviewMarkdown is a plain text formatting syntax designed so that it can be converted to HTML.Apache Zeppelin uses markdown4j. For more examples and extension support, please checkout here.In Zeppelin notebook, you can use %md in the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.In Zeppelin, Markdown interpreter is enabled by d
 efault.ExampleThe following example demonstrates the basic usage of Markdown in a Zeppelin notebook.",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Markdown Interpreter for Apache ZeppelinOverviewMarkdown is a plain text formatting syntax designed so that it can be converted to HTML.Apache Zeppelin uses markdown4j and pegdown as markdown parsers.In Zeppelin notebook, you can use %md in the beginning of a paragraph to invoke the Markdown interpreter and generate static html from Markdown plain text.In Zeppelin, Markdown interpreter is enabled by default and uses the markdown4j 
 parser.Configuration      Name    Default Value    Description        markdown.parser.type    markdown4j    Markdown Parser Type.  Available values: markdown4j, pegdown.  ExampleThe following example demonstrates the basic usage of Markdown in a Zeppelin notebook.Markdown4j Parsermarkdown4j parser provides YUML and Websequence extensions Pegdown Parserpegdown parser provides github flavored markdown.",
       "url": " /interpreter/markdown.html",
       "group": "interpreter",
       "excerpt": "Markdown is a plain text formatting syntax designed so that it can be converted to HTML. Apache Zeppelin uses markdown4j."
@@ -357,10 +368,10 @@
 
     "/interpreter/spark.html": {
       "title": "Apache Spark Interpreter for Apache Zeppelin",
-      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Spark Interpreter for Apache ZeppelinOverviewApache Spark is a fast and general-purpose cluster computing system.It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphsApache Spark is supported in Zeppelin withSpark Interpreter group, which consists of five interpreters.      Name    Class    Description        %spark    SparkInterpreter    Creates a SparkContext and 
 provides a scala environment        %pyspark    PySparkInterpreter    Provides a python environment        %r    SparkRInterpreter    Provides an R environment with SparkR support        %sql    SparkSQLInterpreter    Provides a SQL environment        %dep    DepInterpreter    Dependency loader  ConfigurationThe Spark interpreter can be configured with properties provided by Zeppelin.You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties.      Property    Default    Description        args        Spark commandline args      master    local[*]    Spark master uri.  ex) spark://masterhost:7077      spark.app.name    Zeppelin    The name of spark application.        spark.cores.max        Total number of cores to use.  Empty value uses all available core.        spark.executor.memory     1g    Executor memory per worker instance.  ex) 512m, 32g        zeppelin.dep.additionalRemoteRepository    
 spark-packages,  http://dl.bintray.com/spark-packages/maven,  false;    A list of id,remote-repository-URL,is-snapshot;  for each remote repository.        zeppelin.dep.localrepo    local-repo    Local repository for dependency loader        zeppelin.pyspark.python    python    Python command to run pyspark with        zeppelin.spark.concurrentSQL    false    Execute multiple SQL concurrently if set true.        zeppelin.spark.maxResult    1000    Max number of Spark SQL result to display.        zeppelin.spark.printREPLOutput    true    Print REPL output        zeppelin.spark.useHiveContext    true    Use HiveContext instead of SQLContext if it is true.        zeppelin.spark.importImplicit    true    Import implicits, UDF collection, and sql if set true.  Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need to follow below two simple steps.1. Export SPARK_HOMEIn conf/zeppelin-env.sh, expor
 t SPARK_HOME environment variable with your Spark installation path.for exampleexport SPARK_HOME=/usr/lib/sparkYou can optionally export HADOOP_CONF_DIR and SPARK_SUBMIT_OPTIONSexport HADOOP_CONF_DIR=/usr/lib/hadoopexport SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0"For Windows, ensure you have winutils.exe in %HADOOP_HOME%bin. For more details please see Problems running Hadoop on Windows2. Set master in Interpreter menuAfter start Zeppelin, go to Interpreter menu and edit master property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.for example,local[*] in local modespark://master:7077 in standalone clusteryarn-client in Yarn client modemesos://host:5050 in Mesos clusterThat's it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way. (Zeppelin 0.5.6-incubating release works up to Spark 1.6.1 )Note that without exporting S
 PARK_HOME, it's running in local mode with included version of Spark. The included version may vary depending on the build profile.SparkContext, SQLContext, ZeppelinContextSparkContext, SQLContext, ZeppelinContext are automatically created and exposed as variable names 'sc', 'sqlContext' and 'z', respectively, both in scala and python environments.Note that scala / python environment shares the same SparkContext, SQLContext, ZeppelinContext instance. Dependency ManagementThere are two ways to load external library in spark interpreter. First is using Interpreter setting menu and second is loading Spark properties.1. Setting Dependencies via Interpreter SettingPlease see Dependency Management for the details.2. Loading Spark PropertiesOnce SPARK_HOME is set in conf/zeppelin-env.sh, Zeppelin uses spark-submit as spark interpreter runner. spark-submit supports two ways to load configurations. The first is command line options such
  as --master and Zeppelin can pass these options to spark-submit by exporting SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh. Second is reading configuration options from SPARK_HOME/conf/spark-defaults.conf. Spark properites that user can set to distribute libraries are:      spark-defaults.conf    SPARK_SUBMIT_OPTIONS    Applicable Interpreter    Description        spark.jars    --jars    %spark    Comma-separated list of local jars to include on the driver and executor classpaths.        spark.jars.packages    --packages    %spark    Comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version.        spark.files    --files    %pyspark    Comma-separated list of files to be placed in the working directory of each executor.  Note that adding jar to pyspark is only
  availabe via %dep interpreter at the moment.Here are few examples:SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.shexport SPARKSUBMITOPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0 --jars /path/mylib1.jar,/path/mylib2.jar --files /path/mylib1.py,/path/mylib2.zip,/path/mylib3.egg"SPARK_HOME/conf/spark-defaults.confspark.jars        /path/mylib1.jar,/path/mylib2.jarspark.jars.packages   com.databricks:spark-csv_2.10:1.2.0spark.files       /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip3. Dynamic Dependency Loading via %dep interpreterNote: %dep interpreter is deprecated since v0.6.0.%dep interpreter load libraries to %spark and %pyspark but not to  %spark.sql interpreter so we recommend you to use first option instead.When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs using %dep interpreter.Load libraries recursively from Maven repositoryLoad libraries from local filesystemAdd additional m
 aven repositoryAutomatically add libraries to SparkCluster (You can turn off)Dep interpreter leverages scala environment. So you can write any Scala code here.Note that %dep interpreter should be used before %spark, %pyspark, %sql.Here's usages.%depz.reset() // clean up previously added artifact and repository// add maven repositoryz.addRepo("RepoName").url("RepoURL")// add maven snapshot repositoryz.addRepo("RepoName").url("RepoURL").snapshot()// add credentials for private maven repositoryz.addRepo("RepoName").url("RepoURL").username("username").password("password")// add artifact from filesystemz.load("/path/to.jar")// add artifact from maven repository, with no dependencyz.load("groupId:artifactId:version").excludeAll()// add artifact recursivelyz.load("groupId:artifactId:version"
 )// add artifact recursively except comma separated GroupID:ArtifactId listz.load("groupId:artifactId:version").exclude("groupId:artifactId,groupId:artifactId, ...")// exclude with patternz.load("groupId:artifactId:version").exclude(*)z.load("groupId:artifactId:version").exclude("groupId:artifactId:*")z.load("groupId:artifactId:version").exclude("groupId:*")// local() skips adding artifact to spark clusters (skipping sc.addJar())z.load("groupId:artifactId:version").local()ZeppelinContextZeppelin automatically injects ZeppelinContext as variable 'z' in your scala/python environment. ZeppelinContext provides some additional functions and utility.Object ExchangeZeppelinContext extends map and it's shared between scala, python environment.So you can put some object from scala and read it from python, vise versa.  // P
 ut object from scala%sparkval myObject = ...z.put("objName", myObject)    # Get object from python%pysparkmyObject = z.get("objName")  Form CreationZeppelinContext provides functions for creating forms.In scala and python environments, you can create forms programmatically.  %spark/* Create text input form */z.input("formName")/* Create text input form with default value */z.input("formName", "defaultValue")/* Create select form */z.select("formName", Seq(("option1", "option1DisplayName"),                         ("option2", "option2DisplayName")))/* Create select form with default value*/z.select("formName", "option1", Seq(("option1", "option1DisplayName"),                                    ("option2", "opt
 ion2DisplayName")))    %pyspark# Create text input formz.input("formName")# Create text input form with default valuez.input("formName", "defaultValue")# Create select formz.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")])# Create select form with default valuez.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")], "option1")  In sql environment, you can create form in simple template.%sqlselect * from ${table=defaultTableName} where text like '%${search}%'To learn more about dynamic form, checkout Dynamic Form.Interpreter setting optionInterpreter setting can choose one of 'shared'
 , 'scoped', 'isolated' option. Spark interpreter creates separate scala compiler per each notebook but share a single SparkContext in 'scoped' mode (experimental). It creates separate SparkContext per each notebook in 'isolated' mode.Setting up Zeppelin with KerberosLogical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:Configuration SetupOn the server that Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.This is to make the server communicate with KDC.Set SPARK_HOME in [ZEPPELIN_HOME]/conf/zeppelin-env.sh to use spark-submit(Additionally, you might have to set export HADOOP_CONF_DIR=/etc/hadoop/conf)Add the two properties below to spark configuration ([SPARK_HOME]/conf/spark-defaults.conf):spark.yarn.principalspark.yarn.keytabNOTE: If you do not have access to the above spark-defaults.conf file, optionally, you may add the lines to the Spark Inter
 preter through the Interpreter tab in the Zeppelin UI.That's it. Play with Zeppelin!",
+      "content"  : "<!--Licensed under the Apache License, Version 2.0 (the "License");you may not use this file except in compliance with the License.You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0Unless required by applicable law or agreed to in writing, softwaredistributed under the License is distributed on an "AS IS" BASIS,WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.See the License for the specific language governing permissions andlimitations under the License.-->Spark Interpreter for Apache ZeppelinOverviewApache Spark is a fast and general-purpose cluster computing system.It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.Apache Spark is supported in Zeppelin with Spark interpreter group which consists of below five interpreters.      Name    Class    Description        %spark    SparkInterpreter    Creates a SparkConte
 xt and provides a Scala environment        %spark.pyspark    PySparkInterpreter    Provides a Python environment        %spark.r    SparkRInterpreter    Provides an R environment with SparkR support        %spark.sql    SparkSQLInterpreter    Provides a SQL environment        %spark.dep    DepInterpreter    Dependency loader  ConfigurationThe Spark interpreter can be configured with properties provided by Zeppelin.You can also set other Spark properties which are not listed in the table. For a list of additional properties, refer to Spark Available Properties.      Property    Default    Description        args        Spark commandline args      master    local[*]    Spark master uri.  ex) spark://masterhost:7077      spark.app.name    Zeppelin    The name of spark application.        spark.cores.max        Total number of cores to use.  Empty value uses all available core.        spark.executor.memory     1g    Executor memory per worker instance.  ex) 512m, 32g        zeppelin.dep
 .additionalRemoteRepository    spark-packages,  http://dl.bintray.com/spark-packages/maven,  false;    A list of id,remote-repository-URL,is-snapshot;  for each remote repository.        zeppelin.dep.localrepo    local-repo    Local repository for dependency loader        zeppelin.pyspark.python    python    Python command to run pyspark with        zeppelin.spark.concurrentSQL    false    Execute multiple SQL concurrently if set true.        zeppelin.spark.maxResult    1000    Max number of Spark SQL result to display.        zeppelin.spark.printREPLOutput    true    Print REPL output        zeppelin.spark.useHiveContext    true    Use HiveContext instead of SQLContext if it is true.        zeppelin.spark.importImplicit    true    Import implicits, UDF collection, and sql if set true.  Without any configuration, Spark interpreter works out of box in local mode. But if you want to connect to your Spark cluster, you'll need to follow below two simple steps.1. Export SPARK_HOM
 EIn conf/zeppelin-env.sh, export SPARK_HOME environment variable with your Spark installation path.For example,export SPARK_HOME=/usr/lib/sparkYou can optionally export HADOOP_CONF_DIR and SPARK_SUBMIT_OPTIONSexport HADOOP_CONF_DIR=/usr/lib/hadoopexport SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0"For Windows, ensure you have winutils.exe in %HADOOP_HOME%bin. Please see Problems running Hadoop on Windows for the details.2. Set master in Interpreter menuAfter start Zeppelin, go to Interpreter menu and edit master property in your Spark interpreter setting. The value may vary depending on your Spark cluster deployment type.For example,local[*] in local modespark://master:7077 in standalone clusteryarn-client in Yarn client modemesos://host:5050 in Mesos clusterThat's it. Zeppelin will work with any version of Spark and any deployment type without rebuilding Zeppelin in this way. For the further information about Spark & Zeppeli
 n version compatibility, please refer to "Available Interpreters" section in Zeppelin download page.Note that without exporting SPARK_HOME, it's running in local mode with included version of Spark. The included version may vary depending on the build profile.SparkContext, SQLContext, SparkSession, ZeppelinContextSparkContext, SQLContext and ZeppelinContext are automatically created and exposed as variable names sc, sqlContext and z, respectively, in Scala, Python and R environments.Staring from 0.6.1 SparkSession is available as variable spark when you are using Spark 2.x.Note that Scala/Python/R environment shares the same SparkContext, SQLContext and ZeppelinContext instance. Dependency ManagementThere are two ways to load external libraries in Spark interpreter. First is using interpreter setting menu and second is loading Spark properties.1. Setting Dependencies via Interpreter SettingPlease see Dependency Management for the details.2. Loading Spark Pr
 opertiesOnce SPARK_HOME is set in conf/zeppelin-env.sh, Zeppelin uses spark-submit as spark interpreter runner. spark-submit supports two ways to load configurations. The first is command line options such as --master and Zeppelin can pass these options to spark-submit by exporting SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh. Second is reading configuration options from SPARK_HOME/conf/spark-defaults.conf. Spark properties that user can set to distribute libraries are:      spark-defaults.conf    SPARK_SUBMIT_OPTIONS    Description        spark.jars    --jars    Comma-separated list of local jars to include on the driver and executor classpaths.        spark.jars.packages    --packages    Comma-separated list of maven coordinates of jars to include on the driver and executor classpaths. Will search the local maven repo, then maven central and any additional remote repositories given by --repositories. The format for the coordinates should be groupId:artifactId:version.        spark
 .files    --files    Comma-separated list of files to be placed in the working directory of each executor.  Here are few examples:SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.shexport SPARK_SUBMIT_OPTIONS="--packages com.databricks:spark-csv_2.10:1.2.0 --jars /path/mylib1.jar,/path/mylib2.jar --files /path/mylib1.py,/path/mylib2.zip,/path/mylib3.egg"SPARK_HOME/conf/spark-defaults.confspark.jars        /path/mylib1.jar,/path/mylib2.jarspark.jars.packages   com.databricks:spark-csv_2.10:1.2.0spark.files       /path/mylib1.py,/path/mylib2.egg,/path/mylib3.zip3. Dynamic Dependency Loading via %spark.dep interpreterNote: %spark.dep interpreter is deprecated since v0.6.0.%spark.dep interpreter loads libraries to %spark and %spark.pyspark but not to  %spark.sql interpreter. So we recommend you to use the first option instead.When your code requires external library, instead of doing download/copy/restart Zeppelin, you can easily do following jobs using %spark.dep interpreter
 .Load libraries recursively from maven repositoryLoad libraries from local filesystemAdd additional maven repositoryAutomatically add libraries to SparkCluster (You can turn off)Dep interpreter leverages Scala environment. So you can write any Scala code here.Note that %spark.dep interpreter should be used before %spark, %spark.pyspark, %spark.sql.Here's usages.%spark.depz.reset() // clean up previously added artifact and repository// add maven repositoryz.addRepo("RepoName").url("RepoURL")// add maven snapshot repositoryz.addRepo("RepoName").url("RepoURL").snapshot()// add credentials for private maven repositoryz.addRepo("RepoName").url("RepoURL").username("username").password("password")// add artifact from filesystemz.load("/path/to.jar")// add artifact from maven repository, with no dependencyz.load("g
 roupId:artifactId:version").excludeAll()// add artifact recursivelyz.load("groupId:artifactId:version")// add artifact recursively except comma separated GroupID:ArtifactId listz.load("groupId:artifactId:version").exclude("groupId:artifactId,groupId:artifactId, ...")// exclude with patternz.load("groupId:artifactId:version").exclude(*)z.load("groupId:artifactId:version").exclude("groupId:artifactId:*")z.load("groupId:artifactId:version").exclude("groupId:*")// local() skips adding artifact to spark clusters (skipping sc.addJar())z.load("groupId:artifactId:version").local()ZeppelinContextZeppelin automatically injects ZeppelinContext as variable z in your Scala/Python environment. ZeppelinContext provides some additional functions and utilities.Object ExchangeZeppelinContext extends map and it's shared betwe
 en Scala and Python environment.So you can put some objects from Scala and read it from Python, vice versa.  // Put object from scala%sparkval myObject = ...z.put("objName", myObject)    # Get object from python%spark.pysparkmyObject = z.get("objName")  Form CreationZeppelinContext provides functions for creating forms.In Scala and Python environments, you can create forms programmatically.  %spark/* Create text input form */z.input("formName")/* Create text input form with default value */z.input("formName", "defaultValue")/* Create select form */z.select("formName", Seq(("option1", "option1DisplayName"),                         ("option2", "option2DisplayName")))/* Create select form with default value*/z.select("formName", "option1", Seq(("option1", 
 "option1DisplayName"),                                    ("option2", "option2DisplayName")))    %spark.pyspark# Create text input formz.input("formName")# Create text input form with default valuez.input("formName", "defaultValue")# Create select formz.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")])# Create select form with default valuez.select("formName", [("option1", "option1DisplayName"),                      ("option2", "option2DisplayName")], "option1")  In sql environment, you can create form in simple template.%spark.sqlselect * from ${table=defaultTableName} where text like '%${search}%'To lear
 n more about dynamic form, checkout Dynamic Form.Interpreter setting optionYou can choose one of shared, scoped and isolated options wheh you configure Spark interpreter. Spark interpreter creates separated Scala compiler per each notebook but share a single SparkContext in scoped mode (experimental). It creates separated SparkContext per each notebook in isolated mode.Setting up Zeppelin with KerberosLogical setup with Zeppelin, Kerberos Key Distribution Center (KDC), and Spark on YARN:Configuration SetupOn the server that Zeppelin is installed, install Kerberos client modules and configuration, krb5.conf.This is to make the server communicate with KDC.Set SPARK_HOME in [ZEPPELIN_HOME]/conf/zeppelin-env.sh to use spark-submit(Additionally, you might have to set export HADOOP_CONF_DIR=/etc/hadoop/conf)Add the two properties below to Spark configuration ([SPARK_HOME]/conf/spark-defaults.conf):spark.yarn.principalspark.yarn.keytabNOTE: If you do not have permission to access for the a
 bove spark-defaults.conf file, optionally, you can add the above lines to the Spark Interpreter setting through the Interpreter tab in the Zeppelin UI.That's it. Play with Zeppelin!",
       "url": " /interpreter/spark.html",
       "group": "interpreter",
-      "excerpt": "Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs."
+      "excerpt": "Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution engine."
     }
     ,
     
@@ -512,7 +523,7 @@
 
     "/rss.xml": {
       "title": "RSS Feed",
-      "content"  : "        Apache Zeppelin        Apache Zeppelin - The Apache Software Foundation        http://zeppelin.apache.org        http://zeppelin.apache.org        2016-09-12T06:27:44+02:00        2016-09-12T06:27:44+02:00        1800",
+      "content"  : "        Apache Zeppelin        Apache Zeppelin - The Apache Software Foundation        http://zeppelin.apache.org        http://zeppelin.apache.org        2016-09-26T20:27:17-07:00        2016-09-26T20:27:17-07:00        1800",
       "url": " /rss.xml",
       "group": "",
       "excerpt": ""


