svn commit: r1851877 — zeppelin/site/docs — Wed, 23 Jan 2019 04:28:05 -0000 — zjffdu@apache.org

Added: zeppelin/site/docs/0.8.1/interpreter/lens.html
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.8.1/interpreter/lens.html?rev=1851877&view=auto

Apache Zeppelin 0.8.0 Documentation: Lens Interpreter for Apache Zeppelin

Lens Interpreter for Apache Zeppelin


Overview


Apache Lens provides a Unified Analytics interface. Lens aims to cut across data-analytics silos by providing a single view of data stored in multiple tiered data stores and an optimal execution environment for analytical queries. It seamlessly integrates Hadoop with traditional data warehouses so they appear as one.

Apache Lens

Installing and Running Lens

To use the Lens interpreter, install Apache Lens in a few simple steps:
  1. Download the latest Lens release from the ASF; older releases can be found in the Archives.
  2. Before running Lens, set HIVE_HOME and HADOOP_HOME. If you want more information about this, please refer to here. Lens also provides a pseudo-distributed mode, set up using Docker, in which the Hive server and Hadoop daemons run as separate processes.
  3. Now you can start (or stop) the Lens server:

     ./bin/lens-ctl start # (or stop)

Configuring Lens Interpreter


In the "Interpreters" menu, you can edit the Lens interpreter or create a new one. Zeppelin provides the following properties for Lens.

Property Name | Value | Description
lens.client.dbname | default | The database schema name
lens.query.enable.persistent.resultset | false | Whether to enable persistent result sets for queries. When enabled, the server fetches results from the driver, applies any custom formatting, and stores them in a configured location. The file name of the query output is the query handle id, with the configured extension.
lens.server.base.url | http://hostname:port/lensapi | The base URL for the Lens server; replace "hostname" and "port" with your own (e.g. http://0.0.0.0:9999/lensapi)
lens.session.cluster.user | default | Hadoop cluster username
zeppelin.lens.maxResult | 1000 | Max number of rows to display
zeppelin.lens.maxThreads | 10 | Number of threads to use when zeppelin.lens.run.concurrent is true
zeppelin.lens.run.concurrent | true | Run concurrent Lens sessions
xxx | yyy | Anything else from Configuring lens server (https://lens.apache.org/admin/config-server.html)

Apache Lens Interpreter Setting


Interpreter Binding for Zeppelin Notebook


After configuring the Lens interpreter, create your own note; you can then bind interpreters as shown in the image below.

Zeppelin Notebook Interpreter Binding

For more information about interpreter binding, see here.

How to use

You can analyze your data using OLAP Cube QL, a high-level SQL-like language for querying and describing data sets organized in data cubes. You can get a feel for OLAP cubes from this video tutorial. As you can see in the video, it uses the Lens client shell (./bin/lens-cli.sh); all of these functions can also be used in Zeppelin via the Lens interpreter.
  • Create and use (switch) databases.

    create database newDb
    use newDb

  • Create a storage.

    create storage your/path/to/lens/client/examples/resources/db-storage.xml

  • Create dimensions; show their fields and join-chains.

    create dimension your/path/to/lens/client/examples/resources/customer.xml
    dimension show fields customer
    dimension show joinchains customer

  • Create cubes; show their fields and join-chains.

    create cube your/path/to/lens/client/examples/resources/sales-cube.xml
    cube show fields sales
    cube show joinchains sales

  • Create dimtables and facts.

    create dimtable your/path/to/lens/client/examples/resources/customer_table.xml
    create fact your/path/to/lens/client/examples/resources/sales-raw-fact.xml

  • Add partitions to the dimtable and fact.

    dimtable add single-partition --dimtable_name customer_table --storage_name local --path your/path/to/lens/client/examples/resources/customer-local-part.xml
    fact add partitions --fact_name sales_raw_fact --storage_name local --path your/path/to/lens/client/examples/resources/sales-raw-local-parts.xml

  • Now you can run queries on cubes.

    query execute cube select customer_city_name, product_details.description, product_details.category, product_details.color, store_sales from sales where time_range_in(delivery_time, '2015-04-11-00', '2015-04-13-00')
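To make the shape of such a query concrete, here is a small Python sketch. The `cube_query` helper is hypothetical (it is not part of Lens or Zeppelin); it just assembles the example query above from its parts, using the 'yyyy-MM-dd-HH' timestamp format shown in the example:

```python
# Hypothetical helper (not part of Lens) that assembles an OLAP Cube QL
# query like the sales example above.
def cube_query(columns, cube, time_column, start, end):
    select = ", ".join(columns)
    return (f"cube select {select} from {cube} "
            f"where time_range_in({time_column}, '{start}', '{end}')")

q = cube_query(
    ["customer_city_name", "product_details.description",
     "product_details.category", "product_details.color", "store_sales"],
    cube="sales", time_column="delivery_time",
    start="2015-04-11-00", end="2015-04-13-00")
print(q)  # prints the same query string as in the example above
```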

Lens Query Result

These are just examples provided out of the box by Lens. To explore the full set of Lens tutorials, see the tutorial video.

Lens UI Service

Lens also provides a web UI service. Once the server starts up, open http://serverhost:19999/index.html in a browser; there you can inspect the structures you created and easily run queries.

    Lens UI Service

Added: zeppelin/site/docs/0.8.1/interpreter/livy.html
URL: http://svn.apache.org/viewvc/zeppelin/site/docs/0.8.1/interpreter/livy.html?rev=1851877&view=auto

    Livy Interpreter for Apache Zeppelin


    Overview


    Livy is an open source REST interface for interacting with Spark from anywhere. It supports executing snippets of code or programs in a Spark context that runs locally or in YARN.

    • Interactive Scala, Python and R shells
    • Batch submissions in Scala, Java, Python
    • Multiple users can share the same server (impersonation support)
    • Can be used for submitting jobs from anywhere with REST
    • Does not require any code change to your programs

    Requirements


    Additional requirements for the Livy interpreter are:

    • Spark 1.3 or above.
    • Livy server.

    Configuration


    We have added some common Spark configurations, and you can set any configuration you want. You can find all Spark configurations here. Instead of prefixing a property with spark., prefix it with livy.spark.; for example, spark.driver.memory becomes livy.spark.driver.memory.
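As a quick illustration of this renaming rule, here is a short Python sketch (illustrative only, not Zeppelin's own code):

```python
# Map a standard spark.* configuration key to the livy.spark.* form that the
# Livy interpreter expects, per the prefix rule described above.
def to_livy_property(spark_property):
    if not spark_property.startswith("spark."):
        raise ValueError(f"not a Spark property: {spark_property!r}")
    return "livy." + spark_property

print(to_livy_property("spark.driver.memory"))  # livy.spark.driver.memory
```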

    Property | Default | Description
    zeppelin.livy.url | http://localhost:8998 | URL where the Livy server is running
    zeppelin.livy.spark.sql.maxResult | 1000 | Max number of Spark SQL result rows to display
    zeppelin.livy.spark.sql.field.truncate | true | Whether to truncate field values longer than 20 characters
    zeppelin.livy.session.create_timeout | 120 | Timeout in seconds for session creation
    zeppelin.livy.displayAppInfo | true | Whether to display app info
    zeppelin.livy.pull_status.interval.millis | 1000 | The interval for checking paragraph execution status
    livy.spark.driver.cores | | Driver cores, e.g. 1, 2
    livy.spark.driver.memory | | Driver memory, e.g. 512m, 32g
    livy.spark.executor.instances | | Executor instances, e.g. 1, 4
    livy.spark.executor.cores | | Cores per executor, e.g. 1, 4
    livy.spark.executor.memory | | Executor memory per worker instance, e.g. 512m, 32g
    livy.spark.dynamicAllocation.enabled | | Use dynamic resource allocation, e.g. true, false
    livy.spark.dynamicAllocation.cachedExecutorIdleTimeout | | Remove an executor which has cached data blocks
    livy.spark.dynamicAllocation.minExecutors | | Lower bound for the number of executors
    livy.spark.dynamicAllocation.initialExecutors | | Initial number of executors to run
    livy.spark.dynamicAllocation.maxExecutors | | Upper bound for the number of executors
    livy.spark.jars.packages | | Adding extra libraries to the Livy interpreter
    zeppelin.livy.ssl.trustStore | | Client trustStore file. Used when Livy SSL is enabled
    zeppelin.livy.ssl.trustStorePassword | | Password for the trustStore file. Used when Livy SSL is enabled
    zeppelin.livy.http.headers | key_1: value_1; key_2: value_2 | Custom HTTP headers for calls to the Livy REST API. Headers are separated by `;`, and each header is a key-value pair separated by `:`
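The zeppelin.livy.http.headers format can be sketched in Python. The parser below is illustrative only (it is not Zeppelin's implementation); it simply applies the `;`/`:` separator rule described in the table:

```python
# Parse a "key_1: value_1; key_2: value_2" style header string into a dict,
# splitting headers on ';' and each key/value pair on the first ':'.
def parse_http_headers(value):
    headers = {}
    for pair in value.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, val = pair.partition(":")
        headers[key.strip()] = val.strip()
    return headers

print(parse_http_headers("key_1: value_1; key_2: value_2"))
# {'key_1': 'value_1', 'key_2': 'value_2'}
```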

    We removed livy.spark.master in Zeppelin 0.7, because we suggest users run Livy 0.3 with Zeppelin 0.7, and Livy 0.3 does not allow livy.spark.master to be specified: it enforces yarn-cluster mode.


    Adding External libraries


    You can load dynamic libraries into the Livy interpreter by setting the livy.spark.jars.packages property to a comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths. The coordinate format is groupId:artifactId:version.


    Example

    Property | Example | Description
    livy.spark.jars.packages | io.spray:spray-json_2.10:1.3.1 | Adding extra libraries to the Livy interpreter
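A minimal Python sketch of the expected coordinate format (illustrative only; Zeppelin and Livy do their own validation):

```python
# Split a comma-separated livy.spark.jars.packages value into
# (groupId, artifactId, version) tuples, rejecting malformed coordinates.
def parse_packages(value):
    coords = []
    for coord in value.split(","):
        parts = coord.strip().split(":")
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"bad Maven coordinate: {coord!r}")
        coords.append(tuple(parts))
    return coords

print(parse_packages("io.spray:spray-json_2.10:1.3.1"))
# [('io.spray', 'spray-json_2.10', '1.3.1')]
```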

    How to use

    Basically, you can use

    spark

    %livy.spark
    sc.version

    pyspark

    %livy.pyspark
    print "1"

    sparkR

    %livy.sparkr
    hello <- function( name ) {
        sprintf( "Hello, %s", name );
    }

    hello("livy")

    Impersonation


    When the Zeppelin server is running with authentication enabled, this interpreter utilizes Livy's user impersonation feature, i.e. it sends an extra parameter when creating and running a session ("proxyUser": "${loggedInUser}"). This is particularly useful when multiple users share a notebook server.
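A sketch of what this amounts to on the wire, assuming Livy's session-creation JSON body (fields other than kind and proxyUser are omitted for brevity; this is not Zeppelin's own code):

```python
import json

# Build a minimal Livy session-creation payload. When authentication is
# enabled, Zeppelin adds the logged-in user as "proxyUser".
def session_payload(kind, logged_in_user=None):
    payload = {"kind": kind}
    if logged_in_user:
        payload["proxyUser"] = logged_in_user
    return payload

print(json.dumps(session_payload("spark", "alice")))
# {"kind": "spark", "proxyUser": "alice"}
```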


    Apply Zeppelin Dynamic Forms


    You can leverage Zeppelin Dynamic Forms. Form templates are only available for the Livy SQL interpreter.

    %livy.sql
    select * from products where ${product_id=1}

    Creating dynamic forms programmatically is not feasible in the Livy interpreter, because ZeppelinContext is not available there.


    Shared SparkContext


    Starting from Livy 0.5, which is supported by Zeppelin 0.8.0, the SparkContext is shared between Scala, Python, R and SQL. That means you can query a table via %livy.sql when that table is registered in %livy.spark, %livy.pyspark or %livy.sparkr.


    FAQ


    Livy debugging: if you see any of these in the error console:

    Connect to livyhost:8998 [livyhost/127.0.0.1, livyhost/0:0:0:0:0:0:0:1] failed: Connection refused

    It looks like the Livy server is not up yet, or the configuration is wrong.

    Exception: Session not found, Livy server would have restarted, or lost session.

    The session has probably timed out; you may need to restart the interpreter.

    Blacklisted configuration values in session config: spark.master

    Edit the conf/spark-blacklist.conf file on the Livy server and comment out the spark.master line.

    If you choose to work on Livy in the apps/spark/java directory of https://github.com/cloudera/hue, copy spark-user-configurable-options.template to spark-user-configurable-options.conf on the Livy server and comment out the spark.master line.