zeppelin-commits mailing list archives

From zjf...@apache.org
Subject [zeppelin] branch master updated: ZEPPELIN-4437. Update python document
Date Mon, 23 Dec 2019 06:03:14 GMT
This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/master by this push:
     new 1a6bce6  ZEPPELIN-4437. Update python document
1a6bce6 is described below

commit 1a6bce627abfcd3d6ce0100665f2b444ae2d1fcc
Author: Jeff Zhang <zjffdu@apache.org>
AuthorDate: Fri Nov 8 17:22:43 2019 +0800

    ZEPPELIN-4437. Update python document
    
    ### What is this PR for?
    
    This PR is to polish the python interpreter document.
    
    ### What type of PR is it?
    [Documentation]
    
    ### Todos
    * [ ] - Task
    
    ### What is the Jira issue?
    * https://issues.apache.org/jira/browse/ZEPPELIN-4437
    
    ### How should this be tested?
    * CI pass
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need an update? No
    * Are there breaking changes for older versions? No
    * Does this need documentation? No
    
    Author: Jeff Zhang <zjffdu@apache.org>
    
    Closes #3538 from zjffdu/ZEPPELIN-4437 and squashes the following commits:
    
    48163d089 [Jeff Zhang] ZEPPELIN-4437. Update python document
---
 .../img/docs-img/ipython_code_completion.png       | Bin 0 -> 56915 bytes
 .../themes/zeppelin/img/docs-img/ipython_error.png | Bin 0 -> 57506 bytes
 .../zeppelin/img/docs-img/ipython_hvplot.png       | Bin 0 -> 293938 bytes
 docs/interpreter/python.md                         | 462 +++++++++++++--------
 .../src/main/resources/python/zeppelin_context.py  |  12 +-
 5 files changed, 296 insertions(+), 178 deletions(-)

diff --git a/docs/assets/themes/zeppelin/img/docs-img/ipython_code_completion.png b/docs/assets/themes/zeppelin/img/docs-img/ipython_code_completion.png
new file mode 100644
index 0000000..75a642f
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/ipython_code_completion.png differ
diff --git a/docs/assets/themes/zeppelin/img/docs-img/ipython_error.png b/docs/assets/themes/zeppelin/img/docs-img/ipython_error.png
new file mode 100644
index 0000000..8747969
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/ipython_error.png differ
diff --git a/docs/assets/themes/zeppelin/img/docs-img/ipython_hvplot.png b/docs/assets/themes/zeppelin/img/docs-img/ipython_hvplot.png
new file mode 100644
index 0000000..b5b6dfe
Binary files /dev/null and b/docs/assets/themes/zeppelin/img/docs-img/ipython_hvplot.png differ
diff --git a/docs/interpreter/python.md b/docs/interpreter/python.md
index 82280ac..6bb7f29 100644
--- a/docs/interpreter/python.md
+++ b/docs/interpreter/python.md
@@ -23,6 +23,34 @@ limitations under the License.
 
 <div id="toc"></div>
 
+## Overview
+
+Zeppelin supports the Python language, which is very popular in data analytics and machine learning.
+
+<table class="table-configuration">
+  <tr>
+    <th>Name</th>
+    <th>Class</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>%python</td>
+    <td>PythonInterpreter</td>
+    <td>Vanilla Python interpreter with the fewest dependencies; only an installed Python environment is required</td>
+  </tr>
+  <tr>
+    <td>%python.ipython</td>
+    <td>IPythonInterpreter</td>
+    <td>Provides a richer Python runtime via IPython, with almost the same experience as Jupyter. It has more prerequisites, but is the recommended interpreter for using Python in Zeppelin; see below</td>
+  </tr>
+  <tr>
+    <td>%python.sql</td>
+    <td>PythonInterpreterPandasSql</td>
+    <td>Provides SQL capability to query data in Pandas DataFrames via <code>pandasql</code></td>
+  </tr>
+</table>
+
+
 ## Configuration
 <table class="table-configuration">
   <tr>
@@ -33,8 +61,8 @@ limitations under the License.
   <tr>
     <td>zeppelin.python</td>
     <td>python</td>
-    <td>Path of the already installed Python binary (could be python2 or python3).
-    If python is not in your $PATH you can set the absolute directory (example : /usr/bin/python)
+    <td>Path of the installed Python binary (could be python2 or python3).
+    You should set this property explicitly if python is not in your <code>$PATH</code> (example: /usr/bin/python).
     </td>
   </tr>
   <tr>
@@ -42,139 +70,35 @@ limitations under the License.
     <td>1000</td>
     <td>Max number of dataframe rows to display.</td>
   </tr>
+  <tr>
+    <td>zeppelin.python.useIPython</td>
+    <td>true</td>
+    <td>When this property is true, <code>%python</code> is delegated to <code>%python.ipython</code> if IPython is available; otherwise
+    IPython is only used in <code>%python.ipython</code>.
+    </td>
+  </tr>
 </table>
 
-## Enabling Python Interpreter
-
-In a notebook, to enable the **Python** interpreter, click on the **Gear** icon and select
**Python**
-
-## Using the Python Interpreter
-
-In a paragraph, use **_%python_** to select the **Python** interpreter and then input all
commands.
-
-The interpreter can only work if you already have python installed (the interpreter doesn't
bring it own python binaries).
-
-To access the help, type **help()**
-
-## Python environments
-
-### Default
-By default, PythonInterpreter will use python command defined in `zeppelin.python` property
to run python process.
-The interpreter can use all modules already installed (with pip, easy_install...)
-
-### Conda
-[Conda](http://conda.pydata.org/) is an package management system and environment management
system for python.
-`%python.conda` interpreter lets you change between environments.
-
-#### Usage
-
-- get the Conda Infomation: 
-
-    ```
-    %python.conda info
-    ```
-    
-- list the Conda environments: 
-
-    ```
-    %python.conda env list
-    ```
-
-- create a conda enviornment: 
-
-    ```
-    %python.conda create --name [ENV NAME]
-    ```
-    
-- activate an environment (python interpreter will be restarted): 
-
-    ```
-    %python.conda activate [ENV NAME]
-    ```
-
-- deactivate
-
-    ```
-    %python.conda deactivate
-    ```
-    
-- get installed package list inside the current environment
-
-    ```
-    %python.conda list
-    ```
-    
-- install package
-
-    ```
-    %python.conda install [PACKAGE NAME]
-    ```
-  
-- uninstall package
-  
-    ```
-    %python.conda uninstall [PACKAGE NAME]
-    ```
-
-### Docker
-
-`%python.docker` interpreter allows PythonInterpreter creates python process in a specified
docker container.
 
-#### Usage
+## Vanilla Python Interpreter (`%python`)
 
-- activate an environment
-
-    ```
-    %python.docker activate [Repository]
-    %python.docker activate [Repository:Tag]
-    %python.docker activate [Image Id]
-    ```
-
-- deactivate
-
-    ```
-    %python.docker deactivate
-    ```
-
-<br/>
-Here is an example
-
-```
-# activate latest tensorflow image as a python environment
-%python.docker activate gcr.io/tensorflow/tensorflow:latest
-```
-
-## Using Zeppelin Dynamic Forms
-You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/usage/dynamic_form/intro.html) inside
your Python code.
-
-**Zeppelin Dynamic Form can only be used if py4j Python library is installed in your system.
If not, you can install it with `pip install py4j`.**
-
-Example : 
-
-```python
-%python
-### Input form
-print (z.input("f1","defaultValue"))
-
-### Select form
-print (z.select("f1",[("o1","1"),("o2","2")],"2"))
-
-### Checkbox form
-print("".join(z.checkbox("f3", [("o1","1"), ("o2","2")],["1"])))
-```
+The vanilla Python interpreter provides basic Python interpreter features; only an installed Python is required.
 
-## Matplotlib integration
+### Matplotlib integration
 
- The python interpreter can display matplotlib figures inline automatically using the `pyplot`
module:
+The vanilla Python interpreter can display matplotlib figures inline automatically using the `matplotlib.pyplot` module:
  
 ```python
 %python
+
 import matplotlib.pyplot as plt
 plt.plot([1, 2, 3])
 ```
-This is the recommended method for using matplotlib from within a Zeppelin notebook. The
output of this command will by default be converted to HTML by implicitly making use of the
`%html` magic. Additional configuration can be achieved using the builtin `z.configure_mpl()`
method. For example, 
+
+The output of this command will by default be converted to HTML by implicitly making use of the `%html` magic. Additional configuration can be achieved using the builtin `z.configure_mpl()` method. For example,
 
 ```python
+
 z.configure_mpl(width=400, height=300, fmt='svg')
 plt.plot([1, 2, 3])
 ```
@@ -191,6 +115,7 @@ If you are unable to load the inline backend, use `z.show(plt)`:
 
 ```python
 %python
+
 import matplotlib.pyplot as plt
 plt.figure()
 (.. ..)
@@ -201,20 +126,88 @@ The `z.show()` function can take optional parameters to adapt graph dimensions (
 
  ```python
 %python
+
 z.show(plt, width='50px')
 z.show(plt, height='150px', fmt='svg')
 ```
 <img class="img-responsive" src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/pythonMatplotlib.png"
/>
 
 
+
+## IPython Interpreter (`%python.ipython`) (recommended)
+
+IPython is more powerful than the vanilla Python interpreter and offers extra functionality. You can use IPython with Python 2 or Python 3, depending on which python you set in `zeppelin.python`.
+
+For a non-Anaconda environment
+
+   **Prerequisites**
+   
+    - Jupyter `pip install jupyter`
+    - grpcio `pip install grpcio`
+    - protobuf `pip install protobuf`
+
+For an Anaconda environment (`zeppelin.python` points to the python under Anaconda)
+
+   **Prerequisites**
+   
+    - grpcio `pip install grpcio`
+    - protobuf `pip install protobuf`
+
+In addition to all the basic functions of the vanilla Python interpreter, you can use all of IPython's advanced features just as you would in Jupyter Notebook.
+
+e.g. 
+
+### Use IPython magic
+
+```
+%python.ipython
+
+#python help
+range?
+
+#timeit
+%timeit range(100)
+```
+
+### Use matplotlib 
+
+```
+%python.ipython
+
+%matplotlib inline
+import matplotlib.pyplot as plt
+
+print("hello world")
+data=[1,2,3,4]
+plt.figure()
+plt.plot(data)
+```
+
+### Colored text output
+
+<img class="img-responsive" src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ipython_error.png" />
+
+### More types of visualization
+e.g. IPython supports hvplot
+<img class="img-responsive" src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ipython_hvplot.png" />
+
+### Better code completion
+<img class="img-responsive" src="{{BASE_PATH}}/assets/themes/zeppelin/img/docs-img/ipython_code_completion.png" />
+
+
+By default, Zeppelin uses IPython in `%python` if the IPython prerequisites are met; otherwise it uses the vanilla Python interpreter in `%python`.
+If you don't want to use IPython via `%python`, you can set `zeppelin.python.useIPython` to `false` in the interpreter setting.
+
+
 ## Pandas integration
 Apache Zeppelin [Table Display System](../usage/display_system/basic.html#table) provides
built-in data visualization capabilities. 
-Python interpreter leverages it to visualize Pandas DataFrames though similar `z.show()`
API, 
-same as with [Matplotlib integration](#matplotlib-integration).
+The Python interpreter leverages it to visualize Pandas DataFrames through a similar `z.show()` API, same as with [Matplotlib integration](#matplotlib-integration).
 
 Example:
 
 ```python
+%python
+
 import pandas as pd
 rates = pd.read_csv("bank.csv", sep=";")
 z.show(rates)
@@ -226,16 +219,18 @@ There is a convenience `%python.sql` interpreter that matches Apache Spark exper
 enables usage of SQL language to query [Pandas DataFrames](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.html)
and 
 visualization of results though built-in [Table Display System](../usage/display_system/basic.html#table).
 
- **Pre-requests**
+ **Prerequisites**
 
   - Pandas `pip install pandas`
   - PandaSQL `pip install -U pandasql`
 
-In case default binded interpreter is Python (first in the interpreter list, under the _Gear
Icon_), you can just use it as `%sql` i.e
+Here's one example:
 
  - first paragraph
 
   ```python
+%python
+
 import pandas as pd
 rates = pd.read_csv("bank.csv", sep=";")
   ```
@@ -243,88 +238,211 @@ rates = pd.read_csv("bank.csv", sep=";")
  - next paragraph
 
   ```sql
-%sql
+%python.sql
+
 SELECT * FROM rates WHERE age < 40
   ```
 
-Otherwise it can be referred to as `%python.sql`
 
+## Using Zeppelin Dynamic Forms
+You can leverage [Zeppelin Dynamic Form]({{BASE_PATH}}/usage/dynamic_form/intro.html) inside your Python code.
 
-## IPython Support
+Example : 
 
-IPython is more powerful than the default python interpreter with extra functionality. You
can use IPython with Python2 or Python3 which depends on which python you set `zeppelin.python`.
+```python
+%python
 
-   **Pre-requests**
-   
-    - Jupyter `pip install jupyter`
-    - grpcio `pip install grpcio`
-    - protobuf `pip install protobuf`
+### Input form
+print(z.input("f1","defaultValue"))
 
-If you already install anaconda, then you just need to install `grpcio` as Jupyter is already
included in anaconda. For grpcio version >= 1.12.0 you'll also need to install protobuf
separately.
+### Select form
+print(z.select("f2",[("o1","1"),("o2","2")],"o1"))
 
-In addition to all basic functions of the python interpreter, you can use all the IPython
advanced features as you use it in Jupyter Notebook.
+### Checkbox form
+print("".join(z.checkbox("f3", [("o1","1"), ("o2","2")],["o1"])))
+```
 
-e.g. 
+## ZeppelinContext API
 
-Use IPython magic
+The Python interpreter creates a variable `z` that represents the `ZeppelinContext` for you. You can use it to do more advanced things in Zeppelin.
 
-```
-%python.ipython
+<table class="table-configuration">
+  <tr>
+    <th>API</th>
+    <th>Description</th>
+  </tr>
+  <tr>
+    <td>z.put(key, value)</td>
+    <td>Put object <code>value</code> with identifier <code>key</code> into the distributed resource pool of Zeppelin,
+    so that it can be used by other interpreters</td>
+  </tr>
+  <tr>
+    <td>z.get(key)</td>
+    <td>Get object with identifier <code>key</code> from distributed resource pool of Zeppelin</td>
+  </tr>
+  <tr>
+    <td>z.remove(key)</td>
+    <td>Remove object with identifier <code>key</code> from distributed resource pool of Zeppelin</td>
+  </tr>
+  <tr>
+    <td>z.getAsDataFrame(key)</td>
+    <td>Get the object with identifier <code>key</code> from the distributed resource pool of Zeppelin and convert it into a pandas DataFrame.
+    The object in the distributed resource pool must be of table type, e.g. a jdbc interpreter result.
+    </td>
+  </tr>
+  <tr>
+    <td>z.angular(name, noteId = None, paragraphId = None)</td>
+    <td>Get the angular object with identifier <code>name</code></td>
+  </tr>
+  <tr>
+    <td>z.angularBind(name, value, noteId = None, paragraphId = None)</td>
+    <td>Bind value to angular object with identifier <code>name</code></td>
+  </tr>
+  <tr>
+    <td>z.angularUnbind(name, noteId = None)</td>
+    <td>Unbind value from angular object with identifier <code>name</code></td>
+  </tr>
+  <tr>
+    <td>z.show(p)</td>
+    <td>Show Python object <code>p</code> in Zeppelin. If it is a pandas DataFrame, it is displayed in Zeppelin's table format;
+    other objects are converted to strings</td>
+  </tr>  
+  <tr>
+    <td>z.textbox(name, defaultValue="")</td>
+    <td>Create dynamic form Textbox <code>name</code> with defaultValue</td>
+  </tr>
+  <tr>
+    <td>z.select(name, options, defaultValue="")</td>
+    <td>Create dynamic form Select <code>name</code> with options and defaultValue. options should be a list of tuples (the first element is the key,
+    the second element is the displayed value), e.g. <code>z.select("f2",[("o1","1"),("o2","2")],"o1")</code></td>
+  </tr>
+  <tr>
+    <td>z.checkbox(name, options, defaultChecked=[])</td>
+    <td>Create dynamic form Checkbox <code>name</code> with options and defaultChecked. options should be a list of tuples (the first element is the key,
+    the second element is the displayed value), e.g. <code>z.checkbox("f3", [("o1","1"), ("o2","2")],["o1"])</code></td>
+  </tr>
+  <tr>
+    <td>z.noteTextbox(name, defaultValue="")</td>
+    <td>Create note level dynamic form Textbox</td>
+  </tr>
+  <tr>
+    <td>z.noteSelect(name, options, defaultValue="")</td>
+    <td>Create note level dynamic form Select</td>
+  </tr>
+  <tr>
+    <td>z.noteCheckbox(name, options, defaultChecked=[])</td>
+    <td>Create note level dynamic form Checkbox</td>
+  </tr>
+  <tr>
+    <td>z.run(paragraphId)</td>
+    <td>Run paragraph</td>
+  </tr>
+  <tr>
+    <td>z.run(noteId, paragraphId)</td>
+    <td>Run paragraph</td>
+  </tr>
+  <tr>
+    <td>z.runNote(noteId)</td>
+    <td>Run the whole note</td>
+  </tr>
+</table>
 
-#python help
-range?
+## Python environments
 
-#timeit
-%timeit range(100)
-```
+### Default
+By default, PythonInterpreter uses the python command defined in the `zeppelin.python` property to run the Python process.
+The interpreter can use all modules already installed (with pip, easy_install...)
+
+### Conda
+[Conda](http://conda.pydata.org/) is a package management system and environment management system for Python.
+`%python.conda` interpreter lets you change between environments.
 
-Use matplotlib 
+#### Usage
 
-```
-%python.ipython
+- get the Conda Information: 
 
+    ```
+    %python.conda info
+    ```
+    
+- list the Conda environments: 
 
-%matplotlib inline
-import matplotlib.pyplot as plt
+    ```
+    %python.conda env list
+    ```
 
-print("hello world")
-data=[1,2,3,4]
-plt.figure()
-plt.plot(data)
-```
+- create a conda environment: 
+
+    ```
+    %python.conda create --name [ENV NAME]
+    ```
+    
+- activate an environment (python interpreter will be restarted): 
 
-We also make `ZeppelinContext` available in IPython Interpreter. You can use `ZeppelinContext`
to create dynamic forms and display pandas DataFrame.
+    ```
+    %python.conda activate [ENV NAME]
+    ```
 
-e.g.
+- deactivate
 
-Create dynamic form
+    ```
+    %python.conda deactivate
+    ```
+    
+- get installed package list inside the current environment
 
-```
-z.input(name='my_name', defaultValue='hello')
-```
+    ```
+    %python.conda list
+    ```
+    
+- install package
 
-Show pandas dataframe
+    ```
+    %python.conda install [PACKAGE NAME]
+    ```
+  
+- uninstall package
+  
+    ```
+    %python.conda uninstall [PACKAGE NAME]
+    ```
 
-```
-import pandas as pd
-df = pd.DataFrame({'id':[1,2,3], 'name':['a','b','c']})
-z.show(df)
+### Docker
 
-```
+The `%python.docker` interpreter allows PythonInterpreter to create a Python process in a specified Docker container.
 
-By default, we would use IPython in `%python.python` if IPython is available. Otherwise it
would fall back to the original Python implementation.
-If you don't want to use IPython, then you can set `zeppelin.python.useIPython` as `false`
in interpreter setting.
+#### Usage
+
+- activate an environment
+
+    ```
+    %python.docker activate [Repository]
+    %python.docker activate [Repository:Tag]
+    %python.docker activate [Image Id]
+    ```
+
+- deactivate
+
+    ```
+    %python.docker deactivate
+    ```
+
+<br/>
+Here is an example
+
+```
+# activate latest tensorflow image as a python environment
+%python.docker activate gcr.io/tensorflow/tensorflow:latest
+```
 
 ## Technical description
 
 For in-depth technical details on current implementation please refer to [python/README.md](https://github.com/apache/zeppelin/blob/master/python/README.md).
 
 
-### Some features not yet implemented in the Python Interpreter
+## Some features not yet implemented in the vanilla Python interpreter
 
 * Interrupt a paragraph execution (`cancel()` method) is currently only supported in Linux
and MacOs. 
 If interpreter runs in another operating system (for instance MS Windows) , interrupt a paragraph
will close the whole interpreter. 
 A JIRA ticket ([ZEPPELIN-893](https://issues.apache.org/jira/browse/ZEPPELIN-893)) is opened
to implement this feature in a next release of the interpreter.
 * Progression bar in webUI  (`getProgress()` method) is currently not implemented.
-* Code-completion is currently not implemented.
-
diff --git a/python/src/main/resources/python/zeppelin_context.py b/python/src/main/resources/python/zeppelin_context.py
index 0eb02db..b0cdadc 100644
--- a/python/src/main/resources/python/zeppelin_context.py
+++ b/python/src/main/resources/python/zeppelin_context.py
@@ -66,12 +66,12 @@ class PyZeppelinContext(object):
             print("fail to call getAsDataFrame as pandas is not installed")
         return pd.read_csv(StringIO(value), sep="\t")
 
-    def angular(self, key, noteId = None, paragraphId = None):
-        return self.z.angular(key, noteId, paragraphId)
-
     def remove(self, key):
         self.z.remove(key)
 
+    def angular(self, key, noteId = None, paragraphId = None):
+        return self.z.angular(key, noteId, paragraphId)
+
     def contains(self, key):
         return self.contains(key)
 
@@ -120,11 +120,11 @@ class PyZeppelinContext(object):
     def runAll(self):
         return self.z.runAll()
 
-    def angular(self, key, noteId = None, paragraphId = None):
+    def angular(self, name, noteId = None, paragraphId = None):
         if noteId == None:
-            return self.z.angular(key, self.z.getInterpreterContext().getNoteId(), paragraphId)
+            return self.z.angular(name, self.z.getInterpreterContext().getNoteId(), paragraphId)
         else:
-            return self.z.angular(key, noteId, paragraphId)
+            return self.z.angular(name, noteId, paragraphId)
 
     def angularBind(self, name, value, noteId = None, paragraphId = None):
         if noteId == None:

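Note on the `zeppelin_context.py` hunks above: after this change the class body still contains two `angular` definitions, and the context lines show `contains` calling itself. The sketch below illustrates what plain Python does in both situations. It uses a hypothetical stand-in class (`StandInContext`, with a made-up `current_note_id` attribute and a dict in place of the Java-backed `self.z`), not the real `PyZeppelinContext`. Whether `contains` was meant to delegate to `self.z` is an assumption the diff does not confirm.

```python
class StandInContext(object):
    """Hypothetical stand-in for PyZeppelinContext; `z` is a plain dict here."""

    def __init__(self, z, current_note_id):
        self.z = z
        self.current_note_id = current_note_id  # assumed helper attribute

    # Shaped like the first `angular` in the diff: passes noteId through unchanged.
    def angular(self, key, noteId=None, paragraphId=None):
        return ("first", key, noteId, paragraphId)

    # As written in the context lines: calls itself and never reaches self.z.
    def contains(self, key):
        return self.contains(key)

    # Shaped like the second `angular` in the diff: rebinds the name,
    # so the first definition is no longer reachable.
    def angular(self, name, noteId=None, paragraphId=None):
        if noteId is None:
            noteId = self.current_note_id
        return ("second", name, noteId, paragraphId)


ctx = StandInContext(z={"a": 1}, current_note_id="note-1")

# Only the later `angular` survives in the class namespace.
angular_result = ctx.angular("x")

# The self-referential `contains` exhausts the recursion limit.
try:
    ctx.contains("a")
    contains_outcome = "returned"
except RecursionError:
    contains_outcome = "RecursionError"

print(angular_result, contains_outcome)
```

If the same holds for the real class, the relocated first `angular` is dead code, and any call to `contains` would fail rather than query the resource pool.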
