hawq-dev mailing list archives

From dyozie <...@git.apache.org>
Subject [GitHub] incubator-hawq-docs pull request #77: HAWQ-1216 - clean up plpython docs
Date Wed, 04 Jan 2017 00:32:40 GMT
Github user dyozie commented on a diff in the pull request:

    https://github.com/apache/incubator-hawq-docs/pull/77#discussion_r94512620
  
    --- Diff: plext/using_plpython.html.md.erb ---
    @@ -2,374 +2,608 @@
     title: Using PL/Python in HAWQ
     ---
     
    -This section contains an overview of the HAWQ PL/Python language extension.
    +This section provides an overview of the HAWQ PL/Python procedural language extension.
     
     ## <a id="abouthawqplpython"></a>About HAWQ PL/Python 
     
    -PL/Python is a loadable procedural language. With the HAWQ PL/Python extension, you can
write HAWQ user-defined functions in Python that take advantage of Python features and modules
to quickly build robust database applications.
    +PL/Python is embedded in your HAWQ product distribution or within your HAWQ build if
you chose to enable it as a build option. 
    +
    +With the HAWQ PL/Python extension, you can write user-defined functions in Python that
take advantage of Python features and modules, enabling you to quickly build robust HAWQ database
applications.
     
     HAWQ uses the system Python installation.
     
     ### <a id="hawqlimitations"></a>HAWQ PL/Python Limitations 
     
    -- HAWQ does not support PL/Python triggers.
    +- HAWQ does not support PL/Python trigger functions.
     - PL/Python is available only as a HAWQ untrusted language.
      
     ## <a id="enableplpython"></a>Enabling and Removing PL/Python Support 
     
    -To use PL/Python in HAWQ, you must either use a pre-compiled version of HAWQ that includes
PL/Python or specify PL/Python as a build option when compiling HAWQ.
    +To use PL/Python in HAWQ, you must either install a binary version of HAWQ that includes
PL/Python or specify PL/Python as a build option when compiling HAWQ from source.
    +
    +PL/Python user-defined functions (UDFs) are registered at the database level. To create
and run a PL/Python UDF on a database, you must register the PL/Python language with the database.

    +
    +On every database to which you want to install and enable PL/Python:
    +
    +1. Connect to the database using the `psql` client:
    +
    +    ``` shell
    +    $ psql -d <dbname>
    +    ```
    +
    +    Replace \<dbname\> with the name of the target database.
    +
    +2. Run the following SQL command to register the PL/Python procedural language; you must
be a database superuser to register new languages:
    +
    +    ``` sql
    +    dbname=# CREATE LANGUAGE plpythonu;
    +    ```
     
    -To create and run a PL/Python user-defined function (UDF) in a database, you must register
the PL/Python language with the database. On every database where you want to install and
enable PL/Python, connect to the database using the `psql` client.
    +    **Note**: `plpythonu` is installed as an *untrusted* language; it offers no way of
restricting what you can program in UDFs created with the language.
     
    -```shell
    -$ psql -d <dbname>
    +To remove support for `plpythonu` from a database, run the following SQL command; you
must be a database superuser to remove a registered procedural language:
    +
    +``` sql
    +dbname=# DROP LANGUAGE plpythonu;
     ```
     
    -Replace \<dbname\> with the name of the target database.
    +## <a id="developfunctions"></a>Developing Functions with PL/Python 
    +
    +PL/Python functions are defined using the standard SQL [CREATE FUNCTION](../reference/sql/CREATE-FUNCTION.html)
syntax.
    +
    +The body of a PL/Python user-defined function is a Python script. When the function is
called, its arguments are passed as elements of the array `args[]`. You can also pass named
arguments as ordinary variables to the Python script. 
     
    -Then, run the following SQL command:
    +PL/Python function results are returned with a `return` statement, or a `yield` statement
in the case of a result-set statement.
     
    -```shell
    -psql# CREATE LANGUAGE plpythonu;
    +The following PL/Python function computes and returns the maximum of two integers:
    +
    +``` sql
    +=> CREATE FUNCTION mypymax (a integer, b integer)
    +     RETURNS integer
    +   AS $$
    +     if (a is None) or (b is None):
    +       return None
    +     if a > b:
    +       return a
    +     return b
    +   $$ LANGUAGE plpythonu;
     ```
     
    -Note that `plpythonu` is installed as an “untrusted” language, meaning it does not
offer any way of restricting what users can do in it.
    +To execute the `mypymax` function:
     
    -To remove support for `plpythonu` from a database, run the following SQL command:
    +``` sql
    +=> SELECT mypymax(5, 7);
    + mypymax 
    +---------
    +       7
    +(1 row)
    +```
    +
    +Adding the `STRICT` keyword to the `LANGUAGE` subclause instructs HAWQ to return null
when any of the input arguments are null. When created as `STRICT`, the function itself need
not perform null checks.
     
    -```shell
    -psql# DROP LANGUAGE plpythonu;
    +The following example uses an unnamed argument, the built-in Python `max()` function,
and the `STRICT` keyword to create a UDF named `mypymax2`:
    +
    +``` sql
    +=> CREATE FUNCTION mypymax2 (a integer, integer)
    +     RETURNS integer AS $$ 
    +   return max(a, args[1]) 
    +   $$ LANGUAGE plpythonu STRICT;
    +=> SELECT mypymax2(5, 3);
    + mypymax2
    +----------
    +        5
    +(1 row)
    +=> SELECT mypymax2(5, null);
    + mypymax2
    +----------
    +       
    +(1 row)
     ```
     
    -## <a id="developfunctions"></a>Developing Functions with PL/Python 
    +## <a id="example_createtbl"></a>Preparing For Exercises
     
    -The body of a PL/Python user-defined function is a Python script. When the function is
called, its arguments are passed as elements of the array `args[]`. Named arguments are also
passed as ordinary variables to the Python script. The result is returned from the PL/Python
function with return statement, or yield statement in case of a result-set statement.
    +Perform the following steps to create, and insert data into, a simple table. This table
will be used in later exercises.
     
    -The HAWQ PL/Python language module imports the Python module `plpy`. The module `plpy`
implements these functions:
    +1. Create a database named `testdb`:
     
    -- Functions to execute SQL queries and prepare execution plans for queries.
    -   - `plpy.execute`
    -   - `plpy.prepare`
    -   
    -- Functions to manage errors and messages.
    -   - `plpy.debug`
    -   - `plpy.log`
    -   - `plpy.info`
    -   - `plpy.notice`
    -   - `plpy.warning`
    -   - `plpy.error`
    -   - `plpy.fatal`
    -   - `plpy.debug`
    +    ``` shell
    +    gpadmin@hawq-node$ createdb testdb
    +    ```
    +
    +2. Create a table named `sales`:
    +
    +    ``` shell
    +    gpadmin@hawq-node$ psql -d testdb
    +    ```
    +    ``` sql
    +    testdb=> CREATE TABLE sales (id int, year int, qtr int, day int, region text)
    +               DISTRIBUTED BY (id);
    +    ```
    +
    +3. Insert data into the table:
    +
    +    ``` sql
    +    testdb=> INSERT INTO sales VALUES
    +     (1, 2014, 1,1, 'usa'),
    +     (2, 2002, 2,2, 'europe'),
    +     (3, 2014, 3,3, 'asia'),
    +     (4, 2014, 4,4, 'usa'),
    +     (5, 2014, 1,5, 'europe'),
    +     (6, 2014, 2,6, 'asia'),
    +     (7, 2002, 3,7, 'usa') ;
    +    ```
    +
    +## <a id="pymod_intro"></a>Python Modules 
    +A Python module is a text file containing Python statements and definitions. Python modules
are named, with the file name for a module following the `<python-module-name>.py` naming
convention.
    +
    +Should you need to build a Python module, ensure that the appropriate software is installed
on the build system. Also be sure that you are building for the correct deployment architecture,
i.e. 64-bit.
    +
    +### <a id="pymod_intro_hawq"></a>HAWQ Considerations 
    +
    +When installing a Python module in HAWQ, you must add the module to all segment nodes
in the cluster. You must also add all Python modules to any new segment hosts when you expand
your HAWQ cluster.
    +
    +PL/Python supports the built-in HAWQ Python module named `plpy`.  You can also install
3rd party Python modules.
    +
    +
    +## <a id="modules_plpy"></a>plpy Module 
    +
    +The HAWQ PL/Python procedural language extension automatically imports the Python module
`plpy`. `plpy` implements functions to execute SQL queries and prepare execution plans for
queries.  The `plpy` module also includes functions to manage errors and messages.
        
    -## <a id="executepreparesql"></a>Executing and Preparing SQL Queries 
    +### <a id="executepreparesql"></a>Executing and Preparing SQL Queries 
     
    -The PL/Python `plpy` module provides two Python functions to execute an SQL query and
prepare an execution plan for a query, `plpy.execute` and `plpy.prepare`. Preparing the execution
plan for a query is useful if you run the query from multiple Python functions.
    +Use the PL/Python `plpy` module `plpy.execute()` function to execute an SQL query. Use
the `plpy.prepare()` function to prepare an execution plan for a query. Preparing the execution
plan for a query is useful if you plan to run the query from multiple Python functions.
     
    --- End diff --
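
For readers who want to try the null-handling logic from the `mypymax` and `mypymax2` examples outside the database, here is a minimal plain-Python sketch. It assumes that SQL null maps to Python `None`, as the `mypymax` body implies. The `strict` decorator is a hypothetical helper that only imitates what the `STRICT` keyword does on the database side; it is not part of PL/Python or HAWQ.

```python
def mypymax(a, b):
    """Same logic as the mypymax body: check for null (None) explicitly."""
    if a is None or b is None:
        return None
    if a > b:
        return a
    return b


def strict(func):
    """Hypothetical helper imitating STRICT: skip the call if any argument is None."""
    def wrapper(*args):
        if any(arg is None for arg in args):
            return None
        return func(*args)
    return wrapper


@strict
def mypymax2(*args):
    # args[0] is the first positional argument and args[1] the second.
    return max(args[0], args[1])


print(mypymax(5, 7))
print(mypymax2(5, 3))
print(mypymax2(5, None))
```

Because the decorator returns early, `mypymax2` never passes `None` to `max()`, which is why a `STRICT` function can omit its own null checks.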
    
    Change "an SQL" -> "a SQL". I'm pretty sure the HAWQ docs are otherwise consistent in this usage.
    
    Also, change "if you plan to" to "if you want to".
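
For context on the paragraph under review, a rough sketch of the `plpy.prepare()` / `plpy.execute()` pattern follows. Inside a real PL/Python function `plpy` is imported automatically; here it is passed in as a parameter so the snippet can run outside the database. `count_sales_in_region` and `FakePlpy` are hypothetical names for illustration only, and the stand-in ignores the query text entirely. The region values are taken from the `sales` table in the diff.

```python
def count_sales_in_region(plpy, region):
    """Count rows of the sales table for one region using a prepared plan."""
    # Prepare once with a typed parameter, then execute with the actual value.
    plan = plpy.prepare(
        "SELECT count(*) AS n FROM sales WHERE region = $1", ["text"])
    rows = plpy.execute(plan, [region])
    return rows[0]["n"]


class FakePlpy:
    """Hypothetical in-memory stand-in for plpy; not part of HAWQ."""

    def __init__(self, regions):
        self.regions = regions

    def prepare(self, query, argtypes):
        # A real plan is an opaque object; the query text is enough here.
        return query

    def execute(self, plan, args):
        # Mimic a one-row result that is indexable by row number and column name.
        return [{"n": self.regions.count(args[0])}]


fake = FakePlpy(["usa", "europe", "asia", "usa", "europe", "asia", "usa"])
print(count_sales_in_region(fake, "usa"))
```

In an actual UDF the function body would call `plpy.prepare()` and `plpy.execute()` directly, without the extra parameter.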


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
