hawq-commits mailing list archives

From yo...@apache.org
Subject [06/57] [abbrv] [partial] incubator-hawq-docs git commit: HAWQ-1254 Fix/remove book branching on incubator-hawq-docs
Date Tue, 10 Jan 2017 23:53:57 GMT
http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/overview/TableDistributionStorage.html.md.erb
----------------------------------------------------------------------
diff --git a/overview/TableDistributionStorage.html.md.erb b/overview/TableDistributionStorage.html.md.erb
deleted file mode 100755
index ec1d8b5..0000000
--- a/overview/TableDistributionStorage.html.md.erb
+++ /dev/null
@@ -1,41 +0,0 @@
----
-title: Table Distribution and Storage
----
-
-HAWQ stores all table data, except system tables, in HDFS. When a user creates a table, the metadata is stored on the master's local file system and the table content is stored in HDFS.
-
-In order to simplify table data management, all the data of one relation are saved under one HDFS folder.
-
-For all HAWQ table storage formats, AO \(Append-Only\) and Parquet, the data files are splittable, so that HAWQ can assign multiple virtual segments to consume one data file concurrently. This increases the degree of query parallelism.
-
-## Table Distribution Policy
-
-The default table distribution policy in HAWQ is random.
-
-Randomly distributed tables have some benefits over hash distributed tables. For example, after cluster expansion, HAWQ can use the additional resources automatically without redistributing the data, which is very expensive for huge tables. Data locality for randomly distributed tables is also better after the underlying HDFS redistributes its data during a rebalance or after DataNode failures, both of which are quite common in large clusters.
-
-On the other hand, for some queries, hash distributed tables are faster than randomly distributed tables. For example, hash distributed tables have some performance benefits for some TPC-H queries. You should choose the distribution policy that is best suited for your application's scenario.
-
-See [Choosing the Table Distribution Policy](../ddl/ddl-table.html) for more details.
-
-## Data Locality
-
-Data is distributed across HDFS DataNodes. Since remote read involves network I/O, a data locality algorithm improves the local read ratio. HAWQ considers three aspects when allocating data blocks to virtual segments:
-
--   Ratio of local read
--   Continuity of file read
--   Data balance among virtual segments
-
-## External Data Access
-
-HAWQ can access data in external files using the HAWQ Extension Framework (PXF).
-PXF is an extensible framework that allows HAWQ to access data in external
-sources as readable or writable HAWQ tables. PXF has built-in connectors for
-accessing data inside HDFS files, Hive tables, and HBase tables. PXF also
-integrates with HCatalog to query Hive tables directly. See [Using PXF
-with Unmanaged Data](../pxf/HawqExtensionFrameworkPXF.html) for more
-details.
-
-Users can create custom PXF connectors to access other parallel data stores or
-processing engines. Connectors are Java plug-ins that use the PXF API. For more
-information see [PXF External Tables and API](../pxf/PXFExternalTableandAPIReference.html).

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/overview/system-overview.html.md.erb
----------------------------------------------------------------------
diff --git a/overview/system-overview.html.md.erb b/overview/system-overview.html.md.erb
deleted file mode 100644
index 9fc1c53..0000000
--- a/overview/system-overview.html.md.erb
+++ /dev/null
@@ -1,11 +0,0 @@
----
-title: Apache HAWQ (Incubating) System Overview
----
-* <a href="./HAWQOverview.html" class="subnav">What is HAWQ?</a>
-* <a href="./HAWQArchitecture.html" class="subnav">HAWQ Architecture</a>
-* <a href="./TableDistributionStorage.html" class="subnav">Table Distribution and Storage</a>
-* <a href="./ElasticSegments.html" class="subnav">Elastic Virtual Segment Allocation</a>
-* <a href="./ResourceManagement.html" class="subnav">Resource Management</a>
-* <a href="./HDFSCatalogCache.html" class="subnav">HDFS Catalog Cache</a>
-* <a href="./ManagementTools.html" class="subnav">Management Tools</a>
-* <a href="./RedundancyFailover.html" class="subnav">Redundancy and Fault Tolerance</a>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/UsingProceduralLanguages.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/UsingProceduralLanguages.html.md.erb b/plext/UsingProceduralLanguages.html.md.erb
deleted file mode 100644
index bef1b93..0000000
--- a/plext/UsingProceduralLanguages.html.md.erb
+++ /dev/null
@@ -1,23 +0,0 @@
----
-title: Using Languages and Extensions in HAWQ
----
-
-HAWQ supports user-defined functions that are created with the SQL and C built-in languages, and also supports user-defined aliases for internal functions.
-
-HAWQ also supports user-defined functions written in languages other than SQL and C. These other languages are generically called *procedural languages* (PLs) and are extensions to the core HAWQ functionality. HAWQ specifically supports the PL/Java, PL/Perl, PL/pgSQL, PL/Python, and PL/R procedural languages. 
-
-HAWQ additionally provides the `pgcrypto` extension for password hashing and data encryption.
-
-This chapter describes these languages and extensions:
-
--   <a href="builtin_langs.html">Using HAWQ Built-In Languages</a>
--   <a href="using_pljava.html">Using PL/Java</a>
--   <a href="using_plperl.html">Using PL/Perl</a>
--   <a href="using_plpgsql.html">Using PL/pgSQL</a>
--   <a href="using_plpython.html">Using PL/Python</a>
--   <a href="using_plr.html">Using PL/R</a>
--   <a href="using_pgcrypto.html">Using pgcrypto</a>
-
-
-
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/builtin_langs.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/builtin_langs.html.md.erb b/plext/builtin_langs.html.md.erb
deleted file mode 100644
index 01891e8..0000000
--- a/plext/builtin_langs.html.md.erb
+++ /dev/null
@@ -1,110 +0,0 @@
----
-title: Using HAWQ Built-In Languages
----
-
-This section provides an introduction to using the HAWQ built-in languages.
-
-HAWQ supports user-defined functions created with the SQL and C built-in languages. HAWQ also supports user-defined aliases for internal functions.
-
-
-## <a id="enablebuiltin"></a>Enabling Built-in Language Support
-
-Support for SQL and C language user-defined functions and aliasing of internal functions is enabled by default for all HAWQ databases.
-
-## <a id="builtinsql"></a>Defining SQL Functions
-
-SQL functions execute an arbitrary list of SQL statements. The SQL statements in the body of a SQL function must be separated by semicolons. The final statement in a non-void-returning SQL function must be a [SELECT](../reference/sql/SELECT.html) that returns data of the type specified by the function's return type. The function returns a single row or a set of rows corresponding to this last SQL query.
-
-The following example creates and calls a SQL function to count the number of rows of the table named `orders`:
-
-``` sql
-gpadmin=# CREATE FUNCTION count_orders() RETURNS bigint AS $$
- SELECT count(*) FROM orders;
-$$ LANGUAGE SQL;
-CREATE FUNCTION
-gpadmin=# SELECT count_orders();
- count_orders 
---------------
-       830513
-(1 row)
-```
-
-For additional information about creating SQL functions, refer to [Query Language (SQL) Functions](https://www.postgresql.org/docs/8.2/static/xfunc-sql.html) in the PostgreSQL documentation.
-
-## <a id="builtininternal"></a>Aliasing Internal Functions
-
-Many HAWQ internal functions are written in C. These functions are declared during initialization of the database cluster and statically linked to the HAWQ server. See [Built-in Functions and Operators](../query/functions-operators.html#topic29) for detailed information about HAWQ internal functions.
-
-You cannot define new internal functions, but you can create aliases for existing internal functions.
-
-The following example creates a new function named `all_caps` that is an alias for the `upper` HAWQ internal function:
-
-
-``` sql
-gpadmin=# CREATE FUNCTION all_caps (text) RETURNS text AS 'upper'
-            LANGUAGE internal STRICT;
-CREATE FUNCTION
-gpadmin=# SELECT all_caps('change me');
- all_caps  
------------
- CHANGE ME
-(1 row)
-
-```
-
-For more information about aliasing internal functions, refer to [Internal Functions](https://www.postgresql.org/docs/8.2/static/xfunc-internal.html) in the PostgreSQL documentation.
-
-## <a id="builtinc_lang"></a>Defining C Functions
-
-You must compile user-defined functions written in C into shared libraries so that the HAWQ server can load them on demand. This dynamic loading distinguishes C language functions from internal functions that are written in C.
-
-The [CREATE FUNCTION](../reference/sql/CREATE-FUNCTION.html) call for a user-defined C function must include both the name of the shared library and the name of the function.
-
-If you do not provide an absolute path to the shared library, HAWQ attempts to locate the library relative to the following, in order:
-
-1. HAWQ PostgreSQL library directory (obtained via the `pg_config --pkglibdir` command)
-2. `dynamic_library_path` configuration value
-3. current working directory
-
-Example:
-
-``` c
-#include "postgres.h"
-#include "fmgr.h"
-
-#ifdef PG_MODULE_MAGIC
-PG_MODULE_MAGIC;
-#endif
-
-PG_FUNCTION_INFO_V1(double_it);
-         
-Datum
-double_it(PG_FUNCTION_ARGS)
-{
-    int32   arg = PG_GETARG_INT32(0);
-
-    PG_RETURN_INT32(arg + arg);
-}
-```
-
-If the above function is compiled into a shared object named `libdoubleit.so` located in `/share/libs`, you would register and invoke the function with HAWQ as follows:
-
-``` sql
-gpadmin=# CREATE FUNCTION double_it_c(integer) RETURNS integer
-            AS '/share/libs/libdoubleit', 'double_it'
-            LANGUAGE C STRICT;
-CREATE FUNCTION
-gpadmin=# SELECT double_it_c(27);
- double_it_c 
--------------
-          54
-(1 row)
-
-```
-
-The shared library `.so` extension may be omitted.
-
-For additional information about using the C language to create functions, refer to [C-Language Functions](https://www.postgresql.org/docs/8.2/static/xfunc-c.html) in the PostgreSQL documentation.
-

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/using_pgcrypto.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/using_pgcrypto.html.md.erb b/plext/using_pgcrypto.html.md.erb
deleted file mode 100644
index e3e9225..0000000
--- a/plext/using_pgcrypto.html.md.erb
+++ /dev/null
@@ -1,32 +0,0 @@
----
-title: Enabling Cryptographic Functions for PostgreSQL (pgcrypto)
----
-
-`pgcrypto` is a package extension included in your HAWQ distribution. You must explicitly enable the cryptographic functions to use this extension.
-
-## <a id="pgcryptoprereq"></a>Prerequisites 
-
-
-Before you enable the `pgcrypto` software package, make sure that your HAWQ database is running, you have sourced `greenplum_path.sh`, and that the `$GPHOME` environment variable is set.
-
-## <a id="enablepgcrypto"></a>Enable pgcrypto 
-
-On every database in which you want to enable `pgcrypto`, run the following command:
-
-``` shell
-$ psql -d <dbname> -f $GPHOME/share/postgresql/contrib/pgcrypto.sql
-```
-	
-Replace \<dbname\> with the name of the target database.
-	
-## <a id="uninstallpgcrypto"></a>Disable pgcrypto 
-
-The `uninstall_pgcrypto.sql` script removes `pgcrypto` objects from your database.  On each database in which you enabled `pgcrypto` support, execute the following:
-
-``` shell
-$ psql -d <dbname> -f $GPHOME/share/postgresql/contrib/uninstall_pgcrypto.sql
-```
-
-Replace \<dbname\> with the name of the target database.
-	
-**Note:**  This script does not remove dependent user-created objects.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/using_pljava.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/using_pljava.html.md.erb b/plext/using_pljava.html.md.erb
deleted file mode 100644
index 99b5767..0000000
--- a/plext/using_pljava.html.md.erb
+++ /dev/null
@@ -1,709 +0,0 @@
----
-title: Using PL/Java
----
-
-This section contains an overview of the HAWQ PL/Java language. 
-
-
-## <a id="aboutpljava"></a>About PL/Java 
-
-With the HAWQ PL/Java extension, you can write Java methods using your favorite Java IDE and install the JAR files that implement the methods in your HAWQ cluster.
-
-**Note**: If building HAWQ from source, you must specify PL/Java as a build option when compiling HAWQ. To use PL/Java in a HAWQ deployment, you must explicitly enable the PL/Java extension in all desired databases.  
-
-The HAWQ PL/Java package is based on the open source PL/Java 1.4.0. HAWQ PL/Java provides the following features.
-
-- Ability to execute PL/Java functions with Java 1.6 or 1.7.
-- Standardized utilities (modeled after the SQL 2003 proposal) to install and maintain Java code in the database.
-- Standardized mappings of parameters and result. Complex types as well as sets are supported.
-- An embedded, high performance, JDBC driver utilizing the internal HAWQ Database SPI routines.
-- Metadata support for the JDBC driver. Both `DatabaseMetaData` and `ResultSetMetaData` are included.
-- The ability to return a `ResultSet` from a query as an alternative to building a ResultSet row by row.
-- Full support for savepoints and exception handling.
-- The ability to use IN, INOUT, and OUT parameters.
-- Two separate HAWQ languages:
-	- pljava, TRUSTED PL/Java language
-	- pljavau, UNTRUSTED PL/Java language
-- Transaction and Savepoint listeners enabling code execution when a transaction or savepoint is committed or rolled back.
-- Integration with GNU GCJ on selected platforms.
-
-A function in SQL will appoint a static method in a Java class. In order for the function to execute, the appointed class must be available on the class path specified by the HAWQ server configuration parameter `pljava_classpath`. The PL/Java extension adds a set of functions that help to install and maintain the Java classes. Classes are stored in normal Java archives (JAR files). A JAR file can optionally contain a deployment descriptor that in turn contains SQL commands to be executed when the JAR is deployed or undeployed. The functions are modeled after the standards proposed for SQL 2003.
-
-PL/Java implements a standard way of passing parameters and return values. Complex types and sets are passed using the standard JDBC ResultSet class.
-
-A JDBC driver is included in PL/Java. This driver calls HAWQ internal SPI routines. The driver is essential since it is common for functions to make calls back to the database to fetch data. When PL/Java functions fetch data, they must use the same transactional boundaries that are used by the main function that entered PL/Java execution context.
-
-PL/Java is optimized for performance. The Java virtual machine executes within the same process as the backend to minimize call overhead. PL/Java is designed with the objective to enable the power of Java to the database itself so that database intensive business logic can execute as close to the actual data as possible.
-
-The standard Java Native Interface (JNI) is used when bridging calls between the backend and the Java VM.
-
-
-## <a id="abouthawqpljava"></a>About HAWQ PL/Java 
-
-There are a few key differences between the implementation of PL/Java in standard PostgreSQL and HAWQ.
-
-### <a id="pljavafunctions"></a>Functions 
-
-The following functions are not supported in HAWQ. The classpath is handled differently in a distributed HAWQ environment than in the PostgreSQL environment.
-
-- sqlj.install_jar
-- sqlj.replace_jar
-- sqlj.remove_jar
-- sqlj.get_classpath
-- sqlj.set_classpath
-
-HAWQ uses the `pljava_classpath` server configuration parameter in place of the `sqlj.set_classpath` function.
-
-### <a id="serverconfigparams"></a>Server Configuration Parameters 
-
-PL/Java uses server configuration parameters to configure classpath, Java VM, and other options. Refer to the [Server Configuration Parameter Reference](../reference/HAWQSiteConfig.html) for general information about HAWQ server configuration parameters.
-
-The following server configuration parameters are used by PL/Java in HAWQ. These parameters replace the `pljava.*` parameters that are used in the standard PostgreSQL PL/Java implementation.
-
-#### pljava\_classpath
-
-A colon (:) separated list of the jar files containing the Java classes used in any PL/Java functions. The jar files must be installed in the same locations on all HAWQ hosts. With the trusted PL/Java language handler, jar file paths must be relative to the `$GPHOME/lib/postgresql/java/` directory. With the untrusted language handler (javaU language tag), paths may be relative to `$GPHOME/lib/postgresql/java/` or absolute.
-
-#### pljava\_statement\_cache\_size
-
-Sets the size in KB of the Most Recently Used (MRU) cache for prepared statements.
-
-#### pljava\_release\_lingering\_savepoints
-
-If TRUE, lingering savepoints will be released on function exit. If FALSE, they will be rolled back.
-
-#### pljava\_vmoptions
-
-Defines the start up options for the Java VM.
-
-### <a id="setting_serverconfigparams"></a>Setting PL/Java Configuration Parameters 
-
-You can set PL/Java server configuration parameters at the session level, or globally across your whole cluster. Your HAWQ cluster configuration must be reloaded after setting a server configuration value globally.
-
-#### <a id="setsrvrcfg_global"></a>Cluster Level
-
-You will perform different procedures to set a PL/Java server configuration parameter for your whole HAWQ cluster depending upon whether you manage your cluster from the command line or use Ambari. If you use Ambari to manage your HAWQ cluster, you must ensure that you update server configuration parameters only via the Ambari Web UI. If you manage your HAWQ cluster from the command line, you will use the `hawq config` command line utility to set PL/Java server configuration parameters.
-
-The following examples add a JAR file named `myclasses.jar` to the `pljava_classpath` server configuration parameter for the entire HAWQ cluster.
-
-If you use Ambari to manage your HAWQ cluster:
-
-1. Set the `pljava_classpath` configuration property to include `myclasses.jar` via the HAWQ service **Configs > Advanced > Custom hawq-site** drop down. 
-2. Select **Service Actions > Restart All** to load the updated configuration.
-
-If you manage your HAWQ cluster from the command line:
-
-1.  Log in to the HAWQ master host as a HAWQ administrator and source the file `/usr/local/hawq/greenplum_path.sh`.
-
-    ``` shell
-    $ source /usr/local/hawq/greenplum_path.sh
-    ```
-
-2. Use the `hawq config` utility to set `pljava_classpath`:
-
-    ``` shell
-    $ hawq config -c pljava_classpath -v \'myclasses.jar\'
-    ```
-3. Reload the HAWQ configuration:
-
-    ``` shell
-    $ hawq stop cluster -u
-    ```
-
-#### <a id="setsrvrcfg_session"></a>Session Level 
-
-To set a PL/Java server configuration parameter for only the *current* database session, set the parameter within the `psql` subsystem. For example, to set `pljava_classpath`:
-	
-``` sql
-=> SET pljava_classpath='myclasses.jar';
-```
-
-
-## <a id="enablepljava"></a>Enabling and Removing PL/Java Support 
-
-The PL/Java extension must be explicitly enabled on each database in which it will be used.
-
-
-### <a id="pljavaprereq"></a>Prerequisites 
-
-Before you enable PL/Java:
-
-1. Ensure that you have installed a supported Java runtime environment and that the `$JAVA_HOME` variable is set to the same path on the master and all segment nodes.
-
-2. Perform the following step on all machines to set up `ldconfig` for the installed JDK:
-
-	``` shell
-	$ echo "$JAVA_HOME/jre/lib/amd64/server" > /etc/ld.so.conf.d/libjdk.conf
-	$ ldconfig
-	```
-3. Make sure that your HAWQ cluster is running, that you have sourced `greenplum_path.sh`, and that your `$GPHOME` environment variable is set.
-
-
-### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files 
-
-To use PL/Java:
-
-1. Enable the language for each database.
-1. Install user-created JAR files on all HAWQ hosts.
-1. Add the names of the JAR files to the HAWQ `pljava_classpath` server configuration parameter. This parameter value should identify a list of the installed JAR files.
-
-#### <a id="enablepljava"></a>Enable PL/Java and Install JAR Files 
-
-Perform the following steps as the `gpadmin` user:
-
-1. Enable PL/Java by running the `$GPHOME/share/postgresql/pljava/install.sql` SQL script in the databases that will use PL/Java. The `install.sql` script registers both the trusted and untrusted PL/Java languages. For example, the following command enables PL/Java on a database named `testdb`:
-
-	``` shell
-	$ psql -d testdb -f $GPHOME/share/postgresql/pljava/install.sql
-	```
-	
-	To enable the PL/Java extension in all new HAWQ databases, run the script on the `template1` database: 
-
-    ``` shell
-    $ psql -d template1 -f $GPHOME/share/postgresql/pljava/install.sql
-    ```
-
-    Use this option *only* if you are certain you want to enable PL/Java in all new databases.
-	
-2. Copy your Java archives (JAR files) to `$GPHOME/lib/postgresql/java/` on all HAWQ hosts. This example uses the `hawq scp` utility to copy the `myclasses.jar` file located in the current directory:
-
-	``` shell
-	$ hawq scp -f hawq_hosts myclasses.jar =:$GPHOME/lib/postgresql/java/
-	```
-	The `hawq_hosts` file contains a list of the HAWQ hosts.
-
-3. Add the JAR files to the `pljava_classpath` configuration parameter. Refer to [Setting PL/Java Configuration Parameters](#setting_serverconfigparams) for the specific procedure.
-
-4. (Optional) Your HAWQ installation includes an `examples.sql` file. This script contains sample PL/Java functions that you can use for testing. Run the commands in this file to create and run test functions that use the Java classes in `examples.jar`:
-
-	``` shell
-	$ psql -f $GPHOME/share/postgresql/pljava/examples.sql
-	```
-
-#### Configuring PL/Java VM Options
-
-PL/Java JVM options can be configured via the `pljava_vmoptions` server configuration parameter. For example, `pljava_vmoptions=-Xmx512M` sets the maximum heap size of the JVM. The default `-Xmx` value is `64M`.
-
-Refer to [Setting PL/Java Configuration Parameters](#setting_serverconfigparams) for the specific procedure to set PL/Java server configuration parameters.
-
-	
-### <a id="uninstallpljava"></a>Disable PL/Java 
-
-To disable PL/Java, you should:
-
-1. Remove PL/Java support from each database in which it was added.
-2. Uninstall the Java JAR files.
-
-#### <a id="uninstallpljavasupport"></a>Remove PL/Java Support from Databases 
-
-For a database that no longer requires the PL/Java language, remove support for PL/Java by running the `uninstall.sql` script as the `gpadmin` user. For example, the following command disables the PL/Java language in the specified database:
-
-``` shell
-$ psql -d <dbname> -f $GPHOME/share/postgresql/pljava/uninstall.sql
-```
-
-Replace \<dbname\> with the name of the target database.
-
-
-#### <a id="uninstallpljavapackage"></a>Uninstall the Java JAR files 
-
-When no databases have PL/Java as a registered language, remove the Java JAR files.
-
-If you use Ambari to manage your cluster:
-
-1. Remove the `pljava_classpath` configuration property via the HAWQ service **Configs > Advanced > Custom hawq-site** drop down.
-
-2. Remove the JAR files from the `$GPHOME/lib/postgresql/java/` directory of each HAWQ host.
-
-3. Select **Service Actions > Restart All** to restart your HAWQ cluster.
-
-
-If you manage your cluster from the command line:
-
-1.  Log in to the HAWQ master host as a HAWQ administrator and source the file `/usr/local/hawq/greenplum_path.sh`.
-
-    ``` shell
-    $ source /usr/local/hawq/greenplum_path.sh
-    ```
-
-2. Use the `hawq config` utility to remove `pljava_classpath`:
-
-    ``` shell
-    $ hawq config -r pljava_classpath
-    ```
-    
-3. Remove the JAR files from the `$GPHOME/lib/postgresql/java/` directory of each HAWQ host.
-
-4. Restart your HAWQ cluster:
-
-    ``` shell
-    $ hawq restart cluster
-    ```
-
-
-## <a id="writingpljavafunc"></a>Writing PL/Java Functions 
-
-This section provides information about writing functions with PL/Java.
-
-- [SQL Declaration](#sqldeclaration)
-- [Type Mapping](#typemapping)
-- [NULL Handling](#nullhandling)
-- [Complex Types](#complextypes)
-- [Returning Complex Types](#returningcomplextypes)
-- [Functions That Return Sets](#functionreturnsets)
-- [Returning a SETOF \<scalar type\>](#returnsetofscalar)
-- [Returning a SETOF \<complex type\>](#returnsetofcomplex)
-
-
-### <a id="sqldeclaration"></a>SQL Declaration 
-
-A Java function is declared with the name of a class and a static method on that class. The class will be resolved using the classpath that has been defined for the schema where the function is declared. If no classpath has been defined for that schema, the public schema is used. If no classpath is found there either, the class is resolved using the system classloader.
-
-The following function can be declared to access the static `getProperty` method of the `java.lang.System` class:
-
-```sql
-=> CREATE FUNCTION getsysprop(VARCHAR)
-     RETURNS VARCHAR
-     AS 'java.lang.System.getProperty'
-   LANGUAGE java;
-```
-
-Run the following command to return the Java `user.home` property:
-
-```sql
-=> SELECT getsysprop('user.home');
-```
-
-### <a id="typemapping"></a>Type Mapping 
-
-Scalar types are mapped in a straightforward way. This table lists the current mappings.
-
-***Table 1: PL/Java data type mappings***
-
-| PostgreSQL | Java |
-|------------|------|
-| bool | boolean |
-| char | byte |
-| int2 | short |
-| int4 | int |
-| int8 | long |
-| varchar | java.lang.String |
-| text | java.lang.String |
-| bytea | byte[ ] |
-| date | java.sql.Date |
-| time | java.sql.Time (stored value treated as local time) |
-| timetz | java.sql.Time |
-| timestamp | java.sql.Timestamp (stored value treated as local time) |
-| timestamptz | java.sql.Timestamp |
-| complex | java.sql.ResultSet |
-| setof complex | java.sql.ResultSet |
-
-All other types are mapped to `java.lang.String` and use the standard textin/textout routines registered for the respective type.
-
-### <a id="nullhandling"></a>NULL Handling 
-
-The scalar types that map to Java primitives cannot be passed as NULL values. To pass NULL values, those types can have an alternative mapping. You enable this mapping by explicitly denoting it in the method reference.
-
-```sql
-=> CREATE FUNCTION trueIfEvenOrNull(integer)
-     RETURNS bool
-     AS 'foo.fee.Fum.trueIfEvenOrNull(java.lang.Integer)'
-   LANGUAGE java;
-```
-
-The Java code would be similar to this:
-
-```java
-package foo.fee;
-public class Fum
-{
-  public static boolean trueIfEvenOrNull(Integer value)
-  {
-    return (value == null)
-      ? true
-      : (value.intValue() % 2) == 0;
-  }
-}
-```
-
-The following two statements both yield true:
-
-```sql
-=> SELECT trueIfEvenOrNull(NULL);
-=> SELECT trueIfEvenOrNull(4);
-```
-
-In order to return NULL values from a Java method, you use the object type that corresponds to the primitive (for example, you return `java.lang.Integer` instead of `int`). The PL/Java resolve mechanism finds the method regardless. Since Java cannot have different return types for methods with the same name, this does not introduce any ambiguity.
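
As a minimal sketch of the return-side mapping, the hypothetical method below declares `Integer` instead of `int` so that it can return `null`, which a PL/Java function declared with `RETURNS integer` would be expected to surface as SQL NULL. The class name `NullDemo` and the method `halfIfEven` are illustrative assumptions, not part of PL/Java; the class is plain Java so that it can be compiled and run outside the database.

```java
class NullDemo
{
  // Hypothetical example method: returns half of an even value,
  // or null for odd values and for null input.
  public static Integer halfIfEven(Integer value)
  {
    if (value == null || value.intValue() % 2 != 0)
      return null;
    return Integer.valueOf(value.intValue() / 2);
  }

  public static void main(String[] args)
  {
    System.out.println(halfIfEven(8));    // even input: a non-null result
    System.out.println(halfIfEven(7));    // odd input: null
    System.out.println(halfIfEven(null)); // null input: null
  }
}
```

Declaring the parameter as `Integer` as well lets the same method accept NULL input, mirroring the `trueIfEvenOrNull` declaration above.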
-
-### <a id="complextypes"></a>Complex Types 
-
-A complex type will always be passed as a read-only `java.sql.ResultSet` with exactly one row. The `ResultSet` is positioned on its row so a call to `next()` should not be made. The values of the complex type are retrieved using the standard getter methods of the `ResultSet`.
-
-Example:
-
-```sql
-=> CREATE TYPE complexTest
-     AS(base integer, incbase integer, ctime timestamptz);
-=> CREATE FUNCTION useComplexTest(complexTest)
-     RETURNS VARCHAR
-     AS 'foo.fee.Fum.useComplexTest'
-   IMMUTABLE LANGUAGE java;
-```
-
-In the Java class `Fum`, we add the following static method:
-
-```java
-public static String useComplexTest(ResultSet complexTest)
-throws SQLException
-{
-  int base = complexTest.getInt(1);
-  int incbase = complexTest.getInt(2);
-  Timestamp ctime = complexTest.getTimestamp(3);
-  return "Base = \"" + base +
-    "\", incbase = \"" + incbase +
-    "\", ctime = \"" + ctime + "\"";
-}
-```
-
-### <a id="returningcomplextypes"></a>Returning Complex Types 
-
-Java does not stipulate any way to create a `ResultSet`. Hence, returning a ResultSet is not an option. The SQL-2003 draft suggests that a complex return value should be handled as an IN/OUT parameter. PL/Java implements a `ResultSet` that way. If you declare a function that returns a complex type, you will need to use a Java method with boolean return type with a last parameter of type `java.sql.ResultSet`. The parameter will be initialized to an empty updateable ResultSet that contains exactly one row.
-
-Assume that the `complexTest` type in the previous section has been created.
-
-```sql
-=> CREATE FUNCTION createComplexTest(int, int)
-     RETURNS complexTest
-     AS 'foo.fee.Fum.createComplexTest'
-   IMMUTABLE LANGUAGE java;
-```
-
-PL/Java method resolution will now find the following method in the `Fum` class:
-
-```java
-public static boolean createComplexTest(int base, int increment, 
-  ResultSet receiver)
-throws SQLException
-{
-  receiver.updateInt(1, base);
-  receiver.updateInt(2, base + increment);
-  receiver.updateTimestamp(3, new 
-    Timestamp(System.currentTimeMillis()));
-  return true;
-}
-```
-
-The return value denotes if the receiver should be considered as a valid tuple (true) or NULL (false).
-
-### <a id="functionreturnsets"></a>Functions that Return Sets 
-
-When returning a result set, you should not build the entire set before returning it, because building a large result set consumes a large amount of resources. It is better to produce one row at a time. Incidentally, that is what the HAWQ backend expects a function with a SETOF return type to do. You can return a SETOF a scalar type such as int, float, or varchar, or you can return a SETOF a complex type.
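
The one-row-at-a-time approach can be sketched with a `java.util.Iterator` that computes each value only when it is requested, rather than filling a collection first. The class `Countdown` and its static `countdown` method are hypothetical names chosen for illustration; the code is plain Java and makes no PL/Java calls.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical lazy iterator: yields start, start-1, ..., 1 on demand.
class Countdown implements Iterator<Integer>
{
  private int m_next;

  Countdown(int start)
  {
    m_next = start;
  }

  public boolean hasNext()
  {
    return m_next > 0;
  }

  public Integer next()
  {
    if (!hasNext())
      throw new NoSuchElementException();
    // Only the current value is held in memory.
    return Integer.valueOf(m_next--);
  }

  // Implemented explicitly so the class also compiles on pre-Java 8 runtimes.
  public void remove()
  {
    throw new UnsupportedOperationException();
  }

  // Static entry point of the kind a SETOF function declaration could name.
  public static Iterator<Integer> countdown(int start)
  {
    return new Countdown(start);
  }
}
```

Because the iterator holds only a single counter, its memory use does not grow with the number of rows returned.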
-
-### <a id="returnsetofscalar"></a>Returning a SETOF \<scalar type\> 
-
-In order to return a set of a scalar type, you need to create a Java method that returns an object that implements the `java.util.Iterator` interface. Here is an example of a method that returns a SETOF varchar:
-
-```sql
-=> CREATE FUNCTION javatest.getSystemProperties()
-     RETURNS SETOF varchar
-     AS 'foo.fee.Bar.getNames'
-   IMMUTABLE LANGUAGE java;
-```
-
-This simple Java method returns an iterator:
-
-```java
-package foo.fee;
-import java.util.ArrayList;
-import java.util.Iterator;
-
-public class Bar
-{
-    public static Iterator<String> getNames()
-    {
-        ArrayList<String> names = new ArrayList<String>();
-        names.add("Lisa");
-        names.add("Bob");
-        names.add("Bill");
-        names.add("Sally");
-        return names.iterator();
-    }
-}
-```
-
-### <a id="returnsetofcomplex"></a>Returning a SETOF \<complex type\> 
-
-A method returning a SETOF \<complex type\> must use either the interface `org.postgresql.pljava.ResultSetProvider` or `org.postgresql.pljava.ResultSetHandle`. There are two interfaces because each is optimized for a distinct use case. The former is for cases where you want to dynamically create each row that is returned from the SETOF function. The latter is for cases where you want to return the result of an executed query.
-
-#### Using the ResultSetProvider Interface
-
-This interface has two methods: `boolean assignRowValues(java.sql.ResultSet tupleBuilder, int rowNumber)` and `void close()`. The HAWQ query evaluator calls `assignRowValues` repeatedly until it returns false or until the evaluator decides that it does not need any more rows. Then it calls `close`.
-
-You can use this interface the following way:
-
-```sql
-=> CREATE FUNCTION javatest.listComplexTests(int, int)
-     RETURNS SETOF complexTest
-     AS 'foo.fee.Fum.listComplexTests'
-   IMMUTABLE LANGUAGE java;
-```
-
-The function maps to a static Java method that returns an instance that implements the `ResultSetProvider` interface.
-
-```java
-package foo.fee;
-
-import java.sql.ResultSet;
-import java.sql.SQLException;
-import java.sql.Timestamp;
-
-import org.postgresql.pljava.ResultSetProvider;
-
-public class Fum implements ResultSetProvider
-{
-  private final int m_base;
-  private final int m_increment;
-  public Fum(int base, int increment)
-  {
-    m_base = base;
-    m_increment = increment;
-  }
-  public boolean assignRowValues(ResultSet receiver, int currentRow)
-  throws SQLException
-  {
-    // Stop when we reach 12 rows.
-    //
-    if(currentRow >= 12)
-      return false;
-    receiver.updateInt(1, m_base);
-    receiver.updateInt(2, m_base + m_increment * currentRow);
-    receiver.updateTimestamp(3, new Timestamp(System.currentTimeMillis()));
-    return true;
-  }
-  public void close()
-  {
-   // Nothing needed in this example
-  }
-  public static ResultSetProvider listComplexTests(int base, int increment)
-  throws SQLException
-  {
-    return new Fum(base, increment);
-  }
-}
-```
-
-The `listComplexTests` method is called once. It may return NULL if no results are available, or an instance of the `ResultSetProvider`. Here the Java class `Fum` implements this interface, so it returns an instance of itself. The method `assignRowValues` is then called repeatedly until it returns false. At that time, `close` is called.
-
-#### Using the ResultSetHandle Interface
-
-This interface is similar to the `ResultSetProvider` interface in that it has a `close()` method that is called at the end. But instead of having the evaluator call a method that builds one row at a time, this interface has a method that returns a `ResultSet`. The query evaluator iterates over this set and delivers the `ResultSet` contents, one tuple at a time, to the caller until a call to `next()` returns false or the evaluator decides that no more rows are needed.
-
-Here is an example that executes a query using a statement that it obtained using the default connection. The SQL suitable for the deployment descriptor looks like this:
-
-```sql
-=> CREATE FUNCTION javatest.listSupers()
-     RETURNS SETOF pg_user
-     AS 'org.postgresql.pljava.example.Users.listSupers'
-   LANGUAGE java;
-=> CREATE FUNCTION javatest.listNonSupers()
-     RETURNS SETOF pg_user
-     AS 'org.postgresql.pljava.example.Users.listNonSupers'
-   LANGUAGE java;
-```
-
-And in the Java package `org.postgresql.pljava.example` a class `Users` is added:
-
-```java
-package org.postgresql.pljava.example;
-
-import java.sql.DriverManager;
-import java.sql.ResultSet;
-import java.sql.SQLException;
-import java.sql.Statement;
-
-import org.postgresql.pljava.ResultSetHandle;
-
-public class Users implements ResultSetHandle
-{
-  private final String m_filter;
-  private Statement m_statement;
-  public Users(String filter)
-  {
-    m_filter = filter;
-  }
-  public ResultSet getResultSet()
-  throws SQLException
-  {
-    m_statement =
-      DriverManager.getConnection("jdbc:default:connection").createStatement();
-    return m_statement.executeQuery(
-      "SELECT * FROM pg_user WHERE " + m_filter);
-  }
-
-  public void close()
-  throws SQLException
-  {
-    m_statement.close();
-  }
-
-  public static ResultSetHandle listSupers()
-  {
-    return new Users("usesuper = true");
-  }
-
-  public static ResultSetHandle listNonSupers()
-  {
-    return new Users("usesuper = false");
-  }
-}
-```
-
-## <a id="usingjdbc"></a>Using JDBC 
-
-PL/Java contains a JDBC driver that maps to the PostgreSQL SPI functions. A connection that maps to the current transaction can be obtained using the following statement:
-
-```java
-Connection conn = 
-  DriverManager.getConnection("jdbc:default:connection"); 
-```
-
-After obtaining a connection, you can prepare and execute statements as you would with other JDBC connections. The PL/Java JDBC driver has the following limitations:
-
-- The transaction cannot be managed in any way. Thus, you cannot use methods on the connection such as:
-   - `commit()`
-   - `rollback()`
-   - `setAutoCommit()`
-   - `setTransactionIsolation()`
-- Savepoints are available with some restrictions. A savepoint cannot outlive the function in which it was set and it must be rolled back or released by that same function.
-- A `ResultSet` returned from `executeQuery()` is always `FETCH_FORWARD` and `CONCUR_READ_ONLY`.
-- Meta-data is only available in PL/Java 1.1 or higher.
-- `CallableStatement` (for stored procedures) is not implemented.
-- The types `Clob` and `Blob` are not completely implemented. The types `byte[]` and `String` can be used for `bytea` and `text`, respectively.
-
-## <a id="exceptionhandling"></a>Exception Handling 
-
-You can catch and handle an exception in the HAWQ backend just like any other exception. The backend `ErrorData` structure is exposed as a property in a class called `org.postgresql.pljava.ServerException` (derived from `java.sql.SQLException`) and the Java try/catch mechanism is synchronized with the backend mechanism.
-
-**Important:** When the backend has generated an exception, you cannot continue executing backend functions until your function has returned and the error has been propagated, unless you have used a savepoint. When a savepoint is rolled back, the exceptional condition is reset and you can continue your execution.
-
-## <a id="savepoints"></a>Savepoints 
-
-HAWQ savepoints are exposed using the `java.sql.Connection` interface. Two restrictions apply.
-
-- A savepoint must be rolled back or released in the function where it was set.
-- A savepoint must not outlive the function where it was set.
-
-## <a id="logging"></a>Logging 
-
-PL/Java uses the standard Java Logger. Hence, you can write things like:
-
-```java
-Logger.getAnonymousLogger().info("Time is " + new Date(System.currentTimeMillis()));
-```
-
-At present, the logger uses a handler that maps the current state of the HAWQ configuration setting `log_min_messages` to a valid Logger level and that outputs all messages using the HAWQ backend function `elog()`.
-
-**Note:** The `log_min_messages` setting is read from the database the first time a PL/Java function in a session is executed. On the Java side, the setting does not change after the first PL/Java function execution in a specific session until the HAWQ session that is working with PL/Java is restarted.
-
-The following mappings apply between the Logger levels and the HAWQ backend levels.
-
-***Table 2: PL/Java Logging Levels Mappings***
-
-| java.util.logging.Level | HAWQ Level |
-|-------------------------|------------|
-| SEVERE | ERROR |
-| WARNING | WARNING |
-| CONFIG | LOG |
-| INFO | INFO |
-| FINE | DEBUG1 |
-| FINER | DEBUG2 |
-| FINEST | DEBUG3 |
-
-## <a id="security"></a>Security 
-
-This section describes security aspects of using PL/Java.
-
-### <a id="installation"></a>Installation 
-
-Only a database superuser can install PL/Java. The PL/Java utility functions are installed using SECURITY DEFINER so that they execute with the access permissions that were granted to the creator of the functions.
-
-### <a id="trustedlang"></a>Trusted Language 
-
-PL/Java is a trusted language. The trusted PL/Java language has no access to the file system, as stipulated by the PostgreSQL definition of a trusted language. Any database user can create and access functions in a trusted language.
-
-PL/Java also installs a language handler for the language `javau`. This version is not trusted and only a superuser can create new functions that use it. Any user can call the functions.
-
-
-## <a id="pljavaexample"></a>Example 
-
-The following simple Java example creates a JAR file that contains a single method and runs the method.
-
-<p class="note"><b>Note:</b> The example requires Java SDK to compile the Java file.</p>
-
-The following method returns a substring.
-
-```java
-public class Example
-{
-    public static String substring(String text, int beginIndex,
-      int endIndex)
-    {
-        return text.substring(beginIndex, endIndex);
-    }
-}
-```
-
-Enter the Java code in a text file named `Example.java`.
-
-Contents of the file `manifest.txt`:
-
-```plaintext
-Manifest-Version: 1.0
-Main-Class: Example
-Specification-Title: "Example"
-Specification-Version: "1.0"
-Created-By: 1.6.0_35-b10-428-11M3811
-Build-Date: 01../2013 10:09 AM
-```
-
-Compile the Java code:
-
-```shell
-$ javac *.java
-```
-
-Create a JAR archive named `analytics.jar` that contains the class file and the manifest file in the JAR:
-
-```shell
-$ jar cfm analytics.jar manifest.txt *.class
-```
-
-Upload the JAR file to the HAWQ master host.
-
-Run the `hawq scp` utility to copy the JAR file to the HAWQ Java directory. Use the `-f` option to specify the file that contains a list of the master and segment hosts:
-
-```shell
-$ hawq scp -f hawq_hosts analytics.jar =:/usr/local/hawq/lib/postgresql/java/
-```
-
-Add the `analytics.jar` JAR file to the `pljava_classpath` configuration parameter. Refer to [Setting PL/Java Configuration Parameters](#setting_serverconfigparams) for the specific procedure.
-
-From the `psql` subsystem, run the following command to show the installed JAR files:
-
-``` sql
-=> SHOW pljava_classpath;
-```
-
-The following SQL commands create a table and define a Java function to test the method in the JAR file:
-
-```sql
-=> CREATE TABLE temp (a varchar) DISTRIBUTED randomly; 
-=> INSERT INTO temp values ('my string'); 
---Example function 
-=> CREATE OR REPLACE FUNCTION java_substring(varchar, int, int) 
-     RETURNS varchar AS 'Example.substring' 
-   LANGUAGE java; 
---Example execution 
-=> SELECT java_substring(a, 1, 5) FROM temp;
-```
-
-If you add these SQL commands to a file named `mysample.sql`, you can run the commands from the `psql` subsystem using the `\i` meta-command:
-
-``` sql
-=> \i mysample.sql 
-```
-
-The output is similar to this:
-
-```shell
-java_substring
-----------------
- y st
-(1 row)
-```
\ No newline at end of file
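
The PL/Java text removed above advises producing one row at a time rather than building the whole set first, while its `Bar.getNames` example fills a list up front. A minimal sketch of the lazy alternative follows, assuming only the standard `java.util.Iterator` interface. The class `LazyNames`, its row count, and the `name-N` values are hypothetical and are not part of PL/Java or HAWQ.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only; not part of PL/Java or the HAWQ docs.
class LazyNames implements Iterator<String>
{
    private final int m_count;   // total number of rows to produce
    private int m_next = 0;      // index of the next row

    LazyNames(int count)
    {
        m_count = count;
    }

    public boolean hasNext()
    {
        return m_next < m_count;
    }

    public String next()
    {
        if (!hasNext())
            throw new NoSuchElementException();
        // The row value is built only when the caller asks for it.
        return "name-" + (m_next++);
    }

    public void remove()
    {
        throw new UnsupportedOperationException();
    }

    // Static entry point in the style that a SETOF varchar function maps to.
    public static Iterator<String> getNames()
    {
        return new LazyNames(4);
    }
}
```

Because each value is created inside `next()`, memory use stays constant no matter how many rows the caller consumes, and nothing is computed for rows the caller never requests.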

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/using_plperl.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/using_plperl.html.md.erb b/plext/using_plperl.html.md.erb
deleted file mode 100644
index d6ffa04..0000000
--- a/plext/using_plperl.html.md.erb
+++ /dev/null
@@ -1,27 +0,0 @@
----
-title: Using PL/Perl
----
-
-This section contains an overview of the HAWQ PL/Perl language extension.
-
-## <a id="enableplperl"></a>Enabling PL/Perl
-
-If PL/Perl is enabled during HAWQ build time, HAWQ installs the PL/Perl language extension automatically. To use PL/Perl, you must enable it on specific databases.
-
-On every database where you want to enable PL/Perl, connect to the database using the psql client.
-
-``` shell
-$ psql -d <dbname>
-```
-
-Replace \<dbname\> with the name of the target database.
-
-Then, run the following SQL command:
-
-``` sql
-dbname=# CREATE LANGUAGE plperl;
-```
-
-## <a id="references"></a>References 
-
-For more information on using PL/Perl, see the PostgreSQL PL/Perl documentation at [https://www.postgresql.org/docs/8.2/static/plperl.html](https://www.postgresql.org/docs/8.2/static/plperl.html).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/using_plpgsql.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/using_plpgsql.html.md.erb b/plext/using_plpgsql.html.md.erb
deleted file mode 100644
index 3661e9b..0000000
--- a/plext/using_plpgsql.html.md.erb
+++ /dev/null
@@ -1,142 +0,0 @@
----
-title: Using PL/pgSQL in HAWQ
----
-
-SQL is the language that most relational databases use as their query language. It is portable and easy to learn. But every SQL statement must be executed individually by the database server. 
-
-PL/pgSQL is a loadable procedural language. PL/pgSQL can do the following:
-
--   create functions
--   add control structures to the SQL language
--   perform complex computations
--   inherit all user-defined types, functions, and operators
--   be trusted by the server
-
-You can use functions created with PL/pgSQL anywhere that built-in functions can be used. For example, it is possible to create complex conditional computation functions and later use them to define operators or use them in index expressions.
-
-Every SQL statement must be executed individually by the database server. Your client application must send each query to the database server, wait for it to be processed, receive and process the results, do some computation, then send further queries to the server. This requires interprocess communication and incurs network overhead if your client is on a different machine than the database server.
-
-With PL/pgSQL, you can group a block of computation and a series of queries inside the database server, thus having the power of a procedural language and the ease of use of SQL, but with considerable savings of client/server communication overhead.
-
--   Extra round trips between client and server are eliminated
--   Intermediate results that the client does not need do not have to be marshaled or transferred between server and client
--   Multiple rounds of query parsing can be avoided
-
-This can result in a considerable performance increase as compared to an application that does not use stored functions.
-
-PL/pgSQL supports all the data types, operators, and functions of SQL.
-
-**Note:**  PL/pgSQL is automatically installed and registered in all HAWQ databases.
-
-## <a id="supportedargumentandresultdatatypes"></a>Supported Data Types for Arguments and Results 
-
-Functions written in PL/pgSQL accept as arguments any scalar or array data type supported by the server, and they can return a result containing this data type. They can also accept or return any composite type (row type) specified by name. It is also possible to declare a PL/pgSQL function as returning record, which means that the result is a row type whose columns are determined by specification in the calling query. See <a href="#tablefunctions" class="xref">Table Functions</a>.
-
-PL/pgSQL functions can be declared to accept a variable number of arguments by using the VARIADIC marker. This works exactly the same way as for SQL functions. See <a href="#sqlfunctionswithvariablenumbersofarguments" class="xref">SQL Functions with Variable Numbers of Arguments</a>.
-
-PL/pgSQL functions can also be declared to accept and return the polymorphic types anyelement, anyarray, anynonarray, and anyenum. The actual data types handled by a polymorphic function can vary from call to call, as discussed in <a href="http://www.postgresql.org/docs/8.4/static/extend-type-system.html#EXTEND-TYPES-POLYMORPHIC" class="xref">Section 34.2.5</a>. An example is shown in <a href="http://www.postgresql.org/docs/8.4/static/plpgsql-declarations.html#PLPGSQL-DECLARATION-ALIASES" class="xref">Section 38.3.1</a>.
-
-PL/pgSQL functions can also be declared to return a "set" (or table) of any data type that can be returned as a single instance. Such a function generates its output by executing RETURN NEXT for each desired element of the result set, or by using RETURN QUERY to output the result of evaluating a query.
-
-Finally, a PL/pgSQL function can be declared to return void if it has no useful return value.
-
-PL/pgSQL functions can also be declared with output parameters in place of an explicit specification of the return type. This does not add any fundamental capability to the language, but it is often convenient, especially for returning multiple values. The RETURNS TABLE notation can also be used in place of RETURNS SETOF.
-
-This topic describes the following PL/pgSQL concepts:
-
--   [Table Functions](#tablefunctions)
--   [SQL Functions with Variable Numbers of Arguments](#sqlfunctionswithvariablenumbersofarguments)
--   [Polymorphic Types](#polymorphictypes)
-
-
-## <a id="tablefunctions"></a>Table Functions 
-
-
-Table functions are functions that produce a set of rows, made up of either base data types (scalar types) or composite data types (table rows). They are used like a table, view, or subquery in the FROM clause of a query. Columns returned by table functions can be included in SELECT, JOIN, or WHERE clauses in the same manner as a table, view, or subquery column.
-
-If a table function returns a base data type, the single result column name matches the function name. If the function returns a composite type, the result columns get the same names as the individual attributes of the type.
-
-A table function can be aliased in the FROM clause, but it also can be left unaliased. If a function is used in the FROM clause with no alias, the function name is used as the resulting table name.
-
-Some examples:
-
-```sql
-CREATE TABLE foo (fooid int, foosubid int, fooname text);
-
-CREATE FUNCTION getfoo(int) RETURNS SETOF foo AS $$
-    SELECT * FROM foo WHERE fooid = $1;
-$$ LANGUAGE SQL;
-
-SELECT * FROM getfoo(1) AS t1;
-
-SELECT * FROM foo
-    WHERE foosubid IN (
-                        SELECT foosubid
-                        FROM getfoo(foo.fooid) z
-                        WHERE z.fooid = foo.fooid
-                      );
-
-CREATE VIEW vw_getfoo AS SELECT * FROM getfoo(1);
-
-SELECT * FROM vw_getfoo;
-```
-
-In some cases, it is useful to define table functions that can return different column sets depending on how they are invoked. To support this, the table function can be declared as returning the pseudotype record. When such a function is used in a query, the expected row structure must be specified in the query itself, so that the system can know how to parse and plan the query. Consider this example:
-
-```sql
-SELECT *
-    FROM dblink('dbname=mydb', 'SELECT proname, prosrc FROM pg_proc')
-      AS t1(proname name, prosrc text)
-    WHERE proname LIKE 'bytea%';
-```
-
-The `dblink` function executes a remote query (see `contrib/dblink`). It is declared to return `record` since it might be used for any kind of query. The actual column set must be specified in the calling query so that the parser knows, for example, what `*` should expand to.
-
-
-## <a id="sqlfunctionswithvariablenumbersofarguments"></a>SQL Functions with Variable Numbers of Arguments 
-
-SQL functions can be declared to accept variable numbers of arguments, so long as all the "optional" arguments are of the same data type. The optional arguments will be passed to the function as an array. The function is declared by marking the last parameter as VARIADIC; this parameter must be declared as being of an array type. For example:
-
-```sql
-CREATE FUNCTION mleast(VARIADIC numeric[]) RETURNS numeric AS $$
-    SELECT min($1[i]) FROM generate_subscripts($1, 1) g(i);
-$$ LANGUAGE SQL;
-
-SELECT mleast(10, -1, 5, 4.4);
- mleast 
---------
-     -1
-(1 row)
-```
-
-Effectively, all the actual arguments at or beyond the VARIADIC position are gathered up into a one-dimensional array, as if you had written
-
-```sql
-SELECT mleast(ARRAY[10, -1, 5, 4.4]);    -- doesn't work
-```
-
-You can't actually write that, though; or at least, it will not match this function definition. A parameter marked VARIADIC matches one or more occurrences of its element type, not of its own type.
-
-Sometimes it is useful to be able to pass an already-constructed array to a variadic function; this is particularly handy when one variadic function wants to pass on its array parameter to another one. You can do that by specifying VARIADIC in the call:
-
-```sql
-SELECT mleast(VARIADIC ARRAY[10, -1, 5, 4.4]);
-```
-
-This prevents expansion of the function's variadic parameter into its element type, thereby allowing the array argument value to match normally. VARIADIC can only be attached to the last actual argument of a function call.
-
-
-
-## <a id="polymorphictypes"></a>Polymorphic Types 
-
-Four pseudo-types of special interest are anyelement, anyarray, anynonarray, and anyenum, which are collectively called *polymorphic types*. Any function declared using these types is said to be a *polymorphic function*. A polymorphic function can operate on many different data types, with the specific data type(s) being determined by the data types actually passed to it in a particular call.
-
-Polymorphic arguments and results are tied to each other and are resolved to a specific data type when a query calling a polymorphic function is parsed. Each position (either argument or return value) declared as anyelement is allowed to have any specific actual data type, but in any given call they must all be the same actual type. Each position declared as anyarray can have any array data type, but similarly they must all be the same type. If there are positions declared anyarray and others declared anyelement, the actual array type in the anyarray positions must be an array whose elements are the same type appearing in the anyelement positions. anynonarray is treated exactly the same as anyelement, but adds the additional constraint that the actual type must not be an array type. anyenum is treated exactly the same as anyelement, but adds the additional constraint that the actual type must be an enum type.
-
-Thus, when more than one argument position is declared with a polymorphic type, the net effect is that only certain combinations of actual argument types are allowed. For example, a function declared as equal(anyelement, anyelement) will take any two input values, so long as they are of the same data type.
-
-When the return value of a function is declared as a polymorphic type, there must be at least one argument position that is also polymorphic, and the actual data type supplied as the argument determines the actual result type for that call. For example, if there were not already an array subscripting mechanism, one could define a function that implements subscripting as `subscript(anyarray, integer) returns anyelement`. This declaration constrains the actual first argument to be an array type, and allows the parser to infer the correct result type from the actual first argument's type. Another example is that a function declared as `f(anyarray) returns anyenum` will only accept arrays of `enum` types.
-
-Note that `anynonarray` and `anyenum` do not represent separate type variables; they are the same type as `anyelement`, just with an additional constraint. For example, declaring a function as `f(anyelement, anyenum)` is equivalent to declaring it as `f(anyenum, anyenum)`; both actual arguments have to be the same enum type.
-
-A variadic function, as described in <a href="#sqlfunctionswithvariablenumbersofarguments" class="xref">SQL Functions with Variable Numbers of Arguments</a>, can be polymorphic: this is accomplished by declaring its last parameter as `VARIADIC anyarray`. For purposes of argument matching and determining the actual result type, such a function behaves the same as if you had written the appropriate number of `anynonarray` parameters.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/plext/using_plpython.html.md.erb
----------------------------------------------------------------------
diff --git a/plext/using_plpython.html.md.erb b/plext/using_plpython.html.md.erb
deleted file mode 100644
index 063509a..0000000
--- a/plext/using_plpython.html.md.erb
+++ /dev/null
@@ -1,789 +0,0 @@
----
-title: Using PL/Python in HAWQ
----
-
-This section provides an overview of the HAWQ PL/Python procedural language extension.
-
-## <a id="abouthawqplpython"></a>About HAWQ PL/Python 
-
-PL/Python is embedded in your HAWQ product distribution or within your HAWQ build if you chose to enable it as a build option. 
-
-With the HAWQ PL/Python extension, you can write user-defined functions in Python that take advantage of Python features and modules, enabling you to quickly build robust HAWQ database applications.
-
-HAWQ uses the system Python installation.
-
-### <a id="hawqlimitations"></a>HAWQ PL/Python Limitations 
-
-- HAWQ does not support PL/Python trigger functions.
-- PL/Python is available only as a HAWQ untrusted language.
- 
-## <a id="enableplpython"></a>Enabling and Removing PL/Python Support 
-
-To use PL/Python in HAWQ, you must either install a binary version of HAWQ that includes PL/Python or specify PL/Python as a build option when you compile HAWQ from source.
-
-You must register the PL/Python language with a database before you can create and execute a PL/Python UDF on that database. You must be a database superuser to register and remove new languages in HAWQ databases.
-
-On every database to which you want to install and enable PL/Python:
-
-1. Connect to the database using the `psql` client:
-
-    ``` shell
-    gpadmin@hawq-node$ psql -d <dbname>
-    ```
-
-    Replace \<dbname\> with the name of the target database.
-
-2. Run the following SQL command to register the PL/Python procedural language:
-
-    ``` sql
-    dbname=# CREATE LANGUAGE plpythonu;
-    ```
-
-    **Note**: `plpythonu` is installed as an *untrusted* language; it offers no way of restricting what you can program in UDFs created with the language. Creating and executing PL/Python UDFs is permitted only by database superusers and other database users explicitly `GRANT`ed the permissions.
-
-To remove support for `plpythonu` from a database, run the following SQL command; you must be a database superuser to remove a registered procedural language:
-
-``` sql
-dbname=# DROP LANGUAGE plpythonu;
-```
-
-## <a id="developfunctions"></a>Developing Functions with PL/Python 
-
-PL/Python functions are defined using the standard SQL [CREATE FUNCTION](../reference/sql/CREATE-FUNCTION.html) syntax.
-
-The body of a PL/Python user-defined function is a Python script. When the function is called, its arguments are passed as elements of the array `args[]`. You can also pass named arguments as ordinary variables to the Python script. 
-
-PL/Python function results are returned with a `return` statement, or with a `yield` statement in the case of a function that returns a result set.
-
-The following PL/Python function computes and returns the maximum of two integers:
-
-``` sql
-=# CREATE FUNCTION mypymax (a integer, b integer)
-     RETURNS integer
-   AS $$
-     if (a is None) or (b is None):
-       return None
-     if a > b:
-       return a
-     return b
-   $$ LANGUAGE plpythonu;
-```
-
-To execute the `mypymax` function:
-
-``` sql
-=# SELECT mypymax(5, 7);
- mypymax 
----------
-       7
-(1 row)
-```
-
-Adding the `STRICT` keyword to the `LANGUAGE` subclause instructs HAWQ to return null when any of the input arguments are null. When created as `STRICT`, the function itself need not perform null checks.
-
-The following example uses an unnamed argument, the built-in Python `max()` function, and the `STRICT` keyword to create a UDF named `mypymax2`:
-
-``` sql
-=# CREATE FUNCTION mypymax2 (a integer, integer)
-     RETURNS integer AS $$ 
-   return max(a, args[1]) 
-   $$ LANGUAGE plpythonu STRICT;
-=# SELECT mypymax2(5, 3);
- mypymax2
-----------
-        5
-(1 row)
-=# SELECT mypymax2(5, null);
- mypymax2
-----------
-       
-(1 row)
-```
-
-## <a id="example_createtbl"></a>Creating the Sample Data
-
-Perform the following steps to create, and insert data into, a simple table. This table will be used in later exercises.
-
-1. Create a database named `testdb`:
-
-    ``` shell
-    gpadmin@hawq-node$ createdb testdb
-    ```
-
-2. Create a table named `sales`:
-
-    ``` shell
-    gpadmin@hawq-node$ psql -d testdb
-    ```
-    ``` sql
-    testdb=> CREATE TABLE sales (id int, year int, qtr int, day int, region text)
-               DISTRIBUTED BY (id);
-    ```
-
-3. Insert data into the table:
-
-    ``` sql
-    testdb=> INSERT INTO sales VALUES
-     (1, 2014, 1,1, 'usa'),
-     (2, 2002, 2,2, 'europe'),
-     (3, 2014, 3,3, 'asia'),
-     (4, 2014, 4,4, 'usa'),
-     (5, 2014, 1,5, 'europe'),
-     (6, 2014, 2,6, 'asia'),
-     (7, 2002, 3,7, 'usa') ;
-    ```
-
-## <a id="pymod_intro"></a>Python Modules 
-A Python module is a text file containing Python statements and definitions. Python modules are named, with the file name for a module following the `<python-module-name>.py` naming convention.
-
-Should you need to build a Python module, ensure that the appropriate software is installed on the build system. Also be sure that you are building for the correct deployment architecture, i.e. 64-bit.
-
-### <a id="pymod_intro_hawq"></a>HAWQ Considerations 
-
-When installing a Python module in HAWQ, you must add the module to all segment nodes in the cluster. You must also add all Python modules to any new segment hosts when you expand your HAWQ cluster.
-
-PL/Python supports the built-in HAWQ Python module named `plpy`.  You can also install 3rd party Python modules.
-
-
-## <a id="modules_plpy"></a>plpy Module 
-
-The HAWQ PL/Python procedural language extension automatically imports the Python module `plpy`. `plpy` implements functions to execute SQL queries and prepare execution plans for queries.  The `plpy` module also includes functions to manage errors and messages.
-   
-### <a id="executepreparesql"></a>Executing and Preparing SQL Queries 
-
-Use the PL/Python `plpy` module `plpy.execute()` function to execute a SQL query. Use the `plpy.prepare()` function to prepare an execution plan for a query. Preparing the execution plan for a query is useful if you want to run the query from multiple Python functions.
-
-#### <a id="plpyexecute"></a>plpy.execute() 
-
-Invoking `plpy.execute()` with a query string and an optional limit argument runs the query, returning the result in a Python result object. This result object:
-
-- emulates a list or dictionary object
-- returns rows that can be accessed by row number and column name; row numbering starts with 0 (zero)
-- can be modified
-- includes an `nrows()` method that returns the number of rows returned by the query
-- includes a `status()` method that returns the `SPI_execute()` return value
-
-For example, the following Python statement, when present in a PL/Python user-defined function, executes a `SELECT * FROM my_table` query:
-
-``` python
-rv = plpy.execute("SELECT * FROM my_table", 3)
-```
-
-As instructed by the limit argument `3`, the `plpy.execute` function will return up to 3 rows from `my_table`. The result set is stored in the `rv` object.
-
-Access specific columns in the table by name. For example, if `my_table` has a column named `my_column`:
-
-``` python
-my_col_data = rv[i]["my_column"]
-```
-
-You specified that the function return a maximum of 3 rows in the `plpy.execute()` command above. As such, the index `i` used to access the result value `rv` must specify an integer between 0 and 2, inclusive.
-
-##### <a id="plpyexecute_example"></a>Example: plpy.execute()
-
-Example: Use `plpy.execute()` to run a similar query on the `sales` table you created in an earlier section:
-
-1. Define a PL/Python UDF that executes a query to return at most 5 rows from the `sales` table:
-
-    ``` sql
-    =# CREATE OR REPLACE FUNCTION mypytest(a integer) 
-         RETURNS text 
-       AS $$ 
-         rv = plpy.execute("SELECT * FROM sales ORDER BY id", 5)
-         region = rv[a-1]["region"]
-         return region
-       $$ LANGUAGE plpythonu;
-    ```
-
-    When executed, this UDF returns the `region` value from the `id` identified by the input value `a`. Since row numbering of the result set starts at 0, you must access the result set with index `a - 1`. 
-    
-    Specifying the `ORDER BY id` clause in the `SELECT` statement ensures that subsequent invocations of `mypytest` with the same input argument will return identical result sets.
-
-2. Run `mypytest` with an argument identifying `id` `3`:
-
-    ```sql
-    =# SELECT mypytest(3);
-     mypytest 
-    ----------
-     asia
-    (1 row)
-    ```
-    
-    Recall that row numbering starts from 0 in a Python result set. Because `mypytest` accesses the result set with index `a - 1` and the query returns at most 5 rows, the valid input argument for the `mypytest` function is an integer between 1 and 5, inclusive.
-
-    The query returns the `region` from the row with `id = 3`, `asia`.
-    
-Note: This example demonstrates some of the concepts discussed previously. It may not be the ideal way to return a specific column value.
-
-#### <a id="plpyprepare"></a>plpy.prepare() 
-
-The function `plpy.prepare()` prepares the execution plan for a query. Preparing the execution plan for a query is useful if you plan to run the query from multiple Python functions.
-
-You invoke `plpy.prepare()` with a query string. Also include a list of parameter types if you are using parameter references in the query. For example, the following statement in a PL/Python user-defined function returns the execution plan for a query:
-
-``` python
-plan = plpy.prepare("SELECT * FROM sales WHERE region = $1 ORDER BY id", [ "text" ])
-```
-
-The string `text` identifies the data type of the variable `$1`. 
-
-After preparing an execution plan, you use the function `plpy.execute()` to run it.  For example:
-
-``` python
-rv = plpy.execute(plan, [ "usa" ])
-```
-
-When executed, `rv` will include all rows in the `sales` table where `region = 'usa'`.
-
-Read on for a description of how to pass data between PL/Python function calls.
-
-##### <a id="plpyprepare_dictionaries"></a>Saving Execution Plans
-
-When you prepare an execution plan using the PL/Python module, the plan is automatically saved. See the [Postgres Server Programming Interface (SPI)](http://www.postgresql.org/docs/8.2/static/spi.html) documentation for information about execution plans.
-
-To make effective use of saved plans across function calls, you use one of the Python persistent storage dictionaries, SD or GD.
-
-The SD dictionary is available to store data between calls to the same function. This variable is private static data. The global dictionary GD is public data, and is available to all Python functions within a session. *Use GD with care*.
-
-Each function gets its own execution environment in the Python interpreter, so that global data and function arguments from `myfunc1` are not available to `myfunc2`. The exception is the data in the GD dictionary, as mentioned previously.
-
-This example saves an execution plan to the SD dictionary and then executes the plan:
-
-```sql
-=# CREATE FUNCTION usesavedplan() RETURNS text AS $$
-     select1plan = plpy.prepare("SELECT region FROM sales WHERE id=1")
-     SD["s1plan"] = select1plan
-     # other function processing
-     # execute the saved plan
-     rv = plpy.execute(SD["s1plan"])
-     return rv[0]["region"]
-   $$ LANGUAGE plpythonu;
-=# SELECT usesavedplan();
-```
-
-##### <a id="plpyprepare_example"></a>Example: plpy.prepare()
-
-Example: Use `plpy.prepare()` and `plpy.execute()` to prepare and run an execution plan using the GD dictionary:
-
-1. Define a PL/Python UDF to prepare and save an execution plan to the GD. Also return the name of the plan:
-
-    ``` sql
-    =# CREATE OR REPLACE FUNCTION mypy_prepplan() 
-         RETURNS text 
-       AS $$ 
-         plan = plpy.prepare("SELECT * FROM sales WHERE region = $1 ORDER BY id", [ "text" ])
-         GD["getregionplan"] = plan
-         return "getregionplan"
-       $$ LANGUAGE plpythonu;
-    ```
-
-    This UDF, when run, will return the name (key) of the execution plan generated from the `plpy.prepare()` call.
-
-2. Define a PL/Python UDF to run the execution plan; this function takes the plan name and `region` name as inputs:
-
-    ``` sql
-    =# CREATE OR REPLACE FUNCTION mypy_execplan(planname text, regionname text)
-         RETURNS integer 
-       AS $$ 
-         rv = plpy.execute(GD[planname], [ regionname ], 5)
-         year = rv[0]["year"]
-         return year
-       $$ LANGUAGE plpythonu STRICT;
-    ```
-
-    This UDF executes the `planname` plan that was previously saved to the GD. You will call `mypy_execplan()` with the `planname` returned from the `mypy_prepplan()` call.
-
-3. Execute the `mypy_prepplan()` and `mypy_execplan()` UDFs, passing `region` `usa`:
-
-    ``` sql
-    =# SELECT mypy_execplan( mypy_prepplan(), 'usa' );
-     mypy_execplan
-    ---------------
-         2014
-    (1 row)
-    ```
-
-### <a id="pythonerrors"></a>Handling Python Errors and Messages 
-
-The `plpy` module implements the following message- and error-related functions, each of which takes a message string as an argument:
-
-- `plpy.debug(msg)`
-- `plpy.log(msg)`
-- `plpy.info(msg)`
-- `plpy.notice(msg)`
-- `plpy.warning(msg)`
-- `plpy.error(msg)`
-- `plpy.fatal(msg)`
-
-`plpy.error()` and `plpy.fatal()` raise a Python exception which, if uncaught, propagates out to the calling query, possibly aborting the current transaction or subtransaction. `raise plpy.ERROR(msg)` and `raise plpy.FATAL(msg)` are equivalent to calling `plpy.error()` and `plpy.fatal()`, respectively. Use the other message functions to generate messages of different priority levels.
-
-Messages may be reported to the client and/or written to the HAWQ server log file.  The HAWQ server configuration parameters [`log_min_messages`](../reference/guc/parameter_definitions.html#log_min_messages) and [`client_min_messages`](../reference/guc/parameter_definitions.html#client_min_messages) control where messages are reported.
-
-#### <a id="plpymessages_example"></a>Example: Generating Messages
-
-In this example, you will create a PL/Python UDF that includes some debug log messages. You will also configure your `psql` session to enable debug-level client logging.
-
-1. Define a PL/Python UDF that executes a query that will return at most 5 rows from the `sales` table. Invoke the `plpy.debug()` method to display some additional information:
-
-    ``` sql
-    =# CREATE OR REPLACE FUNCTION mypytest_debug(a integer) 
-         RETURNS text 
-       AS $$ 
-         plpy.debug('mypytest_debug executing query:  SELECT * FROM sales ORDER BY id')
-         rv = plpy.execute("SELECT * FROM sales ORDER BY id", 5)
-         plpy.debug('mypytest_debug: query returned ' + str(rv.nrows()) + ' rows')
-         region = rv[a]["region"]
-         return region
-       $$ LANGUAGE plpythonu;
-    ```
-
-2. Execute the `mypytest_debug()` UDF, passing the integer `2` as an argument:
-
-    ```sql
-    =# SELECT mypytest_debug(2);
-     mypytest_debug 
-    ----------------
-     asia
-    (1 row)
-    ```
-
-3. Enable `DEBUG2` level client logging:
-
-    ``` sql
-    =# SET client_min_messages=DEBUG2;
-    ```
-    
-4. Execute the `mypytest_debug()` UDF again:
-
-    ```sql
-    =# SELECT mypytest_debug(2);
-    ...
-    DEBUG2:  mypytest_debug executing query:  SELECT * FROM sales ORDER BY id
-    ...
-    DEBUG2:  mypytest_debug: query returned 5 rows
-    ...
-    ```
-
-    Debug output is very verbose. You will need to scan a lot of output to find the `mypytest_debug` messages. *Hint*: look near both the start and the end of the output.
-    
-5. Turn off client-level debug logging:
-
-    ```sql
-    =# SET client_min_messages=NOTICE;
-    ```
-
-## <a id="pythonmodules-3rdparty"></a>3rd-Party Python Modules 
-
-PL/Python supports the installation and use of 3rd-party Python modules. This section includes examples for installing the `setuptools` and NumPy Python modules.
-
-**Note**: You must have superuser privileges to install Python modules to the system Python directories.
-
-### <a id="simpleinstall"></a>Example: Installing setuptools 
-
-In this example, you will manually install the Python `setuptools` module from the Python Package Index repository. `setuptools` enables you to easily download, build, install, upgrade, and uninstall Python packages.
-
-You will first build the module from the downloaded package, installing it on a single host. You will then build and install the module on all segment nodes in your HAWQ cluster.
-
-1. Download the `setuptools` module package from the Python Package Index site. For example, run this `wget` command on a HAWQ node as the `gpadmin` user:
-
-    ``` shell
-    $ ssh gpadmin@<hawq-node>
-    gpadmin@hawq-node$ . /usr/local/hawq/greenplum_path.sh
-    gpadmin@hawq-node$ mkdir plpython_pkgs
-    gpadmin@hawq-node$ cd plpython_pkgs
-    gpadmin@hawq-node$ export PLPYPKGDIR=`pwd`
-    gpadmin@hawq-node$ wget --no-check-certificate https://pypi.python.org/packages/source/s/setuptools/setuptools-18.4.tar.gz
-    ```
-
-2. Extract the files from the `tar.gz` package:
-
-    ``` shell
-    gpadmin@hawq-node$ tar -xzvf setuptools-18.4.tar.gz
-    ```
-
-3. Run the Python scripts to build and install the Python package; you must have superuser privileges to install Python modules to the system Python installation:
-
-    ``` shell
-    gpadmin@hawq-node$ cd setuptools-18.4
-    gpadmin@hawq-node$ python setup.py build 
-    gpadmin@hawq-node$ sudo python setup.py install
-    ```
-
-4. Run the following command to verify the module is available to Python:
-
-    ``` shell
-    gpadmin@hawq-node$ python -c "import setuptools"
-    ```
-    
-    If no error is returned, the `setuptools` module was successfully imported.
-
-5. The `setuptools` package installs the `easy_install` utility. This utility enables you to install Python packages from the Python Package Index repository. For example, this command installs the Python `pip` utility from the Python Package Index site:
-
-    ``` shell
-    gpadmin@hawq-node$ sudo easy_install pip
-    ```
-
-6. Copy the `setuptools` package to all HAWQ nodes in your cluster. For example, this command copies the `tar.gz` file from the current host to the host systems listed in the file `hawq-hosts`:
-
-    ``` shell
-    gpadmin@hawq-node$ cd $PLPYPKGDIR
-    gpadmin@hawq-node$ hawq scp -f hawq-hosts setuptools-18.4.tar.gz =:/home/gpadmin
-    ```
-
-7. Run the commands to build, install, and test the `setuptools` package you just copied to all hosts in your HAWQ cluster. For example:
-
-    ``` shell
-    gpadmin@hawq-node$ hawq ssh -f hawq-hosts
-    >>> mkdir plpython_pkgs
-    >>> cd plpython_pkgs
-    >>> tar -xzvf ../setuptools-18.4.tar.gz
-    >>> cd setuptools-18.4
-    >>> python setup.py build 
-    >>> sudo python setup.py install
-    >>> python -c "import setuptools"
-    >>> exit
-    ```
-
-### <a id="complexinstall"></a>Example: Installing NumPy 
-
-In this example, you will build and install the Python module NumPy. NumPy is a module for scientific computing with Python. For additional information about NumPy, refer to [http://www.numpy.org/](http://www.numpy.org/).
-
-This example assumes `yum` is installed on all HAWQ segment nodes and that the `gpadmin` user is a member of `sudoers` with `root` privileges on the nodes.
-
-#### <a id="complexinstall_prereq"></a>Prerequisites
-Building the NumPy package requires the following software:
-
-- OpenBLAS libraries - an open source implementation of BLAS (Basic Linear Algebra Subprograms)
-- Python development packages - python-devel
-- gcc compilers - gcc, gcc-gfortran, and gcc-c++
-
-Perform the following steps to set up the OpenBLAS compilation environment on each HAWQ node:
-
-1. Use `yum` to install the gcc compilers and the Python development package from system repositories. The compilers are required on all hosts where you compile OpenBLAS. For example:
-
-	``` shell
-	root@hawq-node$ yum -y install gcc gcc-gfortran gcc-c++ python-devel
-	```
-
-2. (Optional) If you cannot install the correct compiler versions with `yum`, you can build and install the gcc compilers, including `gfortran`, from source. Refer to [Building gfortran from Source](https://gcc.gnu.org/wiki/GFortranBinaries#FromSource) for `gfortran` build and install information.
-
-3. Create a symbolic link to `g++`, naming it `gxx`:
-
-	``` bash
-	root@hawq-node$ ln -s /usr/bin/g++ /usr/bin/gxx
-	```
-
-4. You may also need to create symbolic links to any libraries that have different versions available; for example, linking `libppl_c.so.4` to `libppl_c.so.2`.
-
-5. You can use the `hawq scp` utility to copy files to HAWQ hosts and the `hawq ssh` utility to run commands on those hosts.
-
-
-#### <a id="complexinstall_downdist"></a>Obtaining Packages
-
-Perform the following steps to download and distribute the OpenBLAS and NumPy source packages:
-
-1. Download the OpenBLAS and NumPy source files. For example, these `wget` commands download `tar.gz` files into a `packages` directory in the current working directory:
-
-    ``` shell
-    $ ssh gpadmin@<hawq-node>
-    gpadmin@hawq-node$ wget --directory-prefix=packages http://github.com/xianyi/OpenBLAS/tarball/v0.2.8
-    gpadmin@hawq-node$ wget --directory-prefix=packages http://sourceforge.net/projects/numpy/files/NumPy/1.8.0/numpy-1.8.0.tar.gz/download
-    ```
-
-2. Distribute the software to all nodes in your HAWQ cluster. For example, if you downloaded the software to `/home/gpadmin/packages`, these commands create the `packages` directory on all nodes and copy the software to the nodes listed in the `hawq-hosts` file:
-
-    ``` shell
-    gpadmin@hawq-node$ hawq ssh -f hawq-hosts mkdir packages 
-    gpadmin@hawq-node$ hawq scp -f hawq-hosts packages/* =:/home/gpadmin/packages
-    ```
-
-#### <a id="buildopenblas"></a>Build and Install OpenBLAS Libraries 
-
-Before building and installing the NumPy module, you must first build and install the OpenBLAS libraries. This section describes how to build and install the libraries on a single HAWQ node.
-
-1. Extract the OpenBLAS files from the downloaded tar file:
-
-	``` shell
-	$ ssh gpadmin@<hawq-node>
-	gpadmin@hawq-node$ cd packages
-	gpadmin@hawq-node$ tar xzf v0.2.8 -C /home/gpadmin/packages
-	gpadmin@hawq-node$ mv /home/gpadmin/packages/xianyi-OpenBLAS-9c51cdf /home/gpadmin/packages/OpenBLAS
-	```
-	
-	These commands extract the OpenBLAS tar file and simplify the unpacked directory name.
-
-2. Compile OpenBLAS. You must set the `LIBRARY_PATH` environment variable to the current `$LD_LIBRARY_PATH`. For example:
-
-	``` shell
-	gpadmin@hawq-node$ cd OpenBLAS
-	gpadmin@hawq-node$ export LIBRARY_PATH=$LD_LIBRARY_PATH
-	gpadmin@hawq-node$ make FC=gfortran USE_THREAD=0 TARGET=SANDYBRIDGE
-	```
-	
-	Replace the `TARGET` argument with the target appropriate for your hardware. The `TargetList.txt` file identifies the list of supported OpenBLAS targets.
-	
-	Compiling OpenBLAS may take some time.
-
-3. Install the OpenBLAS libraries in `/usr/local` and then change the owner of the files to `gpadmin`. You must have `root` privileges. For example:
-
-	``` shell
-	gpadmin@hawq-node$ sudo make PREFIX=/usr/local install
-	gpadmin@hawq-node$ sudo ldconfig
-	gpadmin@hawq-node$ sudo chown -R gpadmin /usr/local/lib
-	```
-
-	The following libraries are installed to `/usr/local/lib`, along with symbolic links:
-
-	``` shell
-	gpadmin@hawq-node$ ls -l /usr/local/lib
-	    ...
-	    libopenblas.a -> libopenblas_sandybridge-r0.2.8.a
-	    libopenblas_sandybridge-r0.2.8.a
-	    libopenblas_sandybridge-r0.2.8.so
-	    libopenblas.so -> libopenblas_sandybridge-r0.2.8.so
-	    libopenblas.so.0 -> libopenblas_sandybridge-r0.2.8.so
-	    ...
-	```
-
-4. Install the OpenBLAS libraries on all nodes in your HAWQ cluster. You can use the `hawq ssh` utility to similarly build and install the OpenBLAS libraries on each of the nodes. 
-
-    Or, you may choose to copy the OpenBLAS libraries you just built to all of the HAWQ cluster nodes. For example, these `hawq ssh` and `hawq scp` commands install prerequisite packages, and copy and install the OpenBLAS libraries on the hosts listed in the `hawq-hosts` file.
-
-    ``` shell
-    $ hawq ssh -f hawq-hosts -e 'sudo yum -y install gcc gcc-gfortran gcc-c++ python-devel'
-    $ hawq ssh -f hawq-hosts -e 'sudo ln -s /usr/bin/g++ /usr/bin/gxx'
-    $ hawq ssh -f hawq-hosts -e 'sudo chown gpadmin /usr/local/lib'
-    $ hawq scp -f hawq-hosts /usr/local/lib/libopen*sandy* =:/usr/local/lib
-    ```
-    ``` shell
-    $ hawq ssh -f hawq-hosts
-    >>> cd /usr/local/lib
-    >>> ln -s libopenblas_sandybridge-r0.2.8.a libopenblas.a
-    >>> ln -s libopenblas_sandybridge-r0.2.8.so libopenblas.so
-    >>> ln -s libopenblas_sandybridge-r0.2.8.so libopenblas.so.0
-    >>> sudo ldconfig
-    ```
-
-#### Build and Install NumPy <a name="buildinstallnumpy"></a>
-
-After you have installed the OpenBLAS libraries, you can build and install the NumPy module. These steps install the NumPy module on a single host. You can use the `hawq ssh` utility to build and install the NumPy module on multiple hosts.
-
-1. Extract the NumPy module source files:
-
-	``` shell
-	gpadmin@hawq-node$ cd /home/gpadmin/packages
-	gpadmin@hawq-node$ tar xzf numpy-1.8.0.tar.gz
-	```
-	
-	Unpacking the `numpy-1.8.0.tar.gz` file creates a directory named `numpy-1.8.0` in the current directory.
-
-2. Set up the environment for building and installing NumPy:
-
-	``` shell
-	gpadmin@hawq-node$ export BLAS=/usr/local/lib/libopenblas.a
-	gpadmin@hawq-node$ export LAPACK=/usr/local/lib/libopenblas.a
-	gpadmin@hawq-node$ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib/
-	gpadmin@hawq-node$ export LIBRARY_PATH=$LD_LIBRARY_PATH
-	```
-
-3. Build and install NumPy. (Building the NumPy package might take some time.)
-
-	``` shell
-	gpadmin@hawq-node$ cd numpy-1.8.0
-	gpadmin@hawq-node$ python setup.py build
-	gpadmin@hawq-node$ sudo python setup.py install
-	```
-
-	**Note:** If the NumPy module did not successfully build, the NumPy build process might need a `site.cfg` file that specifies the location of the OpenBLAS libraries. Create the `site.cfg` file in the NumPy package directory:
-
-	``` shell
-	gpadmin@hawq-node$ touch site.cfg
-	```
-
-	Add the following to the `site.cfg` file and run the NumPy build command again:
-
-	``` pre
-	[default]
-	library_dirs = /usr/local/lib
-
-	[atlas]
-	atlas_libs = openblas
-	library_dirs = /usr/local/lib
-
-	[lapack]
-	lapack_libs = openblas
-	library_dirs = /usr/local/lib
-
-	# added for scikit-learn 
-	[openblas]
-	libraries = openblas
-	library_dirs = /usr/local/lib
-	include_dirs = /usr/local/include
-	```
-
-4. Verify that the NumPy module is available for import by Python:
-
-	``` shell
-	gpadmin@hawq-node$ cd $HOME
-	gpadmin@hawq-node$ python -c "import numpy"
-	```
-	
-	If no error is returned, the NumPy module was successfully imported.
-
-5. As you did for the `setuptools` Python module installation, use the `hawq ssh` utility to build, install, and test the NumPy module on all HAWQ nodes.
-
-6. The environment variables that were required to build the NumPy module are also required in the `gpadmin` runtime environment to run Python NumPy functions. For example, the following `echo` commands add the environment variables to the `.bashrc` file in `gpadmin`'s home directory:
-
-	``` shell
-	$ echo -e '\n#Needed for NumPy' >> ~/.bashrc
-	$ echo -e 'export BLAS=/usr/local/lib/libopenblas.a' >> ~/.bashrc
-	$ echo -e 'export LAPACK=/usr/local/lib/libopenblas.a' >> ~/.bashrc
-	$ echo -e 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib' >> ~/.bashrc
-	$ echo -e 'export LIBRARY_PATH=$LD_LIBRARY_PATH' >> ~/.bashrc
-	```
-
-    You can use the `hawq ssh` utility with these `echo` commands to add the environment variables to the `.bashrc` file on all nodes in your HAWQ cluster.
-
-### <a id="testingpythonmodules"></a>Testing Installed Python Modules 
-
-You can create a simple PL/Python user-defined function (UDF) to validate that a Python module is available in HAWQ. This example tests the NumPy module.
-
-1. Create a PL/Python UDF that imports the NumPy module:
-
-    ``` shell
-    gpadmin@hawq-node$ psql -d testdb
-    ```
-    ``` sql
-    =# CREATE OR REPLACE FUNCTION test_importnumpy(x int)
-       RETURNS text
-       AS $$
-         try:
-             from numpy import *
-             return 'SUCCESS'
-         except ImportError, e:
-             return 'FAILURE'
-       $$ LANGUAGE plpythonu;
-    ```
-
-    The function returns SUCCESS if the module is imported, and FAILURE if an import error occurs.
-
-2. Create a table that loads data on each HAWQ segment instance:
-
-    ``` sql
-    => CREATE TABLE disttbl AS (SELECT x FROM generate_series(1,50) x ) DISTRIBUTED BY (x);
-    ```
-    
-    Depending upon the size of your HAWQ installation, you may need to generate a larger series to ensure data is distributed to all segment instances.
-
-3. Run the UDF on the segment nodes where data is stored in the primary segment instances.
-
-    ``` sql
-    =# SELECT gp_segment_id, test_importnumpy(1) AS status
-         FROM disttbl
-         GROUP BY gp_segment_id, status
-         ORDER BY gp_segment_id, status;
-    ```
-
-    The `SELECT` command returns SUCCESS if the UDF imported the Python module on the HAWQ segment instance. FAILURE is returned if the Python module could not be imported.
-   
-
-#### <a id="testingpythonmodules"></a>Troubleshooting Python Module Import Failures
-
-Possible causes of a Python module import failure include:
-
-- A problem accessing required libraries. For the NumPy example, HAWQ might have a problem accessing the OpenBLAS libraries or the Python libraries on a segment host.
-
-	*Try*: Test importing the module on the segment host. This `hawq ssh` command tests importing the NumPy module on the segment host named mdw1.
-
-	``` shell
-	gpadmin@hawq-node$ hawq ssh -h mdw1 python -c "import numpy"
-	```
-
-- Environment variables may not be configured in the HAWQ environment. The Python import command may not return an error in this case.
-
-	*Try*: Ensure that the environment variables are properly set. For the NumPy example, ensure that the environment variables listed at the end of the section [Build and Install NumPy](#buildinstallnumpy) are defined in the `.bashrc` file for the `gpadmin` user on the master and all segment nodes.
-	
-	**Note:** The `.bashrc` file for the `gpadmin` user on the HAWQ master and all segment nodes must source the `greenplum_path.sh` file.
-
-	
-- HAWQ might not have been restarted after adding environment variable settings to the `.bashrc` file. Again, the Python import command may not return an error in this case.
-
-	*Try*: Ensure that you have restarted HAWQ.
-	
-	``` shell
-	gpadmin@master$ hawq restart cluster
-	```
-
-## <a id="dictionarygd"></a>Using the GD Dictionary to Improve PL/Python Performance 
-
-Importing a Python module is an expensive operation that can adversely affect performance. If you import the same module frequently, you can use Python global variables to import the module on the first invocation and forgo loading the module on subsequent invocations. 
-
-The following PL/Python function uses the GD persistent storage dictionary to avoid importing the module NumPy if it has already been stored in the GD. The UDF includes a call to `plpy.notice()` to display a message when importing the module.
-
-``` sql
-=# CREATE FUNCTION mypy_import2gd() RETURNS text AS $$ 
-     if 'numpy' not in GD:
-       plpy.notice('mypy_import2gd: importing module numpy')
-       import numpy
-       GD['numpy'] = numpy
-     return 'numpy'
-   $$ LANGUAGE plpythonu;
-```
-``` sql
-=# SELECT mypy_import2gd();
-NOTICE:  mypy_import2gd: importing module numpy
-CONTEXT:  PL/Python function "mypy_import2gd"
- mypy_import2gd 
-----------------
- numpy
-(1 row)
-```
-``` sql
-=# SELECT mypy_import2gd();
- mypy_import2gd 
-----------------
- numpy
-(1 row)
-```
-
-The second `SELECT` call does not include the `NOTICE` message, indicating that the module was obtained from the GD.
-
-## <a id="references"></a>References 
-
-This section lists references for using PL/Python.
-
-### <a id="technicalreferences"></a>Technical References 
-
-For information about PL/Python in HAWQ, see the [PL/Python - Python Procedural Language](http://www.postgresql.org/docs/8.2/static/plpython.html) PostgreSQL documentation.
-
-For information about Python Package Index (PyPI), refer to [PyPI - the Python Package Index](https://pypi.python.org/pypi).
-
-The following Python modules may be of interest:
-
-- [SciPy library](http://www.scipy.org/scipylib/index.html) provides user-friendly and efficient numerical routines including those for numerical integration and optimization. To download the SciPy package tar file:
-
-    ``` shell
-    hawq-node$ wget http://sourceforge.net/projects/scipy/files/scipy/0.10.1/scipy-0.10.1.tar.gz
-    ```
-
-- [Natural Language Toolkit](http://www.nltk.org/) (`nltk`) is a platform for building Python programs to work with human language data. 
-
-    The Python [`distribute`](https://pypi.python.org/pypi/distribute/0.6.21) package is required for `nltk`. Install the `distribute` package before installing `nltk`. To download the `distribute` package tar file:
-
-    ``` shell
-    hawq-node$ wget http://pypi.python.org/packages/source/d/distribute/distribute-0.6.21.tar.gz
-    ```
-
-    To download the `nltk` package tar file:
-
-    ``` shell
-    hawq-node$ wget http://pypi.python.org/packages/source/n/nltk/nltk-2.0.2.tar.gz#md5=6e714ff74c3398e88be084748df4e657
-    ```
-
-### <a id="usefulreading"></a>Useful Reading 
-
-For information about the Python language, see [http://www.python.org/](http://www.python.org/).
-
-For a set of slides from a talk about how the Pivotal Data Science team uses the PyData stack in the Pivotal MPP databases and on Pivotal Cloud Foundry, see [http://www.slideshare.net/SrivatsanRamanujam/all-thingspythonpivotal](http://www.slideshare.net/SrivatsanRamanujam/all-thingspythonpivotal).
-


