hawq-commits mailing list archives

From yo...@apache.org
Subject [34/50] incubator-hawq-docs git commit: restructure to use numbered steps
Date Mon, 31 Oct 2016 22:13:44 GMT
restructure to use numbered steps


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/284c3ec2
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/284c3ec2
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/284c3ec2

Branch: refs/heads/tutorial-proto
Commit: 284c3ec2db38e8d9020826e3bf292efad76c1819
Parents: 54b2c01
Author: Lisa Owen <lowen@pivotal.io>
Authored: Wed Oct 26 08:38:37 2016 -0700
Committer: Lisa Owen <lowen@pivotal.io>
Committed: Wed Oct 26 08:38:37 2016 -0700

----------------------------------------------------------------------
 pxf/HivePXF.html.md.erb | 428 ++++++++++++++++++++++---------------------
 1 file changed, 222 insertions(+), 206 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/284c3ec2/pxf/HivePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/pxf/HivePXF.html.md.erb b/pxf/HivePXF.html.md.erb
index d4b8f01..82fcc25 100644
--- a/pxf/HivePXF.html.md.erb
+++ b/pxf/HivePXF.html.md.erb
@@ -2,7 +2,7 @@
 title: Accessing Hive Data
 ---
 
-Apache Hive is a distributed data warehousing infrastructure.  Hive facilitates managing
large data sets, supporting multiple data formats, including comma-separated value (.csv),
RC, ORC, and parquet. The PXF Hive plug-in reads data stored in Hive, as well as HDFS or HBase.
+Apache Hive is a distributed data warehousing infrastructure. Hive facilitates managing
large data sets and supports multiple data formats, including comma-separated value (.csv),
RC, ORC, and Parquet. The PXF Hive plug-in reads data stored in Hive, as well as in HDFS or HBase.
 
 This section describes how to use PXF to access Hive data. Options for querying data stored
in Hive include:
 
@@ -13,7 +13,7 @@ This section describes how to use PXF to access Hive data. Options for querying
 
 Before accessing Hive data with HAWQ and PXF, ensure that:
 
--   The PXF HDFS plug-in is installed on all cluster nodes.
+-   The PXF HDFS plug-in is installed on all cluster nodes. See [Installing PXF Plug-ins](InstallPXFPlugins.html)
for PXF plug-in installation information.
 -   The PXF Hive plug-in is installed on all cluster nodes.
 -   The Hive JAR files and conf directory are installed on all cluster nodes.
 -   You have tested PXF on HDFS.
@@ -69,7 +69,7 @@ The following table summarizes external mapping rules for Hive primitive
types.
 
 Hive supports complex data types including array, struct, map, and union. PXF maps each of
these complex types to `text`.  While HAWQ does not natively support these types, you can
create HAWQ functions or application code to extract subcomponents of these complex data types.
 
-A complex data type example is provided later in this topic.
+An example using complex data types is provided later in this topic.
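Such extraction can be sketched with built-in string functions. The snippet below is a hypothetical illustration only: it assumes the `complextypes_hiveprofile` external table defined in the complex data type example later in this topic, whose `intarray` column holds the text rendering of a Hive integer array such as `[1,2,3]`:

``` sql
-- hypothetical sketch: strip the surrounding brackets, split on commas,
-- and cast the text rendering of a Hive int array to a native HAWQ array
SELECT string_to_array(btrim(intarray, '[]'), ',')::int[]
FROM complextypes_hiveprofile;
```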
 
 
 ## <a id="hive_sampledataset"></a>Sample Data Set
@@ -81,27 +81,30 @@ Examples used in this topic will operate on a common data set. This simple
data
 - number\_of\_orders - integer
 - total\_sales - double
 
-Prepare the sample data set for use. First, create a text file:
+Prepare the sample data set for use:
 
-```
-$ vi /tmp/pxf_hive_datafile.txt
-```
+1. First, create a text file:
+
+    ```
+    $ vi /tmp/pxf_hive_datafile.txt
+    ```
 
-Add the following data to `pxf_hive_datafile.txt`, employing a comma `,` to separate the
field values:
+2. Add the following data to `pxf_hive_datafile.txt`; notice the use of the comma `,` to
separate the four field values:
 
-```
-Prague,Jan,101,4875.33
-Rome,Mar,87,1557.39
-Bangalore,May,317,8936.99
-Beijing,Jul,411,11600.67
-San Francisco,Sept,156,6846.34
-Paris,Nov,159,7134.56
-San Francisco,Jan,113,5397.89
-Prague,Dec,333,9894.77
-Bangalore,Jul,271,8320.55
-Beijing,Dec,100,4248.41
-```
-Make note of the path to this file; you will use it in later exercises.
+    ```
+    Prague,Jan,101,4875.33
+    Rome,Mar,87,1557.39
+    Bangalore,May,317,8936.99
+    Beijing,Jul,411,11600.67
+    San Francisco,Sept,156,6846.34
+    Paris,Nov,159,7134.56
+    San Francisco,Jan,113,5397.89
+    Prague,Dec,333,9894.77
+    Bangalore,Jul,271,8320.55
+    Beijing,Dec,100,4248.41
+    ```
+
+Make note of the path to `pxf_hive_datafile.txt`; you will use it in later exercises.
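The two steps above can equivalently be scripted non-interactively; this sketch uses a heredoc in place of `vi`:

``` shell
# create the sample data file without an interactive editor
cat > /tmp/pxf_hive_datafile.txt <<'EOF'
Prague,Jan,101,4875.33
Rome,Mar,87,1557.39
Bangalore,May,317,8936.99
Beijing,Jul,411,11600.67
San Francisco,Sept,156,6846.34
Paris,Nov,159,7134.56
San Francisco,Jan,113,5397.89
Prague,Dec,333,9894.77
Bangalore,Jul,271,8320.55
Beijing,Dec,100,4248.41
EOF
```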
 
 
 ## <a id="hivecommandline"></a>Hive Command Line
@@ -112,32 +115,38 @@ The Hive command line is a subsystem similar to that of `psql`. To start
the Hiv
 $ HADOOP_USER_NAME=hdfs hive
 ```
 
-The default Hive database is named `default`. Create a Hive table named `sales_info` to expose
our sample data set. Create this table in the `default` database:
+The default Hive database is named `default`. 
 
-``` sql
-hive> CREATE TABLE sales_info (location string, month string,
-        number_of_orders int, total_sales double)
-        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
-        STORED AS textfile;
-```
+### <a id="hivecommandline_createdb"></a>Example: Create a Hive Table
 
-Notice:
+Create a Hive table to expose our sample data set.
 
-- The `STORED AS textfile` subclause instructs Hive to create the table in Textfile (the
default) format.  Hive Textfile format supports comma-, tab-, and space-separated values,
as well as data specified in JSON notation.
-- The `DELIMITED FIELDS TERMINATED BY` subclause identifies the field delimiter within a
data record (line). The `sales_info` table field delimiter is a comma (`,`).
+1. Create a Hive table named `sales_info` in the `default` database:
 
-Load the `pxf_hive_datafile.txt` sample data file into the `sales_info` table you just created:
+    ``` sql
+    hive> CREATE TABLE sales_info (location string, month string,
+            number_of_orders int, total_sales double)
+            ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
+            STORED AS textfile;
+    ```
 
-``` sql
-hive> LOAD DATA local INPATH '/tmp/pxf_hive_datafile.txt'
-        INTO TABLE sales_info;
-```
+    Notice:
 
-Perform a query on `sales_info` to verify the data was loaded successfully:
+    - The `STORED AS textfile` subclause instructs Hive to create the table in Textfile (the
default) format.  Hive Textfile format supports comma-, tab-, and space-separated values,
as well as data specified in JSON notation.
+    - The `DELIMITED FIELDS TERMINATED BY` subclause identifies the field delimiter within
a data record (line). The `sales_info` table field delimiter is a comma (`,`).
 
-``` sql
-hive> SELECT * FROM sales_info;
-```
+2. Load the `pxf_hive_datafile.txt` sample data file into the `sales_info` table you just
created:
+
+    ``` sql
+    hive> LOAD DATA local INPATH '/tmp/pxf_hive_datafile.txt'
+            INTO TABLE sales_info;
+    ```
+
+3. Perform a query on `sales_info` to verify the data was loaded successfully:
+
+    ``` sql
+    hive> SELECT * FROM sales_info;
+    ```
 
 In examples later in this section, you will access the `sales_info` Hive table directly via
PXF. You will also insert `sales_info` data into tables of other Hive file format types, and
use PXF to access those directly as well.
 
@@ -151,149 +160,156 @@ Use the following syntax to create a HAWQ external table representing
Hive data:
 CREATE EXTERNAL TABLE <table_name>
     ( <column_name> <data_type> [, ...] | LIKE <other_table> )
 LOCATION ('pxf://<host>[:<port>]/<hive-db-name>.<hive-table-name>
-    ?PROFILE=Hive|HiveText|HiveRC[&DELIMITER=delim'])
-FORMAT 'CUSTOM|TEXT' (formatter='pxfwritable_import' | delimiter='delim')
+    ?PROFILE=Hive|HiveText|HiveRC[&DELIMITER=<delim>])
+FORMAT 'CUSTOM|TEXT' (formatter='pxfwritable_import' | delimiter='<delim>')
 ```
 
 Hive-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](../reference/sql/CREATE-EXTERNAL-TABLE.html)
call are described below.
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>    | The HDFS NameNode. |
+| \<host\>[:\<port\>]    | The HDFS NameNode host and optional port. |
 | \<hive-db-name\>    | Name of the Hive database. If omitted, defaults to the Hive
database named `default`. |
 | \<hive-table-name\>    | Name of the Hive table. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `Hive`, `HiveText`, or
`HiveRC`. |
-| DELIMITER    | The `DELIMITER` clause is required for both the `HiveText` and `HiveRC`
profiles and identifies the field delimiter used in the Hive data set.  `delim` must be a
single ascii character or specified in hexadecimal representation. |
+| DELIMITER    | The `DELIMITER` clause is required for both the `HiveText` and `HiveRC`
profiles and identifies the field delimiter used in the Hive data set. \<delim\> must
be a single ASCII character or specified in hexadecimal representation. |
 | FORMAT (`Hive` profile)   | The `FORMAT` clause must specify `CUSTOM`. The `CUSTOM` format
supports only the built-in `pxfwritable_import` `formatter`.   |
-| FORMAT (`HiveText` and `HiveRC` profiles) | The `FORMAT` clause must specify `TEXT`. The
`delimiter` must be specified a second time in `delim`. |
+| FORMAT (`HiveText` and `HiveRC` profiles) | The `FORMAT` clause must specify `TEXT`. The
`delimiter` must be specified a second time in '\<delim\>'. |
 
 
-### <a id="profile_hive"></a>Hive Profile
+## <a id="profile_hive"></a>Hive Profile
 
-The `Hive` profile works with any Hive file format table.
+The `Hive` profile works with any Hive file format.
 
-The following SQL call uses the PXF `Hive` profile to create a queryable HAWQ external table
from the Hive `sales_info` textfile format table created earlier:
+### <a id="profile_hive_using"></a>Example: Using the Hive Profile
 
-``` shell
-$ psql -d postgres
-```
+Use the `Hive` profile to create a queryable HAWQ external table from the Hive `sales_info`
textfile format table created earlier.
 
-``` sql
-postgres=# CREATE EXTERNAL TABLE salesinfo_hiveprofile(location text, month text, num_orders
int, total_sales float8)
-             LOCATION ('pxf://namenode:51200/default.sales_info?PROFILE=Hive')
-             FORMAT 'custom' (formatter='pxfwritable_import');
-```
+1. Create a queryable HAWQ external table from the Hive `sales_info` textfile format table
created earlier:
 
-``` sql
-postgres=# SELECT * FROM salesinfo_hiveprofile;
-```
+    ``` sql
+    postgres=# CREATE EXTERNAL TABLE salesinfo_hiveprofile(location text, month text, num_orders
int, total_sales float8)
+                LOCATION ('pxf://namenode:51200/default.sales_info?PROFILE=Hive')
+              FORMAT 'custom' (formatter='pxfwritable_import');
+    ```
 
-``` shell
-   location    | month | num_orders | total_sales
----------------+-------+------------+-------------
- Prague        | Jan   |        101 |     4875.33
- Rome          | Mar   |         87 |     1557.39
- Bangalore     | May   |        317 |     8936.99
- ...
+2. Query the table:
 
-```
+    ``` sql
+    postgres=# SELECT * FROM salesinfo_hiveprofile;
+    ```
+
+    ``` shell
+       location    | month | num_orders | total_sales
+    ---------------+-------+------------+-------------
+     Prague        | Jan   |        101 |     4875.33
+     Rome          | Mar   |         87 |     1557.39
+     Bangalore     | May   |        317 |     8936.99
+     ...
 
-### <a id="profile_hivetext"></a>HiveText Profile
+    ```
+
+## <a id="profile_hivetext"></a>HiveText Profile
 
 Use the `HiveText` profile to query text formats. The `HiveText` profile is more performant
than the `Hive` profile.
 
 **Note**: When using the `HiveText` profile, you *must* specify a delimiter option in *both*
the `LOCATION` and `FORMAT` clauses.
 
-Use the PXF `HiveText` profile to create a queryable HAWQ external table from the Hive `sales_info`
textfile format table created earlier:
+### <a id="profile_hivetext_using"></a>Example: Using the HiveText Profile
 
-``` sql
-postgres=# CREATE EXTERNAL TABLE salesinfo_hivetextprofile(location text, month text, num_orders
int, total_sales float8)
-             LOCATION ('pxf://namenode:51200/default.sales_info?PROFILE=HiveText&DELIMITER=\x2c')
-             FORMAT 'TEXT' (delimiter=E',');
-```
+Use the PXF `HiveText` profile to create a queryable HAWQ external table from the Hive `sales_info`
textfile format table created earlier.
 
-(You can safely ignore the "nonstandard use of escape in a string literal" warning and related
messages.)
+1. Create the external table:
 
-Notice:
+    ``` sql
+    postgres=# CREATE EXTERNAL TABLE salesinfo_hivetextprofile(location text, month text,
num_orders int, total_sales float8)
+                 LOCATION ('pxf://namenode:51200/default.sales_info?PROFILE=HiveText&DELIMITER=\x2c')
+               FORMAT 'TEXT' (delimiter=E',');
+    ```
 
-- The `LOCATION` subclause `DELIMITER` value is specified in hexadecimal format. `\x` is
a prefix that instructs PXF to interpret the following characters as hexadecimal. `2c` is
the hex value for the comma character.
-- The `FORMAT` subclause `delimiter` value is specified as the single ascii comma character
','. `E` ???
+    (You can safely ignore the "nonstandard use of escape in a string literal" warning and
related messages.)
 
-Query the external table:
+    Notice:
 
-``` sql
-postgres=# SELECT * FROM salesinfo_hivetextprofile where location="Beijing";
-```
+    - The `LOCATION` subclause `DELIMITER` value is specified in hexadecimal format. `\x`
is a prefix that instructs PXF to interpret the following characters as hexadecimal. `2c`
is the hex value for the comma character.
+    - The `FORMAT` subclause `delimiter` value is specified as the single ASCII comma character
','. The `E` prefix declares an escape string literal, in which backslash escape sequences are recognized.
 
-``` shell
- location | month | num_orders | total_sales
-----------+-------+------------+-------------
- Beijing  | Jul   |        411 |    11600.67
- Beijing  | Dec   |        100 |     4248.41
-(2 rows)
-```
+2. Query the external table:
 
-### <a id="profile_hiverc"></a>HiveRC Profile
+    ``` sql
+    postgres=# SELECT * FROM salesinfo_hivetextprofile WHERE location='Beijing';
+    ```
 
-#### <a id="profile_hiverc_rcfiletbl"></a>RCFile Format Hive Table
+    ``` shell
+     location | month | num_orders | total_sales
+    ----------+-------+------------+-------------
+     Beijing  | Jul   |        411 |    11600.67
+     Beijing  | Dec   |        100 |     4248.41
+    (2 rows)
+    ```
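The hexadecimal `DELIMITER` value used in step 1 can be sanity-checked from the shell; this quick sketch confirms that hex `2c` is the ASCII comma:

``` shell
# print the character whose hexadecimal code is 2c (the ASCII comma)
printf '\x2c\n'
```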
 
-The RCFile Hive format is used for row columnar formatted data. Create a Hive table with
RCFile format:
+## <a id="profile_hiverc"></a>HiveRC Profile
 
-``` shell
-$ HADOOP_USER_NAME=hdfs hive
-```
+The RCFile Hive format is used for row columnar formatted data. The `HiveRC` profile provides
access to RCFile data.
 
-``` sql
-hive> CREATE TABLE sales_info_rcfile (location string, month string,
-        number_of_orders int, total_sales double)
-        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
-        STORED AS rcfile;
-```
+### <a id="profile_hiverc_rcfiletbl_using"></a>Example: Using the HiveRC Profile
 
-Insert the data from the `sales_info` table into `sales_info_rcfile`:
+Use the `HiveRC` profile to query RCFile-formatted data in Hive tables. The `HiveRC` profile
is more performant than the `Hive` profile for this file format type.
 
-``` sql
-hive> INSERT INTO TABLE sales_info_rcfile SELECT * FROM sales_info;
-```
+1. Create a Hive table with RCFile format:
+
+    ``` shell
+    $ HADOOP_USER_NAME=hdfs hive
+    ```
 
-A copy of the sample data set is now stored in RCFile format in `sales_info_rcfile`. Perform
a Hive query on `sales_info_rcfile` to verify the data was loaded successfully:
+    ``` sql
+    hive> CREATE TABLE sales_info_rcfile (location string, month string,
+            number_of_orders int, total_sales double)
+          ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
+          STORED AS rcfile;
+    ```
 
-``` sql
-hive> SELECT * FROM sales_info_rcfile;
-```
-#### <a id="profile_hiverc_pxfquery"></a>PXF Query
+2. Insert the data from the `sales_info` table into `sales_info_rcfile`:
 
-Use the `HiveRC` profile to query RCFile-formatted data in Hive tables. The `HiveRC` profile
is more performant than the `Hive` profile.
+    ``` sql
+    hive> INSERT INTO TABLE sales_info_rcfile SELECT * FROM sales_info;
+    ```
 
-**Note**: When using the `HiveRC` profile, you *must* specify a delimiter option in *both*
the `LOCATION` and `FORMAT` clauses.
+    A copy of the sample data set is now stored in RCFile format in `sales_info_rcfile`.

+    
+3. Perform a Hive query on `sales_info_rcfile` to verify the data was loaded successfully:
 
-Use the PXF `HiveRC` profile to create a queryable HAWQ external table from the Hive `sales_info_rcfile`
table created in the previous section:
+    ``` sql
+    hive> SELECT * FROM sales_info_rcfile;
+    ```
 
-``` sql
-postgres=# CREATE EXTERNAL TABLE salesinfo_hivercprofile(location text, month text, num_orders
int, total_sales float8)
-             LOCATION ('pxf://namenode:51200/default.sales_info_rcfile?PROFILE=HiveRC&DELIMITER=\x2c')
-             FORMAT 'TEXT' (delimiter=E',');
-```
+4. Use the PXF `HiveRC` profile to create a queryable HAWQ external table from the Hive `sales_info_rcfile`
table created in the previous step. When using the `HiveRC` profile, you *must* specify
a delimiter option in *both* the `LOCATION` and `FORMAT` clauses:
 
-(Again, you can safely ignore the "nonstandard use of escape in a string literal" warning
and related messages.)
+    ``` sql
+    postgres=# CREATE EXTERNAL TABLE salesinfo_hivercprofile(location text, month text, num_orders
int, total_sales float8)
+                 LOCATION ('pxf://namenode:51200/default.sales_info_rcfile?PROFILE=HiveRC&DELIMITER=\x2c')
+               FORMAT 'TEXT' (delimiter=E',');
+    ```
 
-Query the external table:
+    (Again, you can safely ignore the "nonstandard use of escape in a string literal" warning
and related messages.)
 
-``` sql
-postgres=# SELECT location, total_sales FROM salesinfo_hivercprofile;
-```
+5. Query the external table:
 
-``` shell
-   location    | total_sales
----------------+-------------
- Prague        |     4875.33
- Rome          |     1557.39
- Bangalore     |     8936.99
- Beijing       |    11600.67
- ...
-```
+    ``` sql
+    postgres=# SELECT location, total_sales FROM salesinfo_hivercprofile;
+    ```
 
-### <a id="topic_dbb_nz3_ts"></a>Accessing Parquet-Format Hive Tables
+    ``` shell
+       location    | total_sales
+    ---------------+-------------
+     Prague        |     4875.33
+     Rome          |     1557.39
+     Bangalore     |     8936.99
+     Beijing       |    11600.67
+     ...
+    ```
+
+## <a id="topic_dbb_nz3_ts"></a>Accessing Parquet-Format Hive Tables
 
 The PXF `Hive` profile supports both non-partitioned and partitioned Hive tables that use
the Parquet storage format in HDFS. Simply map the table columns using equivalent HAWQ data
types. For example, if a Hive table is created using:
 
@@ -310,17 +326,17 @@ postgres=# CREATE EXTERNAL TABLE pxf_parquet_table (fname text, lname
text, cust
     FORMAT 'CUSTOM' (formatter='pxfwritable_import');
 ```
 
-### <a id="profileperf"></a>Profile Performance Considerations
+## <a id="profileperf"></a>Profile Performance Considerations
 
 The `HiveRC` and `HiveText` profiles are faster than the generic `Hive` profile.
 
 ?? MORE HERE. ??
 
-### <a id="complex_dt_example"></a>Complex Data Types Example
+## <a id="complex_dt_example"></a>Complex Data Type Example
 
-Prepare a sample data set for use. This data set will employ the array and map complex types,
specifically an array of integers and a string key/value pair map.
+This example will employ the array and map complex types, specifically an array of integers
and a string key/value pair map.
 
-The data set includes fields with the following names and data types:
+The example data set includes fields with the following names and data types:
 
 - index - int
 - name - string
@@ -329,99 +345,99 @@ The data set includes fields with the following names and data types:
 
 When specifying an array field in a Hive table, you must identify the terminator for each
item in the collection. Similarly, the map key termination character must also be specified.
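As a quick illustration of these terminators, one sample record from the data set below can be split from the shell (a sketch only; Hive itself performs this parsing when the table declares the same delimiters):

``` shell
# split one sample record using the declared terminators:
# ',' separates fields, '%' separates collection items, ':' terminates map keys
line='3,Prague,1%2%3,zone:euro%status:up'
IFS=',' read -r index name arr map <<< "$line"
echo "$arr" | tr '%' ' '            # the array items
echo "$map" | tr '%' ' ' | tr ':' '='   # the map key/value pairs
```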
 
-Create a text file from which you will load the data set:
+1. Create a text file from which you will load the data set:
 
-```
-$ vi /tmp/pxf_hive_complex.txt
-```
+    ```
+    $ vi /tmp/pxf_hive_complex.txt
+    ```
 
-Add the following data to `pxf_hive_complex.txt`.  The data uses a comma `,` to separate
field values, the percent symbol `%` to separate collection items, and a `:` to terminate
map key values:
+2. Add the following data to `pxf_hive_complex.txt`.  The data uses a comma `,` to separate
field values, the percent symbol `%` to separate collection items, and a `:` to terminate
map key values:
 
-```
-3,Prague,1%2%3,zone:euro%status:up
-89,Rome,4%5%6,zone:euro
-400,Bangalore,7%8%9,zone:apac%status:pending
-183,Beijing,0%1%2,zone:apac
-94,Sacramento,3%4%5,zone:noam%status:down
-101,Paris,6%7%8,zone:euro%status:up
-56,Frankfurt,9%0%1,zone:euro
-202,Jakarta,2%3%4,zone:apac%status:up
-313,Sydney,5%6%7,zone:apac%status:pending
-76,Atlanta,8%9%0,zone:noam%status:down
-```
+    ```
+    3,Prague,1%2%3,zone:euro%status:up
+    89,Rome,4%5%6,zone:euro
+    400,Bangalore,7%8%9,zone:apac%status:pending
+    183,Beijing,0%1%2,zone:apac
+    94,Sacramento,3%4%5,zone:noam%status:down
+    101,Paris,6%7%8,zone:euro%status:up
+    56,Frankfurt,9%0%1,zone:euro
+    202,Jakarta,2%3%4,zone:apac%status:up
+    313,Sydney,5%6%7,zone:apac%status:pending
+    76,Atlanta,8%9%0,zone:noam%status:down
+    ```
 
-Create a Hive table to represent this data:
+3. Create a Hive table to represent this data:
 
-``` shell
-$ HADOOP_USER_NAME=hdfs hive
-```
+    ``` shell
+    $ HADOOP_USER_NAME=hdfs hive
+    ```
 
-``` sql
-postgres=# CREATE TABLE table_complextypes( index int, name string, intarray ARRAY<int>,
propmap MAP<string, string>)
+    ``` sql
+    hive> CREATE TABLE table_complextypes( index int, name string, intarray ARRAY<int>,
propmap MAP<string, string>)
              ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
              COLLECTION ITEMS TERMINATED BY '%'
              MAP KEYS TERMINATED BY ':'
              STORED AS TEXTFILE;
-```
+    ```
 
-Notice:
+    Notice:
 
-- `FIELDS TERMINATED BY` identifies a comma as the field terminator.
-- The `COLLECTION ITEMS TERMINATED BY` subclause specifies the percent sign as the collection
items (array item, map key/value pair) terminator.
-- `MAP KEYS TERMINATED BY` identifies a colon as the terminator for map keys.
+    - `FIELDS TERMINATED BY` identifies a comma as the field terminator.
+    - The `COLLECTION ITEMS TERMINATED BY` subclause specifies the percent sign as the collection
items (array item, map key/value pair) terminator.
+    - `MAP KEYS TERMINATED BY` identifies a colon as the terminator for map keys.
 
-Load the `pxf_hive_complex.txt` sample data file into the `table_complextypes` table you
just created:
+4. Load the `pxf_hive_complex.txt` sample data file into the `table_complextypes` table you
just created:
 
-``` sql
-hive> LOAD DATA local INPATH '/tmp/pxf_hive_complex.txt' INTO TABLE table_complextypes;
-```
+    ``` sql
+    hive> LOAD DATA local INPATH '/tmp/pxf_hive_complex.txt' INTO TABLE table_complextypes;
+    ```
 
-Perform a query on `table_complextypes` to verify the data was loaded successfully:
+5. Perform a query on Hive table `table_complextypes` to verify the data was loaded successfully:
 
-``` sql
-hive> SELECT * FROM table_complextypes;
-```
+    ``` sql
+    hive> SELECT * FROM table_complextypes;
+    ```
 
-``` shell
-3	Prague	[1,2,3]	{"zone":"euro","status":"up"}
-89	Rome	[4,5,6]	{"zone":"euro"}
-400	Bangalore	[7,8,9]	{"zone":"apac","status":"pending"}
-...
-```
+    ``` shell
+    3	Prague	[1,2,3]	{"zone":"euro","status":"up"}
+    89	Rome	[4,5,6]	{"zone":"euro"}
+    400	Bangalore	[7,8,9]	{"zone":"apac","status":"pending"}
+    ...
+    ```
 
-Use the PXF `Hive` profile to create a queryable HAWQ external table representing the Hive
`table_complextypes`:
+6. Use the PXF `Hive` profile to create a queryable HAWQ external table representing the
Hive `table_complextypes`:
 
-``` sql
-postgres=# CREATE EXTERNAL TABLE complextypes_hiveprofile(index int, name text, intarray
text, propmap text)
-             LOCATION ('pxf://namenode:51200/table_complextypes?PROFILE=Hive')
-             FORMAT 'CUSTOM' (formatter='pxfwritable_import');
-```
+    ``` sql
+    postgres=# CREATE EXTERNAL TABLE complextypes_hiveprofile(index int, name text, intarray
text, propmap text)
+                 LOCATION ('pxf://namenode:51200/table_complextypes?PROFILE=Hive')
+               FORMAT 'CUSTOM' (formatter='pxfwritable_import');
+    ```
 
-Notice that the integer array and map complex types are mapped to type text.
+    Notice that the integer array and map complex types are mapped to type text.
 
-Query the external table:
+7. Query the external table:
 
-``` sql
-postgres=# SELECT * FROM complextypes_hiveprofile;
-```
+    ``` sql
+    postgres=# SELECT * FROM complextypes_hiveprofile;
+    ```
 
-``` shell     
- index |    name    | intarray |              propmap
--------+------------+----------+------------------------------------
-     3 | Prague     | [1,2,3]  | {"zone":"euro","status":"up"}
-    89 | Rome       | [4,5,6]  | {"zone":"euro"}
-   400 | Bangalore  | [7,8,9]  | {"zone":"apac","status":"pending"}
-   183 | Beijing    | [0,1,2]  | {"zone":"apac"}
-    94 | Sacramento | [3,4,5]  | {"zone":"noam","status":"down"}
-   101 | Paris      | [6,7,8]  | {"zone":"euro","status":"up"}
-    56 | Frankfurt  | [9,0,1]  | {"zone":"euro"}
-   202 | Jakarta    | [2,3,4]  | {"zone":"apac","status":"up"}
-   313 | Sydney     | [5,6,7]  | {"zone":"apac","status":"pending"}
-    76 | Atlanta    | [8,9,0]  | {"zone":"noam","status":"down"}
-(10 rows)
-```
+    ``` shell     
+     index |    name    | intarray |              propmap
+    -------+------------+----------+------------------------------------
+         3 | Prague     | [1,2,3]  | {"zone":"euro","status":"up"}
+        89 | Rome       | [4,5,6]  | {"zone":"euro"}
+       400 | Bangalore  | [7,8,9]  | {"zone":"apac","status":"pending"}
+       183 | Beijing    | [0,1,2]  | {"zone":"apac"}
+        94 | Sacramento | [3,4,5]  | {"zone":"noam","status":"down"}
+       101 | Paris      | [6,7,8]  | {"zone":"euro","status":"up"}
+        56 | Frankfurt  | [9,0,1]  | {"zone":"euro"}
+       202 | Jakarta    | [2,3,4]  | {"zone":"apac","status":"up"}
+       313 | Sydney     | [5,6,7]  | {"zone":"apac","status":"pending"}
+        76 | Atlanta    | [8,9,0]  | {"zone":"noam","status":"down"}
+    (10 rows)
+    ```
 
-`intarray` and `propmap` are each text strings.
+    `intarray` and `propmap` are each text strings.
 
 ## <a id="hcatalog"></a>Using PXF and HCatalog to Query Hive
 

