hawq-commits mailing list archives

From yo...@apache.org
Subject incubator-hawq-docs git commit: HAWQ-1376 - clarify pxf host and port description (closes #99)
Date Fri, 10 Mar 2017 02:15:54 GMT
Repository: incubator-hawq-docs
Updated Branches:
  refs/heads/develop dcb5cadfc -> 5714ce5b3


HAWQ-1376 - clarify pxf host and port description (closes #99)


Project: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/commit/5714ce5b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/tree/5714ce5b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/diff/5714ce5b

Branch: refs/heads/develop
Commit: 5714ce5b3efb61387e6479907ada58f5aa8f34aa
Parents: dcb5cad
Author: Lisa Owen <lowen@pivotal.io>
Authored: Thu Mar 9 18:15:45 2017 -0800
Committer: David Yozie <yozie@apache.org>
Committed: Thu Mar 9 18:15:45 2017 -0800

----------------------------------------------------------------------
 .../HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb    | 4 ++++
 markdown/pxf/HBasePXF.html.md.erb                               | 2 +-
 markdown/pxf/HDFSFileDataPXF.html.md.erb                        | 3 ++-
 markdown/pxf/HDFSWritablePXF.html.md.erb                        | 3 ++-
 markdown/pxf/HivePXF.html.md.erb                                | 3 ++-
 markdown/pxf/JsonPXF.html.md.erb                                | 5 +++--
 markdown/pxf/PXFExternalTableandAPIReference.html.md.erb        | 4 ++--
 markdown/pxf/TroubleshootingPXF.html.md.erb                     | 2 +-
 markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb        | 4 ++--
 9 files changed, 19 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
index 6923494..20892f6 100644
--- a/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
+++ b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
@@ -240,3 +240,7 @@ For command-line administrators:
 	$ hawq init standby -n -M fast
 
 	```
+
+## <a id="pxfnhdfsnamenode"></a>Using PXF with HDFS NameNode HA
+
+If HDFS NameNode High Availability is enabled, use the HDFS Nameservice ID in the `LOCATION` clause \<host\> field when invoking any PXF `CREATE EXTERNAL TABLE` command. If the \<port\> is omitted from the `LOCATION` URI, PXF connects to the port number designated by the `pxf_service_port` server configuration parameter value (default is 51200).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HBasePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HBasePXF.html.md.erb b/markdown/pxf/HBasePXF.html.md.erb
index 3be06d2..ddb86d5 100644
--- a/markdown/pxf/HBasePXF.html.md.erb
+++ b/markdown/pxf/HBasePXF.html.md.erb
@@ -43,7 +43,7 @@ To create an external HBase table, use the following syntax:
 ``` sql
 CREATE [READABLE|WRITABLE] EXTERNAL TABLE table_name 
     ( column_name data_type [, ...] | LIKE other_table )
-LOCATION ('pxf://namenode[:port]/hbase-table-name?Profile=HBase')
+LOCATION ('pxf://host[:port]/hbase-table-name?Profile=HBase')
 FORMAT 'CUSTOM' (Formatter='pxfwritable_import');
 ```
 

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HDFSFileDataPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HDFSFileDataPXF.html.md.erb b/markdown/pxf/HDFSFileDataPXF.html.md.erb
index 6780650..47b964f 100644
--- a/markdown/pxf/HDFSFileDataPXF.html.md.erb
+++ b/markdown/pxf/HDFSFileDataPXF.html.md.erb
@@ -100,7 +100,8 @@ HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:\<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path-to-hdfs-file\>    | The path to the file in the HDFS data store. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `HdfsTextSimple`, `HdfsTextMulti`, or `Avro`. |
 | \<custom-option\>  | \<custom-option\> is profile-specific. Profile-specific options are discussed in the relevant profile topic later in this section.|

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HDFSWritablePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HDFSWritablePXF.html.md.erb b/markdown/pxf/HDFSWritablePXF.html.md.erb
index 021b6b9..0c498a2 100644
--- a/markdown/pxf/HDFSWritablePXF.html.md.erb
+++ b/markdown/pxf/HDFSWritablePXF.html.md.erb
@@ -54,7 +54,8 @@ HDFS-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:\<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path-to-hdfs-file\>    | The path to the file in the HDFS data store. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `HdfsTextSimple` or `SequenceWritable`. |
 | \<custom-option\>  | \<custom-option\> is profile-specific. These options are discussed in the next topic.|

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/HivePXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/HivePXF.html.md.erb b/markdown/pxf/HivePXF.html.md.erb
index 6101016..bc4e9f6 100644
--- a/markdown/pxf/HivePXF.html.md.erb
+++ b/markdown/pxf/HivePXF.html.md.erb
@@ -332,7 +332,8 @@ Hive-plug-in-specific keywords and values used in the [CREATE EXTERNAL TABLE](..
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>[:<port\>]    | The HDFS NameNode and port. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<hive-db-name\>    | The name of the Hive database. If omitted, defaults to the Hive database named `default`. |
 | \<hive-table-name\>    | The name of the Hive table. |
 | PROFILE    | The `PROFILE` keyword must specify one of the values `Hive`, `HiveText`, or `HiveRC`. |

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/JsonPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/JsonPXF.html.md.erb b/markdown/pxf/JsonPXF.html.md.erb
index 5f156c4..6aeea7e 100644
--- a/markdown/pxf/JsonPXF.html.md.erb
+++ b/markdown/pxf/JsonPXF.html.md.erb
@@ -169,7 +169,8 @@ JSON-plug-in-specific keywords and values used in the `CREATE EXTERNAL TABLE` ca
 
 | Keyword  | Value |
 |-------|-------------------------------------|
-| \<host\>    | Specify the HDFS NameNode in the \<host\> field. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | PROFILE    | The `PROFILE` keyword must specify the value `Json`. |
 | IDENTIFIER  | Include the `IDENTIFIER` keyword and \<value\> in the `LOCATION` string only when accessing a JSON file with multi-line records. \<value\> should identify the member name used to determine the encapsulating JSON object to return.  (If the JSON file is the multi-line record Example 2 above, `&IDENTIFIER=created_at` would be specified.) |
 | FORMAT    | The `FORMAT` clause must specify `CUSTOM`. |
@@ -213,4 +214,4 @@ To query this external table populated with JSON data:
 
 ``` sql
 SELECT * FROM sample_json_multiline_tbl;
-```
\ No newline at end of file
+```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
index 8a29d1d..3681079 100644
--- a/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
+++ b/markdown/pxf/PXFExternalTableandAPIReference.html.md.erb
@@ -53,8 +53,8 @@ FORMAT 'custom' (formatter='pxfwritable_import|pxfwritable_export');
 
 | Parameter               | Value and description |
 |-------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| host                    | The HDFS NameNode. |
-| port                    | Connection port for the PXF service. If the port is omitted, PXF assumes that High Availability (HA) is enabled and connects to the HA name service port, 51200, by default. The HA name service port can be changed by setting the `pxf_service_port` configuration parameter. |
+| \<host\>    | The PXF host. While \<host\> may identify any PXF agent node, use the HDFS NameNode as it is guaranteed to be available in a running HDFS cluster. If HDFS High Availability is enabled, \<host\> must identify the HDFS NameService. |
+| \<port\>    | The PXF port. If \<port\> is omitted, PXF assumes \<host\> identifies a High Availability HDFS Nameservice and connects to the port number designated by the `pxf_service_port` server configuration parameter value. Default is 51200. |
 | \<path\-to\-data\>        | A directory, file name, wildcard pattern, table name, etc. |
 | PROFILE              | The profile PXF uses to access the data. PXF supports multiple plug-ins that currently expose profiles named `HBase`, `Hive`, `HiveRC`, `HiveText`, `HiveORC`,  `HdfsTextSimple`, `HdfsTextMulti`, `Avro`, `SequenceWritable`, and `Json`. |
 | FRAGMENTER              | The Java class the plug-in uses for fragmenting data. Used for READABLE external tables only. |

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/pxf/TroubleshootingPXF.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/pxf/TroubleshootingPXF.html.md.erb b/markdown/pxf/TroubleshootingPXF.html.md.erb
index 57fe9d5..cf1ef13 100644
--- a/markdown/pxf/TroubleshootingPXF.html.md.erb
+++ b/markdown/pxf/TroubleshootingPXF.html.md.erb
@@ -81,7 +81,7 @@ The following table lists some common errors encountered while using PXF:
 </tr>
 <tr class="odd">
 <td>ERROR: fail to get filesystem credential for uri hdfs://&lt;namenode&gt;:8020/</td>
-<td>Secure PXF: Wrong HDFS host or port is not 8020 (this is a limitation that will be removed in the next release)</td>
+<td>Secure PXF: Wrong HDFS host or port is not 8020</td>
 </tr>
 <tr class="even">
 <td>ERROR: remote component error (413) from '&lt;x&gt;': HTTP status code is 413 but HTTP response string is empty</td>

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/5714ce5b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
index c46870c..c458cae 100644
--- a/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
+++ b/markdown/reference/sql/CREATE-EXTERNAL-TABLE.html.md.erb
@@ -165,7 +165,7 @@ The `FORMAT` clause is used to describe how external table files are formatted.
 <dd>The data type of the column.</dd>
 
 <dt>LOCATION ('\<protocol\>://\<host\>\[:\<port\>\]/\<path\>/\<file\>' \[, ...\])   </dt>
-<dd>For readable external tables, specifies the URI of the external data source(s) to be used to populate the external table or web table. Regular readable external tables allow the `file`, `gpfdist`, and `pxf` protocols. Web external tables allow the `http` protocol. If \<port\> is omitted, the `http` and `gpfdist` protocols assume port `8080` and the `pxf` protocol assumes the \<host\> is a high availability nameservice string. If using the `gpfdist` protocol, the \<path\> is relative to the directory from which `gpfdist` is serving files (the directory specified when you started the `gpfdist` program). Also, the \<path\> can use wildcards (or other C-style pattern matching) in the \<file\> name part of the location to denote multiple files in a directory. For example:
+<dd>For readable external tables, specifies the URI of the external data source(s) to be used to populate the external table or web table. Regular readable external tables allow the `file`, `gpfdist`, and `pxf` protocols. Web external tables allow the `http` protocol. If \<port\> is omitted, the `http` and `gpfdist` protocols assume port `8080` and the `pxf` protocol assumes the \<host\> specifies a high availability Nameservice ID. If using the `gpfdist` protocol, the \<path\> is relative to the directory from which `gpfdist` is serving files (the directory specified when you started the `gpfdist` program). Also, the \<path\> can use wildcards (or other C-style pattern matching) in the \<file\> name part of the location to denote multiple files in a directory. For example:
 
 ``` pre
 'gpfdist://filehost:8081/*'
@@ -183,7 +183,7 @@ For writable external tables, specifies the URI location of the `gpfdist` proces
 
 With two `gpfdist` locations listed as in the above example, half of the segments would send their output data to the `data1.out` file and the other half to the `data2.out` file.
 
-For the `pxf` protocol, the `LOCATION` string specifies the \<host\> and \<port\> of the PXF service, the location of the data, and the PXF plug-ins (Java classes) used to convert the data between storage format and HAWQ format. If the \<port\> is omitted, the \<host\> is taken to be the logical name for the high availability name service and the \<port\> is the value of the `pxf_service_port` configuration variable, 51200 by default. The URL parameters `FRAGMENTER`, `ACCESSOR`, and `RESOLVER` are the names of PXF plug-ins (Java classes) that convert between the external data format and HAWQ data format. The `FRAGMENTER` parameter is only used with readable external tables. PXF allows combinations of these parameters to be configured as profiles so that a single `PROFILE` parameter can be specified to access external data, for example `?PROFILE=Hive`. Additional \<custom-options\>` can be added to the LOCATION URI to further describe the external data format or storage options. For details about the plug-ins and profiles provided with PXF and information about creating custom plug-ins for other data sources see [Using PXF with Unmanaged Data](../../pxf/HawqExtensionFrameworkPXF.html).</dd>
+For the `pxf` protocol, the `LOCATION` string specifies the HDFS NameNode \<host\> and the \<port\> of the PXF service, the location of the data, and the PXF profile or Java classes used to convert the data between storage format and HAWQ format. If the \<port\> is omitted, the \<host\> is taken to be the logical name for the high availability Nameservice, and the \<port\> is the value of the `pxf_service_port` configuration parameter, 51200 by default. The URL parameters `FRAGMENTER`, `ACCESSOR`, and `RESOLVER` are the names of PXF plug-ins (Java classes) that convert between the external data format and HAWQ data format. The `FRAGMENTER` parameter is only used with readable external tables. PXF allows combinations of these parameters to be configured as profiles so that a single `PROFILE` parameter can be specified to access external data, for example `?PROFILE=Hive`. Additional \<custom-options\>` can be added to the LOCATION URI to further describe the external data format or storage options. For details about the plug-ins and profiles provided with PXF and information about creating custom plug-ins for other data sources see [Using PXF with Unmanaged Data](../../pxf/HawqExtensionFrameworkPXF.html).</dd>
 
 <dt>EXECUTE '\<command\>' ON ...  </dt>
 <dd>Allowed for readable web external tables or writable external tables only. For readable web external tables, specifies the OS command to be executed by the segment instances. The \<command\> can be a single OS command or a script. If \<command\> executes a script, that script must reside in the same location on all of the segment hosts and be executable by the HAWQ superuser (`gpadmin`).
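
Illustration (not part of the commit): a minimal sketch of how the \<host\> and \<port\> wording in this change might look in two PXF external table definitions. The table names, column lists, file path, the Nameservice ID `mycluster`, and the host name `namenode` are hypothetical. The `HdfsTextSimple` profile and the default port 51200 come from the text above. The `FORMAT 'TEXT'` clause is an assumption for a comma-delimited text file.

``` sql
-- HDFS NameNode HA enabled: <host> is the HDFS Nameservice ID (hypothetical: mycluster).
-- <port> is omitted, so PXF connects to the port designated by the
-- pxf_service_port server configuration parameter (default 51200).
CREATE EXTERNAL TABLE pxf_ha_example (location text, total_sales float8)
LOCATION ('pxf://mycluster/data/pxf_examples/sales.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');

-- HA not enabled: <host> is the HDFS NameNode (hypothetical host name: namenode)
-- and <port> is the PXF port, given explicitly.
CREATE EXTERNAL TABLE pxf_namenode_example (location text, total_sales float8)
LOCATION ('pxf://namenode:51200/data/pxf_examples/sales.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');
```

Both statements point at the same hypothetical HDFS file and differ only in the \<host\>\[:\<port\>\] portion of the `LOCATION` URI.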

