hawq-commits mailing list archives

From yo...@apache.org
Subject [34/36] incubator-hawq-docs git commit: moving book configuration to new 'book' branch, for HAWQ-1027
Date Mon, 29 Aug 2016 16:47:09 GMT
http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/admin/monitor.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/monitor.html.md.erb b/admin/monitor.html.md.erb
new file mode 100644
index 0000000..1e464e2
--- /dev/null
+++ b/admin/monitor.html.md.erb
@@ -0,0 +1,424 @@
+---
+title: Monitoring a HAWQ System
+---
+
+You can monitor a HAWQ system using a variety of tools included with the system or available as add-ons.
+
+Observing the HAWQ system's day-to-day performance helps administrators understand system behavior, plan workflows, and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.
+
+Also, be sure to review [Recommended Monitoring and Maintenance Tasks](RecommendedMonitoringTasks.html) for monitoring activities you can script to quickly detect problems in the system.
+
+## <a id="topic3"></a>Monitoring System State 
+
+As a HAWQ administrator, you must monitor the system for problem events such as a segment going down or running out of disk space on a segment host. The following topics describe how to monitor the health of a HAWQ system and examine certain state information for a HAWQ system.
+
+-   [Checking System State](#topic12)
+-   [Checking Disk Space Usage](#topic15)
+-   [Viewing Metadata Information about Database Objects](#topic24)
+-   [Viewing Query Workfile Usage Information](#topic27)
+
+### <a id="topic12"></a>Checking System State 
+
+A HAWQ system is composed of multiple PostgreSQL instances \(the master and segments\) spanning multiple machines. To monitor a HAWQ system, you need to know information about the system as a whole, as well as status information for the individual instances. The `hawq state` utility provides status information about a HAWQ system.
+
+#### <a id="topic13"></a>Viewing Master and Segment Status and Configuration 
+
+The default `hawq state` action is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your HAWQ system, type:
+
+```shell
+$ hawq state -b
+```
+
+You can also display information about the HAWQ master data directory by using `hawq state` with the `-d` option:
+
+```shell
+$ hawq state -d MASTER_DIR
+```
+
+### <a id="topic15"></a>Checking Disk Space Usage 
+
+#### <a id="topic16"></a>Checking Sizing of Distributed Databases and Tables 
+
+The `hawq_toolkit` administrative schema contains several views that you can use to determine the disk space usage for a distributed HAWQ database, schema, table, or index.
+
+##### <a id="topic17"></a>Viewing Disk Space Usage for a Database 
+
+To see the total size of a database \(in bytes\), use the *hawq\_size\_of\_database* view in the *hawq\_toolkit* administrative schema. For example:
+
+```sql
+=> SELECT * FROM hawq_toolkit.hawq_size_of_database
+ORDER BY sodddatname;
+```
+
+##### <a id="topic18"></a>Viewing Disk Space Usage for a Table 
+
+The *hawq\_toolkit* administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID \(not by name\). To check the size of a table by name, you must look up the relation name \(`relname`\) in the *pg\_class* table. For example:
+
+```sql
+=> SELECT relname AS name, sotdsize AS size, sotdtoastsize
+AS toast, sotdadditionalsize AS other
+FROM hawq_size_of_table_disk as sotd, pg_class
+WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;
+```
+
+##### <a id="topic19"></a>Viewing Disk Space Usage for Indexes 
+
+The *hawq\_toolkit* administrative schema contains a number of views for checking index sizes. To see the total size of all index\(es\) on a table, use the *hawq\_size\_of\_all\_table\_indexes* view. To see the size of a particular index, use the *hawq\_size\_of\_index* view. The index sizing views list tables and indexes by object ID \(not by name\). To check the size of an index by name, you must look up the relation name \(`relname`\) in the *pg\_class* table. For example:
+
+```sql
+=> SELECT soisize, relname as indexname
+FROM pg_class, hawq_size_of_index
+WHERE pg_class.oid=hawq_size_of_index.soioid
+AND pg_class.relkind='i';
+```
+
+### <a id="topic24"></a>Viewing Metadata Information about Database Objects 
+
+HAWQ tracks various metadata information in its system catalogs about the objects stored in a database, such as tables, views, indexes and so on, as well as global objects such as roles and tablespaces.
+
+#### <a id="topic25"></a>Viewing the Last Operation Performed 
+
+You can use the system views *pg\_stat\_operations* and *pg\_stat\_partition\_operations* to look up actions performed on an object, such as a table. For example, to see the actions performed on a table, such as when it was created and when it was last vacuumed and analyzed:
+
+```sql
+=> SELECT schemaname as schema, objname as table,
+usename as role, actionname as action,
+subtype as type, statime as time
+FROM pg_stat_operations
+WHERE objname='cust';
+```
+
+```
+ schema | table | role | action  | type  | time
+--------+-------+------+---------+-------+--------------------------
+  sales | cust  | main | CREATE  | TABLE | 2010-02-09 18:10:07.867977-08
+  sales | cust  | main | VACUUM  |       | 2010-02-10 13:32:39.068219-08
+  sales | cust  | main | ANALYZE |       | 2010-02-25 16:07:01.157168-08
+(3 rows)
+
+```
+
+#### <a id="topic26"></a>Viewing the Definition of an Object 
+
+To see the definition of an object, such as a table or view, you can use the `\d+` meta-command when working in `psql`. For example, to see the definition of a table:
+
+<pre><code>=> \d+ <i>mytable</i></code></pre>
+
+
+### <a id="topic27"></a>Viewing Query Workfile Usage Information 
+
+The HAWQ administrative schema *hawq\_toolkit* contains views that display information about HAWQ workfiles. HAWQ creates workfiles on disk if it does not have sufficient memory to execute the query in memory. This information can be used for troubleshooting and tuning queries. The information in the views can also be used to specify the values for the HAWQ configuration parameters `hawq_workfile_limit_per_query` and `hawq_workfile_limit_per_segment`.
+
+These are the views in the schema *hawq\_toolkit*:
+
+-   The *hawq\_workfile\_entries* view contains one row for each operator using disk space for workfiles on a segment at the current time.
+-   The *hawq\_workfile\_usage\_per\_query* view contains one row for each query using disk space for workfiles on a segment at the current time.
+-   The *hawq\_workfile\_usage\_per\_segment* view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time.
+
+For information about using *hawq\_toolkit*, see [Using hawq\_toolkit](#topic31).
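+
+For example, to see how much workfile disk space is currently in use on each segment, you can query the per-segment view directly \(a minimal sketch; `SELECT *` is used because the exact column names can vary by HAWQ version\):
+
+```sql
+=> SELECT * FROM hawq_toolkit.hawq_workfile_usage_per_segment;
+```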
+
+## <a id="topic28"></a>Viewing the Database Server Log Files 
+
+Every database instance in HAWQ \(master and segments\) runs a PostgreSQL database server with its own server log file. Daily log files are created in the `pg_log` directory of the master and each segment data directory \(`$GPHOME/masterdd/pg_log` and `$GPHOME/segmentdd/pg_log`\).
+
+### <a id="topic29"></a>Log File Format 
+
+The server log files are written in comma-separated values \(CSV\) format. Some log entries will not have values for all log fields. For example, only log entries associated with a query worker process will have the `slice_id` populated. You can identify related log entries of a particular query by the query's session identifier \(`gp_session_id`\) and command identifier \(`gp_command_count`\).
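+
+For example, to pull together all log entries for a hypothetical session `con1234` from the master log files, you could search the CSV files directly \(a sketch; the session identifier and log path are examples only\):
+
+```shell
+$ grep 'con1234' $GPHOME/masterdd/pg_log/hawq*.csv
+```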
+
+The following fields are written to the log:
+
+<table>
+  <tr><th>#</th><th>Field Name</th><th>Data Type</th><th>Description</th></tr>
+  <tr><td>1</td><td>event_time</td><td>timestamp with time zone</td><td>Time that the log entry was written to the log</td></tr>
+  <tr><td>2</td><td>user_name</td><td>varchar(100)</td><td>The database user name</td></tr>
+  <tr><td>3</td><td>database_name</td><td>varchar(100)</td><td>The database name</td></tr>
+  <tr><td>4</td><td>process_id</td><td>varchar(10)</td><td>The system process ID (prefixed with "p")</td></tr>
+  <tr><td>5</td><td>thread_id</td><td>varchar(50)</td><td>The thread count (prefixed with "th")</td></tr>
+  <tr><td>6</td><td>remote_host</td><td>varchar(100)</td><td>On the master, the hostname/address of the client machine. On the segment, the hostname/address of the master.</td></tr>
+  <tr><td>7</td><td>remote_port</td><td>varchar(10)</td><td>The segment or master port number</td></tr>
+  <tr><td>8</td><td>session_start_time</td><td>timestamp with time zone</td><td>Time session connection was opened</td></tr>
+  <tr><td>9</td><td>transaction_id</td><td>int</td><td>Top-level transaction ID on the master. This ID is the parent of any subtransactions.</td></tr>
+  <tr><td>10</td><td>gp_session_id</td><td>text</td><td>Session identifier number (prefixed with "con")</td></tr>
+  <tr><td>11</td><td>gp_command_count</td><td>text</td><td>The command number within a session (prefixed with "cmd")</td></tr>
+  <tr><td>12</td><td>gp_segment</td><td>text</td><td>The segment content identifier. The master always has a content ID of -1.</td></tr>
+  <tr><td>13</td><td>slice_id</td><td>text</td><td>The slice ID (portion of the query plan being executed)</td></tr>
+  <tr><td>14</td><td>distr_tranx_id</td><td>text</td><td>Distributed transaction ID</td></tr>
+  <tr><td>15</td><td>local_tranx_id</td><td>text</td><td>Local transaction ID</td></tr>
+  <tr><td>16</td><td>sub_tranx_id</td><td>text</td><td>Subtransaction ID</td></tr>
+  <tr><td>17</td><td>event_severity</td><td>varchar(10)</td><td>Values include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2</td></tr>
+  <tr><td>18</td><td>sql_state_code</td><td>varchar(10)</td><td>SQL state code associated with the log message</td></tr>
+  <tr><td>19</td><td>event_message</td><td>text</td><td>Log or error message text</td></tr>
+  <tr><td>20</td><td>event_detail</td><td>text</td><td>Detail message text associated with an error or warning message</td></tr>
+  <tr><td>21</td><td>event_hint</td><td>text</td><td>Hint message text associated with an error or warning message</td></tr>
+  <tr><td>22</td><td>internal_query</td><td>text</td><td>The internally-generated query text</td></tr>
+  <tr><td>23</td><td>internal_query_pos</td><td>int</td><td>The cursor index into the internally-generated query text</td></tr>
+  <tr><td>24</td><td>event_context</td><td>text</td><td>The context in which this message gets generated</td></tr>
+  <tr><td>25</td><td>debug_query_string</td><td>text</td><td>User-supplied query string with full detail for debugging. This string can be modified for internal use.</td></tr>
+  <tr><td>26</td><td>error_cursor_pos</td><td>int</td><td>The cursor index into the query string</td></tr>
+  <tr><td>27</td><td>func_name</td><td>text</td><td>The function in which this message is generated</td></tr>
+  <tr><td>28</td><td>file_name</td><td>text</td><td>The internal code file where the message originated</td></tr>
+  <tr><td>29</td><td>file_line</td><td>int</td><td>The line of the code file where the message originated</td></tr>
+  <tr><td>30</td><td>stack_trace</td><td>text</td><td>Stack trace text associated with this message</td></tr>
+</table>
+
+### <a id="topic30"></a>Searching the HAWQ Server Log Files 
+
+HAWQ provides a utility called `gplogfilter` that can search through a HAWQ log file for entries matching specified criteria. By default, this utility searches the HAWQ master log file in the default logging location. For example, to display the entries in the master log file that were logged after 2 pm on a certain date:
+
+```shell
+$ gplogfilter -b '2016-01-18 14:00'
+```
+
+To search through all segment log files simultaneously, run `gplogfilter` through the `hawq ssh` utility. For example, specify a host file \(here, seg\_host\_log\_file\) that lists the hosts to participate in the session, then use `gplogfilter` to display the last three lines of each segment log file:
+
+```shell
+$ hawq ssh -f seg_host_log_file
+=> source ~/greenplum_path.sh
+=> gplogfilter -n 3 /data/hawq-install-path/segmentdd/pg_log/hawq*.csv
+```
+
+## <a id="topic31"></a>Using hawq\_toolkit 
+
+Use HAWQ's administrative schema *hawq\_toolkit* to query the system catalogs, log files, and operating environment for system status information. The `hawq_toolkit` schema contains several views you can access using SQL commands. The *hawq\_toolkit* schema is accessible to all database users. Some objects require superuser permissions. Use a command similar to the following to add the *hawq\_toolkit* schema to your schema search path:
+
+```sql
+=> ALTER ROLE myrole SET search_path TO myschema,hawq_toolkit;
+```
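+
+To pick up the toolkit views for the current session only, rather than permanently for a role, you can set the search path directly \(a sketch\):
+
+```sql
+=> SET search_path TO myschema, hawq_toolkit;
+```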
+
+## <a id="topic_jx2_rqg_kp"></a>HAWQ Error Codes 
+
+The following section describes SQL error codes for certain database events.
+
+### <a id="topic_pyh_sqg_kp"></a>SQL Standard Error Codes 
+
+The following table lists all the defined error codes. Some are not used, but are defined by the SQL standard. The error classes are also shown. For each error class there is a standard error code having the last three characters 000. This code is used only for error conditions that fall within the class but do not have any more-specific code assigned.
+
+The PL/pgSQL condition name for each error code is the same as the phrase shown in the table, with underscores substituted for spaces. For example, code 22012, DIVISION BY ZERO, has condition name DIVISION\_BY\_ZERO. Condition names can be written in either upper or lower case.
+
+**Note:** PL/pgSQL does not recognize warning, as opposed to error, condition names; those are classes 00, 01, and 02.
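+
+For example, in a PL/pgSQL function you can trap error 22012 by its condition name \(a minimal sketch; the function is hypothetical and assumes PL/pgSQL is enabled in your database\):
+
+```sql
+CREATE OR REPLACE FUNCTION safe_divide(a numeric, b numeric) RETURNS numeric AS $$
+BEGIN
+    RETURN a / b;
+EXCEPTION
+    WHEN division_by_zero THEN    -- condition name for SQLSTATE 22012
+        RETURN NULL;
+END;
+$$ LANGUAGE plpgsql;
+```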
+
+|Error Code|Meaning|Constant|
+|----------|-------|--------|
+|**Class 00** — Successful Completion|
+|00000|SUCCESSFUL COMPLETION|successful\_completion|
+|**Class 01** — Warning|
+|01000|WARNING|warning|
+|0100C|DYNAMIC RESULT SETS RETURNED|dynamic\_result\_sets\_returned|
+|01008|IMPLICIT ZERO BIT PADDING|implicit\_zero\_bit\_padding|
+|01003|NULL VALUE ELIMINATED IN SET FUNCTION|null\_value\_eliminated\_in\_set\_function|
+|01007|PRIVILEGE NOT GRANTED|privilege\_not\_granted|
+|01006|PRIVILEGE NOT REVOKED|privilege\_not\_revoked|
+|01004|STRING DATA RIGHT TRUNCATION|string\_data\_right\_truncation|
+|01P01|DEPRECATED FEATURE|deprecated\_feature|
+|**Class 02** — No Data \(this is also a warning class per the SQL standard\)|
+|02000|NO DATA|no\_data|
+|02001|NO ADDITIONAL DYNAMIC RESULT SETS RETURNED|no\_additional\_dynamic\_result\_sets\_returned|
+|**Class 03** — SQL Statement Not Yet Complete|
+|03000|SQL STATEMENT NOT YET COMPLETE|sql\_statement\_not\_yet\_complete|
+|**Class 08** — Connection Exception|
+|08000|CONNECTION EXCEPTION|connection\_exception|
+|08003|CONNECTION DOES NOT EXIST|connection\_does\_not\_exist|
+|08006|CONNECTION FAILURE|connection\_failure|
+|08001|SQLCLIENT UNABLE TO ESTABLISH SQLCONNECTION|sqlclient\_unable\_to\_establish\_sqlconnection|
+|08004|SQLSERVER REJECTED ESTABLISHMENT OF SQLCONNECTION|sqlserver\_rejected\_establishment\_of\_sqlconnection|
+|08007|TRANSACTION RESOLUTION UNKNOWN|transaction\_resolution\_unknown|
+|08P01|PROTOCOL VIOLATION|protocol\_violation|
+|**Class 09** — Triggered Action Exception|
+|09000|TRIGGERED ACTION EXCEPTION|triggered\_action\_exception|
+|**Class 0A** — Feature Not Supported|
+|0A000|FEATURE NOT SUPPORTED|feature\_not\_supported|
+|**Class 0B** — Invalid Transaction Initiation|
+|0B000|INVALID TRANSACTION INITIATION|invalid\_transaction\_initiation|
+|**Class 0F** — Locator Exception|
+|0F000|LOCATOR EXCEPTION|locator\_exception|
+|0F001|INVALID LOCATOR SPECIFICATION|invalid\_locator\_specification|
+|**Class 0L** — Invalid Grantor|
+|0L000|INVALID GRANTOR|invalid\_grantor|
+|0LP01|INVALID GRANT OPERATION|invalid\_grant\_operation|
+|**Class 0P** — Invalid Role Specification|
+|0P000|INVALID ROLE SPECIFICATION|invalid\_role\_specification|
+|**Class 21** — Cardinality Violation|
+|21000|CARDINALITY VIOLATION|cardinality\_violation|
+|**Class 22** — Data Exception|
+|22000|DATA EXCEPTION|data\_exception|
+|2202E|ARRAY SUBSCRIPT ERROR|array\_subscript\_error|
+|22021|CHARACTER NOT IN REPERTOIRE|character\_not\_in\_repertoire|
+|22008|DATETIME FIELD OVERFLOW|datetime\_field\_overflow|
+|22012|DIVISION BY ZERO|division\_by\_zero|
+|22005|ERROR IN ASSIGNMENT|error\_in\_assignment|
+|2200B|ESCAPE CHARACTER CONFLICT|escape\_character\_conflict|
+|22022|INDICATOR OVERFLOW|indicator\_overflow|
+|22015|INTERVAL FIELD OVERFLOW|interval\_field\_overflow|
+|2201E|INVALID ARGUMENT FOR LOGARITHM|invalid\_argument\_for\_logarithm|
+|2201F|INVALID ARGUMENT FOR POWER FUNCTION|invalid\_argument\_for\_power\_function|
+|2201G|INVALID ARGUMENT FOR WIDTH BUCKET FUNCTION|invalid\_argument\_for\_width\_bucket\_function|
+|22018|INVALID CHARACTER VALUE FOR CAST|invalid\_character\_value\_for\_cast|
+|22007|INVALID DATETIME FORMAT|invalid\_datetime\_format|
+|22019|INVALID ESCAPE CHARACTER|invalid\_escape\_character|
+|2200D|INVALID ESCAPE OCTET|invalid\_escape\_octet|
+|22025|INVALID ESCAPE SEQUENCE|invalid\_escape\_sequence|
+|22P06|NONSTANDARD USE OF ESCAPE CHARACTER|nonstandard\_use\_of\_escape\_character|
+|22010|INVALID INDICATOR PARAMETER VALUE|invalid\_indicator\_parameter\_value|
+|22020|INVALID LIMIT VALUE|invalid\_limit\_value|
+|22023|INVALID PARAMETER VALUE|invalid\_parameter\_value|
+|2201B|INVALID REGULAR EXPRESSION|invalid\_regular\_expression|
+|22009|INVALID TIME ZONE DISPLACEMENT VALUE|invalid\_time\_zone\_displacement\_value|
+|2200C|INVALID USE OF ESCAPE CHARACTER|invalid\_use\_of\_escape\_character|
+|2200G|MOST SPECIFIC TYPE MISMATCH|most\_specific\_type\_mismatch|
+|22004|NULL VALUE NOT ALLOWED|null\_value\_not\_allowed|
+|22002|NULL VALUE NO INDICATOR PARAMETER|null\_value\_no\_indicator\_parameter|
+|22003|NUMERIC VALUE OUT OF RANGE|numeric\_value\_out\_of\_range|
+|22026|STRING DATA LENGTH MISMATCH|string\_data\_length\_mismatch|
+|22001|STRING DATA RIGHT TRUNCATION|string\_data\_right\_truncation|
+|22011|SUBSTRING ERROR|substring\_error|
+|22027|TRIM ERROR|trim\_error|
+|22024|UNTERMINATED C STRING|unterminated\_c\_string|
+|2200F|ZERO LENGTH CHARACTER STRING|zero\_length\_character\_string|
+|22P01|FLOATING POINT EXCEPTION|floating\_point\_exception|
+|22P02|INVALID TEXT REPRESENTATION|invalid\_text\_representation|
+|22P03|INVALID BINARY REPRESENTATION|invalid\_binary\_representation|
+|22P04|BAD COPY FILE FORMAT|bad\_copy\_file\_format|
+|22P05|UNTRANSLATABLE CHARACTER|untranslatable\_character|
+|**Class 23** — Integrity Constraint Violation|
+|23000|INTEGRITY CONSTRAINT VIOLATION|integrity\_constraint\_violation|
+|23001|RESTRICT VIOLATION|restrict\_violation|
+|23502|NOT NULL VIOLATION|not\_null\_violation|
+|23503|FOREIGN KEY VIOLATION|foreign\_key\_violation|
+|23505|UNIQUE VIOLATION|unique\_violation|
+|23514|CHECK VIOLATION|check\_violation|
+|**Class 24** — Invalid Cursor State|
+|24000|INVALID CURSOR STATE|invalid\_cursor\_state|
+|**Class 25** — Invalid Transaction State|
+|25000|INVALID TRANSACTION STATE|invalid\_transaction\_state|
+|25001|ACTIVE SQL TRANSACTION|active\_sql\_transaction|
+|25002|BRANCH TRANSACTION ALREADY ACTIVE|branch\_transaction\_already\_active|
+|25008|HELD CURSOR REQUIRES SAME ISOLATION LEVEL|held\_cursor\_requires\_same\_isolation\_level|
+|25003|INAPPROPRIATE ACCESS MODE FOR BRANCH TRANSACTION|inappropriate\_access\_mode\_for\_branch\_transaction|
+|25004|INAPPROPRIATE ISOLATION LEVEL FOR BRANCH TRANSACTION|inappropriate\_isolation\_level\_for\_branch\_transaction|
+|25005|NO ACTIVE SQL TRANSACTION FOR BRANCH TRANSACTION|no\_active\_sql\_transaction\_for\_branch\_transaction|
+|25006|READ ONLY SQL TRANSACTION|read\_only\_sql\_transaction|
+|25007|SCHEMA AND DATA STATEMENT MIXING NOT SUPPORTED|schema\_and\_data\_statement\_mixing\_not\_supported|
+|25P01|NO ACTIVE SQL TRANSACTION|no\_active\_sql\_transaction|
+|25P02|IN FAILED SQL TRANSACTION|in\_failed\_sql\_transaction|
+|**Class 26** — Invalid SQL Statement Name|
+|26000|INVALID SQL STATEMENT NAME|invalid\_sql\_statement\_name|
+|**Class 27** — Triggered Data Change Violation|
+|27000|TRIGGERED DATA CHANGE VIOLATION|triggered\_data\_change\_violation|
+|**Class 28** — Invalid Authorization Specification|
+|28000|INVALID AUTHORIZATION SPECIFICATION|invalid\_authorization\_specification|
+|**Class 2B** — Dependent Privilege Descriptors Still Exist|
+|2B000|DEPENDENT PRIVILEGE DESCRIPTORS STILL EXIST|dependent\_privilege\_descriptors\_still\_exist|
+|2BP01|DEPENDENT OBJECTS STILL EXIST|dependent\_objects\_still\_exist|
+|**Class 2D** — Invalid Transaction Termination|
+|2D000|INVALID TRANSACTION TERMINATION|invalid\_transaction\_termination|
+|**Class 2F** — SQL Routine Exception|
+|2F000|SQL ROUTINE EXCEPTION|sql\_routine\_exception|
+|2F005|FUNCTION EXECUTED NO RETURN STATEMENT|function\_executed\_no\_return\_statement|
+|2F002|MODIFYING SQL DATA NOT PERMITTED|modifying\_sql\_data\_not\_permitted|
+|2F003|PROHIBITED SQL STATEMENT ATTEMPTED|prohibited\_sql\_statement\_attempted|
+|2F004|READING SQL DATA NOT PERMITTED|reading\_sql\_data\_not\_permitted|
+|**Class 34** — Invalid Cursor Name|
+|34000|INVALID CURSOR NAME|invalid\_cursor\_name|
+|**Class 38** — External Routine Exception|
+|38000|EXTERNAL ROUTINE EXCEPTION|external\_routine\_exception|
+|38001|CONTAINING SQL NOT PERMITTED|containing\_sql\_not\_permitted|
+|38002|MODIFYING SQL DATA NOT PERMITTED|modifying\_sql\_data\_not\_permitted|
+|38003|PROHIBITED SQL STATEMENT ATTEMPTED|prohibited\_sql\_statement\_attempted|
+|38004|READING SQL DATA NOT PERMITTED|reading\_sql\_data\_not\_permitted|
+|**Class 39** — External Routine Invocation Exception|
+|39000|EXTERNAL ROUTINE INVOCATION EXCEPTION|external\_routine\_invocation\_exception|
+|39001|INVALID SQLSTATE RETURNED|invalid\_sqlstate\_returned|
+|39004|NULL VALUE NOT ALLOWED|null\_value\_not\_allowed|
+|39P01|TRIGGER PROTOCOL VIOLATED|trigger\_protocol\_violated|
+|39P02|SRF PROTOCOL VIOLATED|srf\_protocol\_violated|
+|**Class 3B** — Savepoint Exception|
+|3B000|SAVEPOINT EXCEPTION|savepoint\_exception|
+|3B001|INVALID SAVEPOINT SPECIFICATION|invalid\_savepoint\_specification|
+|**Class 3D** — Invalid Catalog Name|
+|3D000|INVALID CATALOG NAME|invalid\_catalog\_name|
+|**Class 3F** — Invalid Schema Name|
+|3F000|INVALID SCHEMA NAME|invalid\_schema\_name|
+|**Class 40** — Transaction Rollback|
+|40000|TRANSACTION ROLLBACK|transaction\_rollback|
+|40002|TRANSACTION INTEGRITY CONSTRAINT VIOLATION|transaction\_integrity\_constraint\_violation|
+|40001|SERIALIZATION FAILURE|serialization\_failure|
+|40003|STATEMENT COMPLETION UNKNOWN|statement\_completion\_unknown|
+|40P01|DEADLOCK DETECTED|deadlock\_detected|
+|**Class 42** — Syntax Error or Access Rule Violation|
+|42000|SYNTAX ERROR OR ACCESS RULE VIOLATION|syntax\_error\_or\_access\_rule\_violation|
+|42601|SYNTAX ERROR|syntax\_error|
+|42501|INSUFFICIENT PRIVILEGE|insufficient\_privilege|
+|42846|CANNOT COERCE|cannot\_coerce|
+|42803|GROUPING ERROR|grouping\_error|
+|42830|INVALID FOREIGN KEY|invalid\_foreign\_key|
+|42602|INVALID NAME|invalid\_name|
+|42622|NAME TOO LONG|name\_too\_long|
+|42939|RESERVED NAME|reserved\_name|
+|42804|DATATYPE MISMATCH|datatype\_mismatch|
+|42P18|INDETERMINATE DATATYPE|indeterminate\_datatype|
+|42809|WRONG OBJECT TYPE|wrong\_object\_type|
+|42703|UNDEFINED COLUMN|undefined\_column|
+|42883|UNDEFINED FUNCTION|undefined\_function|
+|42P01|UNDEFINED TABLE|undefined\_table|
+|42P02|UNDEFINED PARAMETER|undefined\_parameter|
+|42704|UNDEFINED OBJECT|undefined\_object|
+|42701|DUPLICATE COLUMN|duplicate\_column|
+|42P03|DUPLICATE CURSOR|duplicate\_cursor|
+|42P04|DUPLICATE DATABASE|duplicate\_database|
+|42723|DUPLICATE FUNCTION|duplicate\_function|
+|42P05|DUPLICATE PREPARED STATEMENT|duplicate\_prepared\_statement|
+|42P06|DUPLICATE SCHEMA|duplicate\_schema|
+|42P07|DUPLICATE TABLE|duplicate\_table|
+|42712|DUPLICATE ALIAS|duplicate\_alias|
+|42710|DUPLICATE OBJECT|duplicate\_object|
+|42702|AMBIGUOUS COLUMN|ambiguous\_column|
+|42725|AMBIGUOUS FUNCTION|ambiguous\_function|
+|42P08|AMBIGUOUS PARAMETER|ambiguous\_parameter|
+|42P09|AMBIGUOUS ALIAS|ambiguous\_alias|
+|42P10|INVALID COLUMN REFERENCE|invalid\_column\_reference|
+|42611|INVALID COLUMN DEFINITION|invalid\_column\_definition|
+|42P11|INVALID CURSOR DEFINITION|invalid\_cursor\_definition|
+|42P12|INVALID DATABASE DEFINITION|invalid\_database\_definition|
+|42P13|INVALID FUNCTION DEFINITION|invalid\_function\_definition|
+|42P14|INVALID PREPARED STATEMENT DEFINITION|invalid\_prepared\_statement\_definition|
+|42P15|INVALID SCHEMA DEFINITION|invalid\_schema\_definition|
+|42P16|INVALID TABLE DEFINITION|invalid\_table\_definition|
+|42P17|INVALID OBJECT DEFINITION|invalid\_object\_definition|
+|**Class 44** — WITH CHECK OPTION Violation|
+|44000|WITH CHECK OPTION VIOLATION|with\_check\_option\_violation|
+|**Class 53** — Insufficient Resources|
+|53000|INSUFFICIENT RESOURCES|insufficient\_resources|
+|53100|DISK FULL|disk\_full|
+|53200|OUT OF MEMORY|out\_of\_memory|
+|53300|TOO MANY CONNECTIONS|too\_many\_connections|
+|**Class 54** — Program Limit Exceeded|
+|54000|PROGRAM LIMIT EXCEEDED|program\_limit\_exceeded|
+|54001|STATEMENT TOO COMPLEX|statement\_too\_complex|
+|54011|TOO MANY COLUMNS|too\_many\_columns|
+|54023|TOO MANY ARGUMENTS|too\_many\_arguments|
+|**Class 55** — Object Not In Prerequisite State|
+|55000|OBJECT NOT IN PREREQUISITE STATE|object\_not\_in\_prerequisite\_state|
+|55006|OBJECT IN USE|object\_in\_use|
+|55P02|CANT CHANGE RUNTIME PARAM|cant\_change\_runtime\_param|
+|55P03|LOCK NOT AVAILABLE|lock\_not\_available|
+|**Class 57** — Operator Intervention|
+|57000|OPERATOR INTERVENTION|operator\_intervention|
+|57014|QUERY CANCELED|query\_canceled|
+|57P01|ADMIN SHUTDOWN|admin\_shutdown|
+|57P02|CRASH SHUTDOWN|crash\_shutdown|
+|57P03|CANNOT CONNECT NOW|cannot\_connect\_now|
+|**Class 58** — System Error \(errors external to HAWQ\)|
+|58030|IO ERROR|io\_error|
+|58P01|UNDEFINED FILE|undefined\_file|
+|58P02|DUPLICATE FILE|duplicate\_file|
+|**Class F0** — Configuration File Error|
+|F0000|CONFIG FILE ERROR|config\_file\_error|
+|F0001|LOCK FILE EXISTS|lock\_file\_exists|
+|**Class P0** — PL/pgSQL Error|
+|P0000|PLPGSQL ERROR|plpgsql\_error|
+|P0001|RAISE EXCEPTION|raise\_exception|
+|P0002|NO DATA FOUND|no\_data\_found|
+|P0003|TOO MANY ROWS|too\_many\_rows|
+|**Class XX** — Internal Error|
+|XX000|INTERNAL ERROR|internal\_error|
+|XX001|DATA CORRUPTED|data\_corrupted|
+|XX002|INDEX CORRUPTED|index\_corrupted|

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/admin/startstop.html.md.erb
----------------------------------------------------------------------
diff --git a/admin/startstop.html.md.erb b/admin/startstop.html.md.erb
new file mode 100644
index 0000000..106fa7d
--- /dev/null
+++ b/admin/startstop.html.md.erb
@@ -0,0 +1,157 @@
+---
+title: Starting and Stopping HAWQ
+---
+
+In a HAWQ DBMS, the database server instances \(the master and all segments\) are started or stopped across all of the hosts in the system in such a way that they can work together as a unified DBMS.
+
+Because a HAWQ system is distributed across many machines, the process for starting and stopping a HAWQ system is different than the process for starting and stopping a regular PostgreSQL DBMS.
+
+Use the `hawq start `*`object`* and `hawq stop `*`object`* commands to start and stop HAWQ, respectively. These management tools are located in the `$GPHOME/bin` directory on your HAWQ master host. Initializing a HAWQ system also starts it.
+
+**Important:**
+
+Do not issue an operating system `kill` command to end any Postgres process. Instead, use the database function `pg_cancel_backend()`.
+
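+For example, to cancel a backend whose process ID you have looked up \(for instance, in the *pg\_stat\_activity* view\), run a statement along the following lines; the PID shown here is hypothetical:
+
+```sql
+=> SELECT pg_cancel_backend(12345);
+```
+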
+For information about [hawq start](/20/reference/cli/admin_utilities/hawqstart.html) and [hawq stop](/20/reference/cli/admin_utilities/hawqstop.html), see the appropriate pages in the HAWQ Management Utility Reference or enter `hawq start -h` or `hawq stop -h` on the command line.
+
+## <a id="task_g1y_xtm_s5"></a>Initialize HAWQ 
+
+Initialize and start the HAWQ system using configuration parameters defined in `$GPHOME/etc/hawq-site.xml`.
+
+The `hawq init` command with the appropriate cluster or node command initializes and starts a HAWQ cluster. The master or segment nodes can be individually initialized by using `hawq init master` and `hawq init segment` commands, respectively. Format options can also be specified at this time.
+
+The `hawq init <object>` utility creates a HAWQ instance using the configuration parameters defined in `$GPHOME/etc/hawq-site.xml`. A single-node cluster can be started without any user-defined changes to the default `hawq-site.xml` file. Use the `template-hawq-site.xml` file to specify the configuration for larger clusters.
+
+When using the template to initialize a new cluster configuration, replace the items contained within the `%` markers. For example, in `<value>%master.host%</value>`, replace `%master.host%` with the master host name. After modification, rename the file to the name of the default configuration file: `hawq-site.xml`.
+
+-   Before initializing HAWQ, set the `$GPHOME` environment variable to point to the location of your HAWQ installation on the master host and exchange SSH keys between all host addresses in the array, using `hawq ssh-exkeys`.
+-   To initialize and start a HAWQ cluster, enter the following command on the master host:
+
+    ```shell
+    $ hawq init cluster
+    ```
+
+
+## <a id="task_hkd_gzv_fp"></a>Starting HAWQ 
+
+Start an initialized HAWQ system by running the `hawq start` command on the master instance.
+
+Use the `hawq start cluster` command to start a HAWQ system that has already been initialized by the `hawq init cluster` command, but has been stopped by the `hawq stop cluster` command. The `hawq start cluster` command starts HAWQ by starting all the segments on the HAWQ cluster. `hawq start cluster` orchestrates this process and performs the process in parallel.
+
+You can also use the `hawq start master` command to start only the HAWQ master, without segment nodes, and then add segments later using `hawq start segment`. If you want HAWQ to ignore hosts that fail SSH validation, use the `hawq start --ignore-bad-hosts` option.
+
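+For example, to bring the cluster up piecemeal, you might start the master first and then start the local segment on each segment host \(a sketch; run the segment command on every segment host, for example through `hawq ssh`\):
+
+```shell
+$ hawq start master
+$ hawq start segment
+```
+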
+-   Run `hawq start cluster` on the master host to start a HAWQ system:
+
+    ```shell
+    $ hawq start cluster
+    ```
+
+    **Note:**
+
+    When the HAWQ system is first initialized with the `hawq init` command, it is automatically started.
+
+
+## <a id="task_gpdb_restart"></a>Restarting HAWQ 
+
+Stop the HAWQ system and then restart it.
+
+The `hawq restart` command with the appropriate cluster or node command can stop and then restart HAWQ after the shutdown completes. If the master or segments are already stopped, restart will have no effect.
+
+-   To restart a HAWQ cluster, enter the following command on the master host:
+
+    ```shell
+    $ hawq restart cluster
+    ```
+
+
+## <a id="task_upload_config"></a>Reloading Configuration File Changes Only 
+
+Reload changes to the HAWQ configuration files without interrupting the system.
+
+The `hawq stop` command can reload changes to the `pg_hba.conf` configuration file and to *runtime* parameters in the `hawq-site.xml` file without service interruption. Active sessions pick up changes when they reconnect to the database. Many server configuration parameters require a full system restart \(`hawq restart cluster`\) to take effect. For information about server configuration parameters, see the [Server Configuration Parameter Reference](/20/reference/guc/guc_config.html).
+
+-   Reload configuration file changes without shutting down the system using the `hawq stop` command:
+
+    ```shell
+    $ hawq stop --reload
+    ```
+
+
+## <a id="task_maint_mode"></a>Starting the Master in Maintenance Mode 
+
+Start only the master to perform maintenance or administrative tasks without affecting data on the segments.
+
+Maintenance mode is a superuser-only mode that should only be used when required for a particular maintenance task. For example, you can connect to a database only on the master instance in maintenance mode and edit system catalog settings.
+
+1.  Run `hawq start master` with the `-m` option:
+
+    ```shell
+    $ hawq start master -m
+    ```
+
+2.  Connect to the master in maintenance mode to do catalog maintenance. For example:
+
+    ```shell
+    $ PGOPTIONS='-c gp_session_role=utility' psql template1
+    ```
+3.  After completing your administrative tasks, restart the master in production mode. 
+
+    ```shell
+    $ hawq restart master 
+    ```
+
+    **Warning:**
+
+    Incorrect use of maintenance mode connections can result in an inconsistent system state. Only expert users should perform this operation.
+
+
+## <a id="task_gpdb_stop"></a>Stopping HAWQ 
+
+The `hawq stop cluster` command stops or restarts your HAWQ system and always runs on the master host. When invoked, `hawq stop cluster` stops all `postgres` processes in the system, including the master and all segment instances. The `hawq stop cluster` command uses a default of up to 64 parallel worker threads to bring down the segments that make up the HAWQ cluster. The system waits for any active transactions to finish before shutting down. To stop HAWQ immediately, use fast mode. You can use the `hawq stop master`, `hawq stop segment`, `hawq stop standby`, and `hawq stop allsegments` commands to stop the master, the local segment node, the standby, or all segments in the cluster, respectively. Stopping the master stops only the master instance; it does not shut down the cluster.
+
+-   To stop HAWQ:
+
+    ```shell
+    $ hawq stop cluster
+    ```
+
+-   To stop HAWQ in fast mode:
+
+    ```shell
+    $ hawq stop cluster -M fast
+    ```
+
+
+## <a id="task_tx4_bl3_h5"></a>Best Practices to Start/Stop HAWQ Cluster Members 
+
+For best results in using `hawq start` and `hawq stop` to manage your HAWQ system, the following best practices are recommended.
+
+-   Issue the `CHECKPOINT` command to update and flush all data files to disk and update the log file before stopping the cluster. A checkpoint ensures that, in the event of a crash, files can be restored from the checkpoint snapshot.
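+
+    For example, you might issue the checkpoint through `psql` just before stopping the cluster \(a sketch; `template1` is simply one database you can connect to\):
+
+    ```shell
+    $ psql -d template1 -c 'CHECKPOINT;'
+    ```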
+
+-   Stop the entire HAWQ system by stopping the cluster on the master host. 
+
+    ```shell
+    $ hawq stop cluster
+    ```
+
+-   To stop segments and kill any running queries without causing data loss or inconsistency issues, use `fast` or `immediate` mode on the cluster:
+
+    ```shell
+    $ hawq stop cluster -M fast
+    $ hawq stop cluster -M immediate
+    ```
+
+-   Use `hawq stop master` to stop the master only. If you cannot stop the master due to running transactions, try using fast shutdown. If fast shutdown does not work, use immediate shutdown. Use immediate shutdown with caution, as it will result in a crash-recovery run when the system is restarted.
+
+    ```shell
+    $ hawq stop master -M fast
+    $ hawq stop master -M immediate
+    ```
+
+-   When stopping a segment or all segments, you can use the default smart mode. Using fast or immediate mode on segments has no effect, since segments are stateless.
+
+    ```shell
+    $ hawq stop segment
+    $ hawq stop allsegments
+    ```
+
+-   Typically you should always use `hawq start cluster` or `hawq restart cluster` to start the cluster. If you do end up using `hawq start standby|master|segment` to start nodes individually, make sure you always start the standby before the active master. Otherwise, the standby can become unsynchronized with the active master.
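+
+    A sketch of that ordering when starting cluster members individually:
+
+    ```shell
+    $ hawq start standby
+    $ hawq start master
+    ```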

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/HAWQBestPracticesOverview.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/HAWQBestPracticesOverview.html.md.erb b/bestpractices/HAWQBestPracticesOverview.html.md.erb
new file mode 100644
index 0000000..6277727
--- /dev/null
+++ b/bestpractices/HAWQBestPracticesOverview.html.md.erb
@@ -0,0 +1,31 @@
+---
+title: Best Practices
+---
+
+This chapter provides best practices on using the components and features that are part of a HAWQ system.
+
+-   **[HAWQ Best Practices](../bestpractices/general_bestpractices.html)**
+
+    This topic addresses general best practices for using HAWQ.
+
+-   **[Best Practices for Operating HAWQ](../bestpractices/operating_hawq_bestpractices.html)**
+
+    This topic provides best practices for operating HAWQ, including recommendations for stopping, starting and monitoring HAWQ.
+
+-   **[Best Practices for Securing HAWQ](../bestpractices/secure_bestpractices.html)**
+
+    To secure your HAWQ deployment, review the recommendations listed in this topic.
+
+-   **[Best Practices for Managing Resources](../bestpractices/managing_resources_bestpractices.html)**
+
+    This topic describes best practices for managing resources in HAWQ.
+
+-   **[Best Practices for Managing Data](../bestpractices/managing_data_bestpractices.html)**
+
+    This topic describes best practices for creating databases, loading data, partitioning data, and recovering data in HAWQ.
+
+-   **[Best Practices for Querying Data](../bestpractices/querying_data_bestpractices.html)**
+
+    To obtain the best results when querying data in HAWQ, review the best practices described in this topic.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/general_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/general_bestpractices.html.md.erb b/bestpractices/general_bestpractices.html.md.erb
new file mode 100644
index 0000000..6c663c3
--- /dev/null
+++ b/bestpractices/general_bestpractices.html.md.erb
@@ -0,0 +1,24 @@
+---
+title: HAWQ Best Practices
+---
+
+This topic addresses general best practices for users who are new to HAWQ.
+
+When using HAWQ, adhere to the following guidelines for best results:
+
+-   **Use a consistent `hawq-site.xml` file to configure your entire cluster**:
+
+    Configuration parameters \(GUCs\) are located in `$GPHOME/etc/hawq-site.xml`. This configuration file resides on all HAWQ instances and can be modified by using the `hawq config` utility. Use the same configuration file cluster-wide across both the master and segments.
+
+    **Note:** While `postgresql.conf` still exists in HAWQ, any parameters defined in `hawq-site.xml` will overwrite configurations in `postgresql.conf`. For this reason, we recommend that you only use `hawq-site.xml` to configure your HAWQ cluster.
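+
+    For example, here is a sketch of checking and then changing a parameter cluster-wide with `hawq config`; the parameter name is just an example, and the exact flags can vary by HAWQ version, so confirm them with `hawq config -h`:
+
+    ```shell
+    $ hawq config -s hawq_rm_stmt_vseg_memory
+    $ hawq config -c hawq_rm_stmt_vseg_memory -v 256mb
+    ```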
+
+-   **Keep in mind the factors that impact the number of virtual segments used for queries. The number of virtual segments used directly impacts the query's performance.** The degree of parallelism achieved by a query is determined by multiple factors, including the following:
+    -   **Cost of the query**. Small queries use fewer segments and larger queries use more segments. Note that there are some techniques you can use when defining resource queues to influence the number of virtual segments and general resources that are allocated to queries. See [Best Practices for Using Resource Queues](managing_resources_bestpractices.html#topic_hvd_pls_wv).
+    -   **Available resources**. Resources available at query time. If more resources are available in the resource queue, the resources will be used.
+    -   **Hash table and bucket number**. If the query involves only hash-distributed tables, and either the bucket number \(`bucketnum`\) configured for all of the hash tables is the same, or the size of any randomly distributed table in the query is no more than 1.5 times the size of the hash-distributed tables, then the query's parallelism is fixed \(equal to the hash table bucket number\). Otherwise, the number of virtual segments depends on the query's cost, and queries on hash-distributed tables behave like queries on randomly distributed tables.
+    -   **Query Type**: For queries that include user-defined functions, or for external tables where calculating resource costs is difficult, the number of virtual segments is controlled by the `hawq_rm_nvseg_perquery_limit` and `hawq_rm_nvseg_perquery_perseg_limit` parameters, as well as by the ON clause and the location list of external tables. If the query has a hash result table \(for example, `INSERT INTO hash_table`\), the number of virtual segments must be equal to the bucket number of the resulting hash table. If the query is performed in utility mode, as with `COPY` and `ANALYZE` operations, the number of virtual segments is calculated using different policies.
+    -   **PXF**: PXF external tables use the `default_hash_table_bucket_number` parameter, not the `hawq_rm_nvseg_perquery_perseg_limit` parameter, to control the number of virtual segments. 
+
+    See [Query Performance](../query/query-performance.html#topic38) for more details.
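+
+For example, you can check the server configuration parameters that cap virtual segment counts from any SQL session \(a sketch\):
+
+```sql
+=> SHOW hawq_rm_nvseg_perquery_limit;
+=> SHOW hawq_rm_nvseg_perquery_perseg_limit;
+=> SHOW default_hash_table_bucket_number;
+```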
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/managing_data_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/managing_data_bestpractices.html.md.erb b/bestpractices/managing_data_bestpractices.html.md.erb
new file mode 100644
index 0000000..01f82cd
--- /dev/null
+++ b/bestpractices/managing_data_bestpractices.html.md.erb
@@ -0,0 +1,47 @@
+---
+title: Best Practices for Managing Data
+---
+
+This topic describes best practices for creating databases, loading data, partitioning data, and recovering data in HAWQ.
+
+## <a id="topic_xhy_v2j_1v"></a>Best Practices for Loading Data
+
+Loading data into HDFS is challenging due to the limit on the number of files that can be opened concurrently for write on both NameNodes and DataNodes.
+
+To obtain the best performance during data loading, observe the following best practices:
+
+-   Typically, the number of concurrent connections to a NameNode should not exceed 50,000, and the number of open files per DataNode should not exceed 10,000. If you exceed these limits, the NameNode and DataNodes may become overloaded and slow.
+-   If the number of partitions in a table is large, the recommended way to load data into the partitioned table is to load the data partition by partition. For example, you can use a query such as the following to load data into only one partition:
+
+    ```sql
+    insert into target_partitioned_table_part1 select * from source_table where filter
+    ```
+
+    where *filter* selects only the data in the target partition.
+
+-   To alleviate the load on the NameNode, you can reduce the number of virtual segments used per node. You can do this at the statement level or at the resource queue level. See [Configuring the Maximum Number of Virtual Segments](../resourcemgmt/ConfigureResourceManagement.html#topic_tl5_wq1_f5) for more information.
+-   Use resource queues to limit load query and read query concurrency.
+
+The best practice for loading data into partitioned tables is to create an intermediate staging table, load it, and then exchange it into your partition design. See [Exchanging a Partition](../ddl/ddl-partition.html#topic83).
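+
+A sketch of that pattern \(the table names are hypothetical; see the referenced topic for the exact `EXCHANGE PARTITION` syntax\):
+
+```sql
+CREATE TABLE staging_table (LIKE target_partitioned_table);
+-- load staging_table (for example with INSERT ... SELECT or COPY), then swap it in:
+ALTER TABLE target_partitioned_table EXCHANGE PARTITION FOR (DATE '2016-01-01') WITH TABLE staging_table;
+```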
+
+## <a id="topic_s23_52j_1v"></a>Best Practices for Partitioning Data
+
+### <a id="topic65"></a>Deciding on a Table Partitioning Strategy
+
+Not all tables are good candidates for partitioning. If the answer is *yes* to all or most of the following questions, table partitioning is a viable database design strategy for improving query performance. If the answer is *no* to most of the following questions, table partitioning is not the right solution for that table. Test your design strategy to ensure that query performance improves as expected.
+
+-   **Is the table large enough?** Large fact tables are good candidates for table partitioning. If you have millions or billions of records in a table, you may see performance benefits from logically breaking that data up into smaller chunks. For smaller tables with only a few thousand rows or less, the administrative overhead of maintaining the partitions will outweigh any performance benefits you might see.
+-   **Are you experiencing unsatisfactory performance?** As with any performance tuning initiative, a table should be partitioned only if queries against that table are producing slower response times than desired.
+-   **Do your query predicates have identifiable access patterns?** Examine the `WHERE` clauses of your query workload and look for table columns that are consistently used to access data. For example, if most of your queries tend to look up records by date, then a monthly or weekly date-partitioning design might be beneficial \(a sketch follows this list\). Or if you tend to access records by region, consider a list-partitioning design to divide the table by region.
+-   **Does your data warehouse maintain a window of historical data?** Another consideration for partition design is your organization's business requirements for maintaining historical data. For example, your data warehouse may require that you keep data for the past twelve months. If the data is partitioned by month, you can easily drop the oldest monthly partition from the warehouse and load current data into the most recent monthly partition.
+-   **Can the data be divided into somewhat equal parts based on some defining criteria?** Choose partitioning criteria that will divide your data as evenly as possible. If the partitions contain a relatively equal number of records, query performance improves based on the number of partitions created. For example, by dividing a large table into 10 partitions, a query can execute up to 10 times faster than it would against the unpartitioned table, provided that the partitions are designed to support the query's criteria.
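+
+As a sketch of the monthly date-partitioning design mentioned above \(the table and column names are hypothetical\):
+
+```sql
+CREATE TABLE sales (id int, txn_date date, amount decimal(10,2))
+DISTRIBUTED BY (id)
+PARTITION BY RANGE (txn_date)
+( START (date '2016-01-01') INCLUSIVE
+  END (date '2017-01-01') EXCLUSIVE
+  EVERY (INTERVAL '1 month') );
+```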
+
+Do not create more partitions than are needed. Creating too many partitions can slow down management and maintenance jobs, such as vacuuming, recovering segments, expanding the cluster, checking disk usage, and others.
+
+Partitioning does not improve query performance unless the query optimizer can eliminate partitions based on the query predicates. Queries that scan every partition run slower than if the table were not partitioned, so avoid partitioning if few of your queries achieve partition elimination. Check the explain plan for queries to make sure that partitions are eliminated. See [Query Profiling](../query/query-profiling.html#topic39) for more about partition elimination.
+
+Be very careful with multi-level partitioning because the number of partition files can grow very quickly. For example, if a table is partitioned by both day and city, and there are 1,000 days of data and 1,000 cities, the total number of partitions is one million. Column-oriented tables store each column in a physical table, so if this table has 100 columns, the system would be required to manage 100 million files for the table.
+
+Before settling on a multi-level partitioning strategy, consider a single level partition with bitmap indexes. Indexes slow down data loads, so consider performance testing with your data and schema to decide on the best strategy.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/managing_resources_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/managing_resources_bestpractices.html.md.erb b/bestpractices/managing_resources_bestpractices.html.md.erb
new file mode 100644
index 0000000..f770611
--- /dev/null
+++ b/bestpractices/managing_resources_bestpractices.html.md.erb
@@ -0,0 +1,144 @@
+---
+title: Best Practices for Managing Resources
+---
+
+This topic describes best practices for managing resources in HAWQ.
+
+## <a id="topic_ikz_ndx_15"></a>Best Practices for Configuring Resource Management
+
+When configuring resource management, you can apply certain best practices to ensure that resources are managed both efficiently and for best system performance.
+
+The following is a list of high-level best practices for optimal resource management:
+
+-   Make sure segments do not have identical IP addresses. See [Segments Do Not Appear in gp\_segment\_configuration](../troubleshooting/Troubleshooting.html#topic_hlj_zxx_15) for an explanation of this problem.
+-   Configure all segments to have the same resource capacity. See [Configuring Segment Resource Capacity](../resourcemgmt/ConfigureResourceManagement.html#topic_htk_fxh_15).
+-   To prevent resource fragmentation, ensure that your deployment's segment resource capacity (standalone mode) or YARN node resource capacity (YARN mode) is a multiple of all virtual segment resource quotas. See [Configuring Segment Resource Capacity](../resourcemgmt/ConfigureResourceManagement.html#topic_htk_fxh_15) (HAWQ standalone mode) and [Setting HAWQ Segment Resource Capacity in YARN](../resourcemgmt/YARNIntegration.html#topic_pzf_kqn_c5).
+-   Ensure that enough registered segments are available and usable for query resource requests. If the number of unavailable or unregistered segments is higher than a set limit, then query resource requests are rejected. Also ensure that the variance of dispatched virtual segments across physical segments is not greater than the configured limit. See [Rejection of Query Resource Requests](../troubleshooting/Troubleshooting.html#topic_vm5_znx_15).
+-   Use multiple master and segment temporary directories on separate, large disks \(2TB or greater\) to load balance writes to temporary files \(for example, `/disk1/tmp` and `/disk2/tmp`\). For a given query, HAWQ will use a separate temp directory \(if available\) for each virtual segment to store spill files. Multiple HAWQ sessions will also use separate temp directories where available to avoid disk contention. If you configure too few temp directories, or you place multiple temp directories on the same disk, you increase the risk of disk contention or running out of disk space when multiple virtual segments target the same disk.
+-   Configure minimum resource levels in YARN, and tune the timeout of when idle resources are returned to YARN. See [Tune HAWQ Resource Negotiations with YARN](../resourcemgmt/YARNIntegration.html#topic_wp3_4bx_15).
+-   Make sure that the property `yarn.scheduler.minimum-allocation-mb` in `yarn-site.xml` is an equal subdivision of 1GB. For example, 1024, 512.
+
+## <a id="topic_hvd_pls_wv"></a>Best Practices for Using Resource Queues
+
+Design and configure your resource queues depending on the operational needs of your deployment. This topic describes the best practices for creating and modifying resource queues within the context of different operational scenarios.
+
+### Modifying Resource Queues for Overloaded HDFS
+
+A high number of concurrent HAWQ queries can cause HDFS to overload, especially when querying partitioned tables. Use the `ACTIVE_STATEMENTS` attribute to restrict statement concurrency in a resource queue. For example, if an external application is executing more than 100 concurrent queries, then limiting the number of active statements in your resource queues will instruct the HAWQ resource manager to restrict actual statement concurrency within HAWQ. You might want to modify an existing resource queue as follows:
+
+```sql
+ALTER RESOURCE QUEUE sampleque1 WITH (ACTIVE_STATEMENTS=20);
+```
+
+In this case, when this DDL is applied to queue `sampleque1`, the roles using this queue will have to wait until no more than 20 statements are running to execute their queries. Therefore, 80 queries will be waiting in the queue for later execution. Restricting the number of active query statements helps limit the usage of HDFS resources and protects HDFS. You can alter concurrency even when the resource queue is busy. For example, if a queue already has 40 concurrent statements running, and you apply a DDL statement that specifies `ACTIVE_STATEMENTS=20`, then the resource queue pauses the allocation of resources to queries until more than 20 statements have returned their resources.
+
+### Isolating and Protecting Production Workloads
+
+Another best practice is using resource queues to isolate your workloads. Workload isolation prevents your production workload from being starved of resources. To create this isolation, divide your workload by creating roles for specific purposes. For example, you could create one role for production online verification and another role for the regular running of production processes.
+
+In this scenario, let us assign `role1` for the production workload and `role2` for production software verification. We can define the following resource queues under the same parent queue `dept1que`, which is the resource queue defined for the entire department.
+
+```sql
+CREATE RESOURCE QUEUE dept1product
+   WITH (PARENT='dept1que', MEMORY_LIMIT_CLUSTER=90%, CORE_LIMIT_CLUSTER=90%, RESOURCE_OVERCOMMIT_FACTOR=2);
+
+CREATE RESOURCE QUEUE dept1verification 
+   WITH (PARENT='dept1que', MEMORY_LIMIT_CLUSTER=10%, CORE_LIMIT_CLUSTER=10%, RESOURCE_OVERCOMMIT_FACTOR=10);
+
+ALTER ROLE role1 RESOURCE QUEUE dept1product;
+
+ALTER ROLE role2 RESOURCE QUEUE dept1verification;
+```
+
+With these resource queues defined, workload is spread across the resource queues as follows:
+
+-   When both `role1` and `role2` have workloads, the test verification workload gets only 10% of the total available `dept1que` resources, leaving 90% of the `dept1que` resources available for running the production workload.
+-   When `role1` has a workload but `role2` is idle, then 100% of all `dept1que` resources can be consumed by the production workload.
+-   When only `role2` has a workload (for example, during a scheduled testing window), then 100% of all `dept1que` resources can also be utilized for testing.
+
+Even when the resource queues are busy, you can alter the resource queue's memory and core limits to change resource allocation policies before switching workloads.
+
+In addition, you can use resource queues to isolate workloads for different departments or different applications. For example, we can use the following DDL statements to define 3 departments, and an administrator can arbitrarily redistribute resource allocations among the departments according to usage requirements.
+
+```sql
+ALTER RESOURCE QUEUE pg_default 
+   WITH (MEMORY_LIMIT_CLUSTER=10%, CORE_LIMIT_CLUSTER=10%);
+
+CREATE RESOURCE QUEUE dept1 
+   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
+
+CREATE RESOURCE QUEUE dept2 
+   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
+
+CREATE RESOURCE QUEUE dept3 
+   WITH (PARENT='pg_root', MEMORY_LIMIT_CLUSTER=30%, CORE_LIMIT_CLUSTER=30%);
+
+CREATE RESOURCE QUEUE dept11
+   WITH (PARENT='dept1', MEMORY_LIMIT_CLUSTER=50%,CORE_LIMIT_CLUSTER=50%);
+
+CREATE RESOURCE QUEUE dept12
+   WITH (PARENT='dept1', MEMORY_LIMIT_CLUSTER=50%, CORE_LIMIT_CLUSTER=50%);
+```
+
+### Querying Parquet Tables with Large Table Size
+
+You can use resource queues to improve query performance on Parquet tables with a large page size. This type of query requires a large memory quota for virtual segments. Therefore, if one role mostly queries Parquet tables with a large page size, alter the resource queue associated with the role to increase its virtual segment resource quota. For example:
+
+```sql
+ALTER RESOURCE QUEUE queue1 WITH (VSEG_RESOURCE_QUOTA='mem:2gb');
+```
+
+If there are only occasional queries on Parquet tables with a large page size, use a statement level specification instead of altering the resource queue. For example:
+
+```sql
+SET HAWQ_RM_STMT_NVSEG=10;
+SET HAWQ_RM_STMT_VSEG_MEMORY='2gb';
+query1;
+SET HAWQ_RM_STMT_NVSEG=0;
+```
+
+### Restricting Resource Consumption for Specific Queries
+
+In general, the HAWQ resource manager attempts to provide as many resources as possible to the current query to achieve high query performance. When a query is complex and large, however, the associated resource queue can use up many virtual segments, causing other resource queues \(and queries\) to starve. Under these circumstances, you should enable nvseg limits on the resource queue associated with the large query. For example, you can specify that all queries can use no more than 200 virtual segments. To achieve this limit, alter the resource queue as follows:
+
+```sql
+ALTER RESOURCE QUEUE queue1 WITH (NVSEG_UPPER_LIMIT=200);
+```
+
+To make this limit vary with the cluster size, use the following statement instead:
+
+```sql
+ALTER RESOURCE QUEUE queue1 WITH (NVSEG_UPPER_LIMIT_PERSEG=10);
+```
+
+With the per-segment limit set as in the example above, the actual limit is 100 on a 10-node cluster. If the cluster is expanded to 20 nodes, the limit increases automatically to 200.
+
+### Guaranteeing Resource Allocations for Individual Statements
+
+In general, the minimum number of virtual segments allocated to a statement is decided by the resource queue's actual capacity and its concurrency setting. For example, if there are 10 nodes in a cluster and the total resource capacity of the cluster is 640GB and 160 cores, then a resource queue having 20% capacity has a capacity of 128GB (640GB \* .20) and 32 cores (160 \*.20). If the virtual segment quota is set to 256MB, then this queue has 512 virtual segments allocated (128GB/256MB=512). If the `ACTIVE_STATEMENTS` concurrency setting for the resource queue is 20, then the minimum number of allocated virtual segments for each query is **25** (*trunc*(512/20)=25). However, this minimum number of virtual segments is a soft restriction. If a query statement requires only 5 virtual segments, then this minimum number of 25 is ignored since it is not necessary to allocate 25 for this statement.
+
+To raise the minimum number of virtual segments available for a query statement, you have two options.
+
+-   *Option 1*: Alter the resource queue to reduce concurrency. This is the recommended way to achieve the goal. For example:
+
+    ```sql
+    ALTER RESOURCE QUEUE queue1 WITH (ACTIVE_STATEMENTS=10);
+    ```
+
+    If the original concurrency setting is 20, then the minimum number of virtual segments is doubled.
+
+-   *Option 2*: Alter the nvseg limits of the resource queue. For example:
+
+    ```sql
+    ALTER RESOURCE QUEUE queue1 WITH (NVSEG_LOWER_LIMIT=50);
+    ```
+
+    or, alternately:
+
+    ```sql
+    ALTER RESOURCE QUEUE queue1 WITH (NVSEG_LOWER_LIMIT_PERSEG=5);
+    ```
+
+    In the second DDL, if there are 10 nodes in the cluster, the actual minimum number of virtual segments is 50 (5 \* 10 = 50).
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/operating_hawq_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/operating_hawq_bestpractices.html.md.erb b/bestpractices/operating_hawq_bestpractices.html.md.erb
new file mode 100644
index 0000000..d48cf82
--- /dev/null
+++ b/bestpractices/operating_hawq_bestpractices.html.md.erb
@@ -0,0 +1,289 @@
+---
+title: Best Practices for Operating HAWQ
+---
+
+This topic provides best practices for operating HAWQ, including recommendations for stopping, starting, and monitoring HAWQ.
+
+## <a id="task_qgk_bz3_1v"></a>Best Practices to Start/Stop HAWQ Cluster Members
+
+For best results in using `hawq start` and `hawq stop` to manage your HAWQ system, the following best practices are recommended.
+
+-   Issue the `CHECKPOINT` command to update and flush all data files to disk and update the log file before stopping the cluster (see the example after this list). A checkpoint ensures that, in the event of a crash, files can be restored from the checkpoint snapshot.
+-   Stop the entire HAWQ system by stopping the cluster on the master host:
+    ```shell
+    $ hawq stop cluster
+    ```
+
+-   To stop segments and kill any running queries without causing data loss or inconsistency issues, use `fast` or `immediate` mode on the cluster:
+
+    ```shell
+    $ hawq stop cluster -M fast
+    ```
+    ```shell
+    $ hawq stop cluster -M immediate
+    ```
+
+-   Use `hawq stop master` to stop the master only. If you cannot stop the master due to running transactions, try using fast shutdown. If fast shutdown does not work, use immediate shutdown. Use immediate shutdown with caution, as it will result in a crash-recovery run when the system is restarted. 
+
+    ```shell
+    $ hawq stop master -M fast
+    ```
+    ```shell
+    $ hawq stop master -M immediate
+    ```
+
+-   When stopping a segment or all segments, you can use the default smart mode. Using fast or immediate mode on segments has no effect, since segments are stateless.
+
+    ```shell
+    $ hawq stop segment
+    ```
+    ```shell
+    $ hawq stop allsegments
+    ```
+
+-   Typically, you should use `hawq start cluster` or `hawq restart cluster` to start the cluster. If you do end up using `hawq start standby|master|segment` to start nodes individually, make sure you always start the standby before the active master. Otherwise, the standby can become unsynchronized with the active master.
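+
+As noted in the first item above, you can issue the checkpoint from any client session immediately before stopping the cluster. A minimal sketch, assuming a `psql` session connected to the master as a superuser:
+
+```sql
+-- Flush all data files to disk and update the log file,
+-- then run hawq stop cluster from the shell.
+CHECKPOINT;
+```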
+
+## <a id="id_trr_m1j_1v"></a>Guidelines for Cluster Expansion
+
+This topic provides some guidelines around expanding your HAWQ cluster.
+
+There are several recommendations to keep in mind when modifying the size of your running HAWQ cluster:
+
+-   When you add a new node, install both a DataNode and a physical segment on the new node.
+-   After adding a new node, you should always rebalance HDFS data to maintain cluster performance.
+-   Adding or removing a node also necessitates an update to the HDFS metadata cache. This update will happen eventually, but can take some time. To speed the update of the metadata cache, execute **`select gp_metadata_cache_clear();`**.
+-   Note that for hash distributed tables, expanding the cluster will not immediately improve performance since hash distributed tables use a fixed number of virtual segments. In order to obtain better performance with hash distributed tables, you must redistribute the table to the updated cluster by either the [ALTER TABLE](../reference/sql/ALTER-TABLE.html) or [CREATE TABLE AS](../reference/sql/CREATE-TABLE-AS.html#topic1) command.
+-   If you are using hash tables, consider updating the `default_hash_table_bucket_number` server configuration parameter to a larger value after expanding the cluster but before redistributing the hash tables (see the sketch after this list).
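+
+The following sketch shows one way to redistribute a hash-distributed table after expansion. The table and column names are hypothetical, and the `bucketnum` value is only an example; choose a value appropriate for the expanded cluster, typically aligned with the updated `default_hash_table_bucket_number`:
+
+```sql
+-- Recreate the table with a larger bucket number so queries can use the new nodes.
+-- "sales" and "customer_id" are illustrative names.
+CREATE TABLE sales_expanded
+   WITH (bucketnum=96)
+   AS SELECT * FROM sales
+   DISTRIBUTED BY (customer_id);
+```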
+
+## <a id="id_o5n_p1j_1v"></a>Database State Monitoring Activities
+
+<a id="id_o5n_p1j_1v__d112e31"></a>
+
+<table>
+<caption><span class="tablecap">Table 1. Database State Monitoring Activities</span></caption>
+<colgroup>
+<col width="33%" />
+<col width="33%" />
+<col width="33%" />
+</colgroup>
+<thead>
+<tr class="header">
+<th>Activity</th>
+<th>Procedure</th>
+<th>Corrective Actions</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td>List segments that are currently down. If any rows are returned, this should generate a warning or alert.
+<p>Recommended frequency: run every 5 to 10 minutes</p>
+<p>Severity: IMPORTANT</p></td>
+<td>Run the following query in the <code class="ph codeph">postgres</code> database:
+<pre class="pre codeblock"><code>SELECT * FROM gp_segment_configuration
+WHERE status &lt;&gt; &#39;u&#39;;</code></pre></td>
+<td>If the query returns any rows, follow these steps to correct the problem:
+<ol>
+<li>Verify that the hosts with down segments are responsive.</li>
+<li>If hosts are OK, check the <span class="ph filepath">pg_log</span> files for the down segments to discover the root cause of the segments going down.</li>
+<li>If no unexpected errors are found, run the <code class="ph codeph">gprecoverseg</code> utility to bring the segments back online.</li>
+</ol></td>
+</tr>
+</tbody>
+</table>
+
+
+## <a id="id_d3w_p1j_1v"></a>Hardware and Operating System Monitoring
+
+<a id="id_d3w_p1j_1v__d112e111"></a>
+
+<table>
+<caption><span class="tablecap">Table 2. Hardware and Operating System Monitoring Activities</span></caption>
+<colgroup>
+<col width="33%" />
+<col width="33%" />
+<col width="33%" />
+</colgroup>
+<thead>
+<tr class="header">
+<th>Activity</th>
+<th>Procedure</th>
+<th>Corrective Actions</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td>Check the underlying platform for required maintenance or for hardware or system down events.
+<p>Recommended frequency: real-time, if possible, or every 15 minutes</p>
+<p>Severity: CRITICAL</p></td>
+<td>Set up system check for hardware and OS errors.</td>
+<td>If required, remove a machine from the HAWQ cluster to resolve hardware and OS issues, then add it back to the cluster and run <code class="ph codeph">gprecoverseg</code>.</td>
+</tr>
+<tr class="even">
+<td>Check disk space usage on volumes used for HAWQ data storage and the OS.
+<p>Recommended frequency: every 5 to 30 minutes</p>
+<p>Severity: CRITICAL</p></td>
+<td><div class="p">
+Set up a disk space check.
+<ul>
+<li>Set a threshold to raise an alert when a disk reaches a percentage of capacity. The recommended threshold is 75% full.</li>
+<li>It is not recommended to run the system with capacities approaching 100%.</li>
+</ul>
+</div></td>
+<td>Free space on the system by removing some data or files.</td>
+</tr>
+<tr class="odd">
+<td>Check for errors or dropped packets on the network interfaces.
+<p>Recommended frequency: hourly</p>
+<p>Severity: IMPORTANT</p></td>
+<td>Set up network interface checks.</td>
+<td><p>Work with network and OS teams to resolve errors.</p></td>
+</tr>
+<tr class="even">
+<td>Check for RAID errors or degraded RAID performance.
+<p>Recommended frequency: every 5 minutes</p>
+<p>Severity: CRITICAL</p></td>
+<td>Set up a RAID check.</td>
+<td><ul>
+<li>Replace failed disks as soon as possible.</li>
+<li>Work with system administration team to resolve other RAID or controller errors as soon as possible.</li>
+</ul></td>
+</tr>
+<tr class="odd">
+<td>Check for adequate I/O bandwidth and I/O skew.
+<p>Recommended frequency: when creating a cluster or when hardware issues are suspected.</p></td>
+<td>Run the <code class="ph codeph">hawq checkperf</code> utility.</td>
+<td><div class="p">
+The cluster may be under-specified if data transfer rates are not similar to the following:
+<ul>
+<li>2 GB per second disk read</li>
+<li>1 GB per second disk write</li>
+<li>10 Gigabit per second network read and write</li>
+</ul>
+If transfer rates are lower than expected, consult with your data architect regarding performance expectations.
+</div>
+<p>If the machines on the cluster display an uneven performance profile, work with the system administration team to fix faulty machines.</p></td>
+</tr>
+</tbody>
+</table>
+
+
+## <a id="id_khd_q1j_1v"></a>Data Maintenance
+
+<a id="id_khd_q1j_1v__d112e279"></a>
+
+<table>
+<caption><span class="tablecap">Table 3. Data Maintenance Activities</span></caption>
+<colgroup>
+<col width="33%" />
+<col width="33%" />
+<col width="33%" />
+</colgroup>
+<thead>
+<tr class="header">
+<th>Activity</th>
+<th>Procedure</th>
+<th>Corrective Actions</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td>Check for missing statistics on tables.</td>
+<td>Check the <code class="ph codeph">hawq_stats_missing</code> view in each database:
+<pre class="pre codeblock"><code>SELECT * FROM hawq_toolkit.hawq_stats_missing;</code></pre></td>
+<td>Run <code class="ph codeph">ANALYZE</code> on tables that are missing statistics.</td>
+</tr>
+</tbody>
+</table>
+
+
+## <a id="id_lx4_q1j_1v"></a>Database Maintenance
+
+<a id="id_lx4_q1j_1v__d112e343"></a>
+
+<table>
+<caption><span class="tablecap">Table 4. Database Maintenance Activities</span></caption>
+<colgroup>
+<col width="33%" />
+<col width="33%" />
+<col width="33%" />
+</colgroup>
+<thead>
+<tr class="header">
+<th>Activity</th>
+<th>Procedure</th>
+<th>Corrective Actions</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td>Mark deleted rows in HAWQ system catalogs (tables in the <code class="ph codeph">pg_catalog</code> schema) so that the space they occupy can be reused.
+<p>Recommended frequency: daily</p>
+<p>Severity: CRITICAL</p></td>
+<td>Vacuum each system catalog:
+<pre class="pre codeblock"><code>VACUUM &lt;table&gt;;</code></pre></td>
+<td>Vacuum system catalogs regularly to prevent bloating.</td>
+</tr>
+<tr class="even">
+<td>Update table statistics.
+<p>Recommended frequency: after loading data and before executing queries</p>
+<p>Severity: CRITICAL</p></td>
+<td>Analyze user tables:
+<pre class="pre codeblock"><code>ANALYZEDB -d &lt;database&gt; -a</code></pre></td>
+<td>Analyze updated tables regularly so that the optimizer can produce efficient query execution plans.</td>
+</tr>
+<tr class="odd">
+<td>Backup the database data.
+<p>Recommended frequency: daily, or as required by your backup plan</p>
+<p>Severity: CRITICAL</p></td>
+<td>See <a href="../admin/BackingUpandRestoringHAWQDatabases.html">Backing up and Restoring HAWQ Databases</a> for a discussion of backup procedures</td>
+<td>Best practice is to have a current backup ready in case the database must be restored.</td>
+</tr>
+<tr class="even">
+<td>Reindex system catalogs (tables in the <code class="ph codeph">pg_catalog</code> schema) to maintain an efficient catalog.
+<p>Recommended frequency: weekly, or more often if database objects are created and dropped frequently</p></td>
+<td>Run <code class="ph codeph">REINDEX SYSTEM</code> in each database.
+<pre class="pre codeblock"><code>REINDEXDB -s</code></pre></td>
+<td>The optimizer retrieves information from the system tables to create query plans. If system tables and indexes are allowed to become bloated over time, scanning the system tables increases query execution time.</td>
+</tr>
+</tbody>
+</table>
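+
+For reference, the following is a sketch of a routine catalog maintenance pass in a single database. The catalog table and database names are illustrative; the `reindexdb -s` utility shown in the table is the command-line alternative to `REINDEX SYSTEM`.
+
+```sql
+-- Reclaim space in frequently modified catalog tables, then rebuild the system indexes.
+VACUUM pg_catalog.pg_class;
+VACUUM pg_catalog.pg_attribute;
+REINDEX SYSTEM mydb;   -- mydb is the name of the current database
+```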
+
+
+## <a id="id_blv_q1j_1v"></a>Patching and Upgrading
+
+<a id="id_blv_q1j_1v__d112e472"></a>
+
+<table>
+<caption><span class="tablecap">Table 5. Patch and Upgrade Activities</span></caption>
+<colgroup>
+<col width="33%" />
+<col width="33%" />
+<col width="33%" />
+</colgroup>
+<thead>
+<tr class="header">
+<th>Activity</th>
+<th>Procedure</th>
+<th>Corrective Actions</th>
+</tr>
+</thead>
+<tbody>
+<tr class="odd">
+<td>Ensure any bug fixes or enhancements are applied to the kernel.
+<p>Recommended frequency: at least every 6 months</p>
+<p>Severity: IMPORTANT</p></td>
+<td>Follow the vendor's instructions to update the Linux kernel.</td>
+<td>Keep the kernel current to include bug fixes and security fixes, and to avoid difficult future upgrades.</td>
+</tr>
+<tr class="even">
+<td>Install HAWQ minor releases.
+<p>Recommended frequency: quarterly</p>
+<p>Severity: IMPORTANT</p></td>
+<td>Always upgrade to the latest in the series.</td>
+<td>Keep the HAWQ software current to incorporate bug fixes, performance enhancements, and feature enhancements into your HAWQ cluster.</td>
+</tr>
+</tbody>
+</table>
+
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/querying_data_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/querying_data_bestpractices.html.md.erb b/bestpractices/querying_data_bestpractices.html.md.erb
new file mode 100644
index 0000000..8a76c6d
--- /dev/null
+++ b/bestpractices/querying_data_bestpractices.html.md.erb
@@ -0,0 +1,25 @@
+---
+title: Best Practices for Querying Data
+---
+
+To obtain the best results when querying data in HAWQ, review the best practices described in this topic.
+
+## <a id="id_xtk_jmq_1v"></a>Examining Query Plans to Solve Problems
+
+If a query performs poorly, examine its query plan and ask the following questions:
+
+-   **Do operations in the plan take an exceptionally long time?** Look for an operation that consumes the majority of query processing time. For example, if a scan on a hash table takes longer than expected, the data locality may be low; reloading the data can increase the data locality and speed up the query. Or, adjust `enable_<operator>` parameters to see if you can force the legacy query optimizer (planner) to choose a different plan by disabling a particular query plan operator for that query.
+-   **Are the optimizer's estimates close to reality?** Run `EXPLAIN ANALYZE` and see if the number of rows the optimizer estimates is close to the number of rows the query operation actually returns (see the example sketch after this list). If there is a large discrepancy, collect more statistics on the relevant columns.
+-   **Are selective predicates applied early in the plan?** Apply the most selective filters early in the plan so fewer rows move up the plan tree. If the query plan does not correctly estimate query predicate selectivity, collect more statistics on the relevant columns. You can also try reordering the `WHERE` clause of your SQL statement.
+-   **Does the optimizer choose the best join order?** When you have a query that joins multiple tables, make sure that the optimizer chooses the most selective join order. Joins that eliminate the largest number of rows should be done earlier in the plan so fewer rows move up the plan tree.
+
+    If the plan is not choosing the optimal join order, set `join_collapse_limit=1` and use explicit `JOIN` syntax in your SQL statement to force the legacy query optimizer (planner) to the specified join order. You can also collect more statistics on the relevant join columns.
+
+-   **Does the optimizer selectively scan partitioned tables?** If you use table partitioning, is the optimizer selectively scanning only the child tables required to satisfy the query predicates? Scans of the parent tables should return 0 rows since the parent tables do not contain any data. See [Verifying Your Partition Strategy](../ddl/ddl-partition.html#topic74) for an example of a query plan that shows a selective partition scan.
+-   **Does the optimizer choose hash aggregate and hash join operations where applicable?** Hash operations are typically much faster than other types of joins or aggregations. Row comparison and sorting is done in memory rather than reading/writing from disk. To enable the query optimizer to choose hash operations, there must be sufficient memory available to hold the estimated number of rows. Try increasing work memory to improve performance for a query. If possible, run an `EXPLAIN ANALYZE` for the query to show which plan operations spilled to disk, how much work memory they used, and how much memory was required to avoid spilling to disk. For example:
+
+    `Work_mem used: 23430K bytes avg, 23430K bytes max (seg0). Work_mem wanted: 33649K bytes avg, 33649K bytes max (seg0) to lessen workfile I/O affecting 2 workers.`
+
+    The "bytes wanted" message from `EXPLAIN               ANALYZE` is based on the amount of data written to work files and is not exact. The minimum `work_mem` needed can differ from the suggested value.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/bestpractices/secure_bestpractices.html.md.erb
----------------------------------------------------------------------
diff --git a/bestpractices/secure_bestpractices.html.md.erb b/bestpractices/secure_bestpractices.html.md.erb
new file mode 100644
index 0000000..04c5343
--- /dev/null
+++ b/bestpractices/secure_bestpractices.html.md.erb
@@ -0,0 +1,11 @@
+---
+title: Best Practices for Securing HAWQ
+---
+
+To secure your HAWQ deployment, review the recommendations listed in this topic.
+
+-   Set up SSL to encrypt your client server communication channel. See [Encrypting Client/Server Connections](../clientaccess/client_auth.html#topic5).
+-   Configure `pg_hba.conf` only on HAWQ master. Do not configure it on segments.
+    **Note:** For a more secure system, consider removing all connections that use trust authentication from your master `pg_hba.conf`. Trust authentication means the role is granted access without any authentication, therefore bypassing all security. Replace trust entries with ident authentication if your system has an ident service available.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/7514e193/clientaccess/client_auth.html.md.erb
----------------------------------------------------------------------
diff --git a/clientaccess/client_auth.html.md.erb b/clientaccess/client_auth.html.md.erb
new file mode 100644
index 0000000..1c06d42
--- /dev/null
+++ b/clientaccess/client_auth.html.md.erb
@@ -0,0 +1,193 @@
+---
+title: Configuring Client Authentication
+---
+
+When a HAWQ system is first initialized, the system contains one predefined *superuser* role. This role will have the same name as the operating system user who initialized the HAWQ system. This role is referred to as `gpadmin`. By default, the system is configured to only allow local connections to the database from the `gpadmin` role. To allow any other roles to connect, or to allow connections from remote hosts, you configure HAWQ to allow such connections.
+
+## <a id="topic2"></a>Allowing Connections to HAWQ 
+
+Client access and authentication is controlled by the standard PostgreSQL host-based authentication file, pg\_hba.conf. In HAWQ, the pg\_hba.conf file of the master instance controls client access and authentication to your HAWQ system. HAWQ segments have pg\_hba.conf files that are configured to allow only client connections from the master host and never accept client connections. Do not alter the pg\_hba.conf file on your segments.
+
+See [The pg\_hba.conf File](http://www.postgresql.org/docs/9.0/interactive/auth-pg-hba-conf.html) in the PostgreSQL documentation for more information.
+
+The general format of the pg\_hba.conf file is a set of records, one per line. HAWQ ignores blank lines and any text after the `#` comment character. A record consists of a number of fields that are separated by spaces and/or tabs. Fields can contain white space if the field value is quoted. Records cannot be continued across lines. Each remote client access record has the following format:
+
+```
+host   database   role   CIDR-address   authentication-method
+```
+
+Each UNIX-domain socket access record has the following format:
+
+```
+local   database   role   authentication-method
+```
+
+The following table describes the meaning of each field.
+
+|Field|Description|
+|-----|-----------|
+|local|Matches connection attempts using UNIX-domain sockets. Without a record of this type, UNIX-domain socket connections are disallowed.|
+|host|Matches connection attempts made using TCP/IP. Remote TCP/IP connections will not be possible unless the server is started with an appropriate value for the listen\_addresses server configuration parameter.|
+|hostssl|Matches connection attempts made using TCP/IP, but only when the connection is made with SSL encryption. SSL must be enabled at server start time by setting the ssl configuration parameter.|
+|hostnossl|Matches connection attempts made over TCP/IP that do not use SSL.|
+|database|Specifies which database names this record matches. The value `all` specifies that it matches all databases. Multiple database names can be supplied by separating them with commas. A separate file containing database names can be specified by preceding the file name with @.|
+|role|Specifies which database role names this record matches. The value `all` specifies that it matches all roles. If the specified role is a group and you want all members of that group to be included, precede the role name with a +. Multiple role names can be supplied by separating them with commas. A separate file containing role names can be specified by preceding the file name with @.|
+|CIDR-address|Specifies the client machine IP address range that this record matches. It contains an IP address in standard dotted decimal notation and a CIDR mask length. IP addresses can only be specified numerically, not as domain or host names. The mask length indicates the number of high-order bits of the client IP address that must match. Bits to the right of this must be zero in the given IP address. There must not be any white space between the IP address, the /, and the CIDR mask length. Typical examples of a CIDR-address are 192.0.2.0/32 for a single host, or 192.0.2.0/24 for a small network, or 192.0.0.0/16 for a larger one. To specify a single host, use a CIDR mask of 32 for IPv4 or 128 for IPv6. In a network address, do not omit trailing zeroes.|
+|IP-address, IP-mask|These fields can be used as an alternative to the CIDR-address notation. Instead of specifying the mask length, the actual mask is specified in a separate column. For example, 255.255.255.255 represents a CIDR mask length of 32. These fields only apply to host, hostssl, and hostnossl records.|
+|authentication-method|Specifies the authentication method to use when connecting. HAWQ supports the [authentication methods](http://www.postgresql.org/docs/9.0/static/auth-methods.html) supported by PostgreSQL 9.0.|
+
+### <a id="topic3"></a>Editing the pg\_hba.conf File 
+
+This example shows how to edit the pg\_hba.conf file of the master to allow remote client access to all databases from all roles using encrypted password authentication.
+
+**Note:** For a more secure system, consider removing all connections that use trust authentication from your master pg\_hba.conf. Trust authentication means the role is granted access without any authentication, therefore bypassing all security. Replace trust entries with ident authentication if your system has an ident service available.
+
+#### <a id="ip144328"></a>Editing pg\_hba.conf 
+
+1.  Open the file $MASTER\_DATA\_DIRECTORY/pg\_hba.conf in a text editor.
+2.  Add a line to the file for each type of connection you want to allow. Records are read sequentially, so the order of the records is significant. Typically, earlier records will have tight connection match parameters and weaker authentication methods, while later records will have looser match parameters and stronger authentication methods. For example:
+
+    ```
+    # allow the gpadmin user local access to all databases
+    # using ident authentication
+    local   all   gpadmin   ident         sameuser
+    host    all   gpadmin   127.0.0.1/32  ident
+    host    all   gpadmin   ::1/128       ident
+    # allow the 'dba' role access to any database from any
+    # host with IP address 192.168.x.x and use md5 encrypted
+    # passwords to authenticate the user
+    # Note that to use SHA-256 encryption, replace md5 with
+    # password in the line below
+    host    all   dba   192.168.0.0/16  md5
+    # allow all roles access to any database from any
+    # host and use ldap to authenticate the user. HAWQ role
+    # names must match the LDAP common name.
+    host    all   all   192.168.0.0/32  ldap ldapserver=usldap1
+    ldapport=1389 ldapprefix="cn="
+    ldapsuffix=",ou=People,dc=company,dc=com"
+    ```
+
+3.  Save and close the file.
+4.  Reload the pg\_hba.conf configuration file for your changes to take effect:
+
+    ``` bash
+    $ hawq stop -u
+    ```
+
+
+## <a id="topic4"></a>Limiting Concurrent Connections 
+
+HAWQ allocates some resources on a per-connection basis, so setting the maximum number of connections allowed is recommended.
+
+To limit the number of active concurrent sessions to your HAWQ system, you can configure the `max_connections` server configuration parameter on the master or the `seg_max_connections` server configuration parameter on segments. These parameters are *local* parameters, meaning that you must set them in the `hawq-site.xml` file of all HAWQ instances. The current recommendation is to set the value of `seg_max_connections` to 5-10 times the value of `max_connections`.
+
+When you set `max_connections`, you must also set the dependent parameter `max_prepared_transactions`. This value must be at least as large as the value of `max_connections`, and all HAWQ instances should be set to the same value.
+
+For example, in `$GPHOME/etc/hawq-site.xml`:
+
+``` xml
+  <property>
+      <name>max_connections</name>
+      <value>500</value>
+  </property>
+  <property>
+      <name>max_prepared_transactions</name>
+      <value>1000</value>
+  </property>
+  <property>
+      <name>seg_max_connections</name>
+      <value>3200</value>
+  </property>
+```
+
+The following steps set the parameter values with the HAWQ utility `hawq config`.
+
+### <a id="ip142411"></a>To change the number of allowed connections 
+
+1.  Log into the HAWQ master host as the HAWQ administrator and source the file `$GPHOME/greenplum_path.sh`.
+2.  Set the value of the `max_connections` parameter. This `hawq config` command sets the value to 100 on all HAWQ instances.
+
+    ``` bash
+    $ hawq config -c max_connections -v 100
+    ```
+
+    This configuration command sets the value on the master.
+
+3.  Set the value of the `seg_max_connections` parameter. This `hawq config` command sets the value to 6400 on all HAWQ instances.
+
+    ``` bash
+    $ hawq config -c seg_max_connections -v 6400
+    ```
+
+    This configuration command sets the value on all segment instances.
+
+4.  Set the value of the `max_prepared_transactions` parameter. This `hawq config` command sets the value to 200 on the master and all segments.
+
+    ``` bash
+    $ hawq config -c max_prepared_transactions -v 200
+    ```
+
+    The value of `max_prepared_transactions` must be greater than or equal to `max_connections`.
+
+5.  Stop and restart your HAWQ system.
+
+    ``` bash
+    $ hawq restart
+    ```
+
+6.  You can check the value of these parameters on the master and segments by using the `hawq config` `-s` option. These `hawq config` commands display the values of the `max_connections` and `seg_max_connections` parameters.
+
+    ``` bash
+    $ hawq config -s max_connections
+    $ hawq config -s seg_max_connections
+    ```
+
+
+**Note:** Raising the values of these parameters may cause HAWQ to request more shared memory. To mitigate this effect, consider decreasing other memory-related parameters such as `gp_cached_segworkers_threshold`.
+
+## <a id="topic5"></a>Encrypting Client/Server Connections 
+
+Enable SSL for client connections to HAWQ to encrypt the data passed over the network between the client and the database.
+
+HAWQ has native support for SSL connections between the client and the master server. SSL connections prevent third parties from snooping on the packets, and also prevent man-in-the-middle attacks. SSL should be used whenever the client connection goes through an insecure link, and must be used whenever client certificate authentication is used.
+
+Enabling SSL requires that OpenSSL be installed on both the client and the master server systems. HAWQ can be started with SSL enabled by setting the server configuration parameter `ssl` to `on` in the master `hawq-site.xml`. When starting in SSL mode, the server looks for the files server.key \(server private key\) and server.crt \(server certificate\) in the master data directory. These files must be set up correctly before an SSL-enabled HAWQ system can start.
+
+**Important:** Do not protect the private key with a passphrase. The server does not prompt for a passphrase for the private key, and the database startup fails with an error if one is required.
+
+A self-signed certificate can be used for testing, but a certificate signed by a certificate authority \(CA\) should be used in production, so the client can verify the identity of the server. Either a global or local CA can be used. If all the clients are local to the organization, a local CA is recommended.
+
+### <a id="topic6"></a>Creating a Self-signed Certificate without a Passphrase for Testing Only 
+
+To create a quick self-signed certificate for the server for testing, use the following OpenSSL command:
+
+```
+# openssl req -new -text -out server.req
+```
+
+Enter the information requested by the prompts. Be sure to enter the local host name as *Common Name*. The challenge password can be left blank.
+
+The program will generate a key that is passphrase protected, and does not accept a passphrase that is less than four characters long.
+
+To use this certificate with HAWQ, remove the passphrase with the following commands:
+
+```
+# openssl rsa -in privkey.pem -out server.key
+# rm privkey.pem
+```
+
+Enter the old passphrase when prompted to unlock the existing key.
+
+Then, enter the following command to turn the certificate into a self-signed certificate and to copy the key and certificate to a location where the server will look for them.
+
+``` 
+# openssl req -x509 -in server.req -text -key server.key -out server.crt
+```
+
+Finally, change the permissions on the key with the following command. The server will reject the file if the permissions are less restrictive than these.
+
+```
+# chmod og-rwx server.key
+```
+
+For more details on how to create your server private key and certificate, refer to the [OpenSSL documentation](https://www.openssl.org/docs/).


