hawq-commits mailing list archives

From yo...@apache.org
Subject [42/57] [abbrv] [partial] incubator-hawq-docs git commit: HAWQ-1254 Fix/remove book branching on incubator-hawq-docs
Date Tue, 10 Jan 2017 23:54:33 GMT
http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/BackingUpandRestoringHAWQDatabases.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/BackingUpandRestoringHAWQDatabases.html.md.erb b/markdown/admin/BackingUpandRestoringHAWQDatabases.html.md.erb
new file mode 100644
index 0000000..78b0dec
--- /dev/null
+++ b/markdown/admin/BackingUpandRestoringHAWQDatabases.html.md.erb
@@ -0,0 +1,373 @@
+---
+title: Backing Up and Restoring HAWQ
+---
+
+This chapter provides information on backing up and restoring databases in a HAWQ system.
+
+As an administrator, you will need to back up and restore your database. HAWQ provides three utilities to help you back up your data:
+
+-   `gpfdist`
+-   PXF
+-   `pg_dump`
+
+`gpfdist` and PXF are parallel loading and unloading tools that provide the best performance. You can also use `pg_dump`, a non-parallel utility inherited from PostgreSQL.
+
+In addition, in some situations you should back up your raw data from ETL processes.
+
+This section describes these three utilities, as well as raw data backup, to help you decide what fits your needs.
+
+## <a id="usinggpfdistorpxf"></a>About gpfdist and PXF 
+
+You can perform a parallel backup in HAWQ using `gpfdist` or PXF to unload all data to external tables. Backup files can reside on a local file system or HDFS. To recover tables, you can load data back from external tables to the database. 
+
+### <a id="performingaparallelbackup"></a>Performing a Parallel Backup 
+
+1.  Check the database size to ensure that the file system has enough space to save the backed up files.
+2.  Use the `pg_dump` utility to dump the schema of the target database.
+3.  Create a writable external table for each table in the database that you want to back up.
+4.  Load table data into the newly created external tables.
+
+>    **Note:** Put the insert statements in a single transaction to prevent problems if you perform any update operations during the backup.
+
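The four steps above can be scripted. The following is a minimal sketch, assuming two `gpfdist` instances on a host named `sdw1` at ports 8080 and 8081 (the layout used in the example later in this file). `gen_backup_sql` and `BACKUP_HOST` are hypothetical names, not part of HAWQ. The function only prints SQL, so the statements can be reviewed before they are piped to `psql`.

```shell
#!/bin/sh
# Sketch: print the SQL for a gpfdist-based backup of the named tables.
# BACKUP_HOST and the two ports are assumptions taken from the later example.
BACKUP_HOST=${BACKUP_HOST:-sdw1}

gen_backup_sql() {
    # One writable external table per source table, spread over two gpfdist instances.
    for t in "$@"; do
        printf "CREATE WRITABLE EXTERNAL TABLE wext_%s (LIKE %s)\n" "$t" "$t"
        printf "  LOCATION('gpfdist://%s:8080/%s1.csv', 'gpfdist://%s:8081/%s2.csv') FORMAT 'CSV';\n" \
            "$BACKUP_HOST" "$t" "$BACKUP_HOST" "$t"
    done
    # Unload all tables inside a single transaction, as the note recommends.
    echo "BEGIN;"
    for t in "$@"; do
        printf "INSERT INTO wext_%s SELECT * FROM %s;\n" "$t" "$t"
    done
    echo "COMMIT;"
}

# Example (not run here): gen_backup_sql orders lineitem | psql tpch
```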
+
+### <a id="restoringfromabackup"></a>Restoring from a Backup 
+
+1.  Create a database to recover to.
+2.  Recreate the schema from the schema file \(created during the `pg_dump` process\).
+3.  Create a readable external table for each table in the database.
+4.  Load data from the external table to the actual table.
+5.  Run the `ANALYZE` command once loading is complete. This ensures that the query planner generates an optimal plan based on up-to-date table statistics.
+
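The restore steps can be sketched the same way. This assumes the schema has already been recreated in the target database and that the backup files are served by `gpfdist` on `sdw1` at ports 8080 and 8081. `gen_restore_sql` and `BACKUP_HOST` are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch: print the SQL that reloads the named tables from gpfdist backup files.
# BACKUP_HOST and the two ports are assumptions, not HAWQ defaults.
BACKUP_HOST=${BACKUP_HOST:-sdw1}

gen_restore_sql() {
    # One readable external table per target table, pointing at the backup files.
    for t in "$@"; do
        printf "CREATE EXTERNAL TABLE rext_%s (LIKE %s)\n" "$t" "$t"
        printf "  LOCATION('gpfdist://%s:8080/%s1.csv', 'gpfdist://%s:8081/%s2.csv') FORMAT 'CSV';\n" \
            "$BACKUP_HOST" "$t" "$BACKUP_HOST" "$t"
    done
    # Load each table, then refresh planner statistics once at the end.
    for t in "$@"; do
        printf "INSERT INTO %s SELECT * FROM rext_%s;\n" "$t" "$t"
    done
    echo "ANALYZE;"
}

# Example (not run here): gen_restore_sql orders lineitem | psql tpch2
```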
+### <a id="differencesbetweengpfdistandpxf"></a>Differences between gpfdist and PXF 
+
+`gpfdist` and PXF differ in the following ways:
+
+-   `gpfdist` stores backup files on a local file system, while PXF stores files on HDFS.
+-   `gpfdist` supports only plain text formats, while PXF also supports binary formats such as Avro as well as custom formats.
+-   `gpfdist` doesn’t support generating compressed files, while PXF supports compression \(you can specify a compression codec used in Hadoop such as `org.apache.hadoop.io.compress.GzipCodec`\).
+-   Both `gpfdist` and PXF have fast loading performance, but `gpfdist` is much faster than PXF.
+
+## <a id="usingpg_dumpandpg_restore"></a>About pg\_dump and pg\_restore 
+
+HAWQ supports the PostgreSQL backup and restore utilities, `pg_dump` and `pg_restore`. The `pg_dump` utility creates a single, large dump file on the master host containing the data from all active segments. The `pg_restore` utility restores a HAWQ database from the archive created by `pg_dump`. In most cases this approach is not practical, because the master host is unlikely to have enough disk space for a single backup file of an entire distributed database. HAWQ supports these utilities mainly for migrating data from PostgreSQL to HAWQ.
+
+To create a backup archive for database `mydb`:
+
+```shell
+$ pg_dump -Ft -f mydb.tar mydb
+```
+
+To create a compressed backup using custom format and compression level 3:
+
+```shell
+$ pg_dump -Fc -Z3 -f mydb.dump mydb
+```
+
+To restore from an archive using `pg_restore`:
+
+```shell
+$ pg_restore -d new_db mydb.dump
+```
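Because `pg_dump` writes a single file on the master host, it can help to check free space first. The sketch below is one way to do that with the standard `df` utility. `has_space` is a hypothetical helper, and the required size passed to it is whatever estimate you have for the dump.

```shell
#!/bin/sh
# has_space DIR NEEDED_KB
# Succeeds when the file system holding DIR has at least NEEDED_KB KiB available.
has_space() {
    dir=$1
    needed_kb=$2
    # df -P prints one POSIX-format line per file system; with -k, column 4 is free KiB.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$needed_kb" ]
}

# Example (not run here), with an illustrative size estimate of about 50 GB:
#   has_space . 52428800 && pg_dump -Fc -Z3 -f mydb.dump mydb
```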
+
+## <a id="aboutbackinguprawdata"></a>About Backing Up Raw Data 
+
+Parallel backup using `gpfdist` or PXF works fine in most cases. There are a couple of situations where you cannot perform parallel backup and restore operations:
+
+-   Performing periodic incremental backups.
+-   Dumping a large data volume to external tables - this process takes a long time.
+
+In such situations, you can back up raw data generated during ETL processes and reload it into HAWQ. This provides the flexibility to choose where you store backup files.
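As an illustration of an incremental raw-data backup, the sketch below copies only the files that changed since the previous run. It assumes the ETL output is a flat directory of files and uses a timestamp marker file; `backup_new_files` and all paths are hypothetical. It works on a local file system only, so an HDFS target would need `hdfs dfs` commands instead.

```shell
#!/bin/sh
# backup_new_files SRC_DIR DEST_DIR MARKER
# Copies regular files under SRC_DIR that were modified after MARKER into DEST_DIR
# (file names are flattened), then advances MARKER. MARKER must live outside SRC_DIR.
backup_new_files() {
    src=$1
    dest=$2
    marker=$3
    mkdir -p "$dest"
    # Record the start time first so files written during the copy are picked up next run.
    touch "$marker.new"
    if [ -e "$marker" ]; then
        find "$src" -type f -newer "$marker" -exec cp -p {} "$dest"/ \;
    else
        # No marker yet: treat this as the first, full backup.
        find "$src" -type f -exec cp -p {} "$dest"/ \;
    fi
    mv "$marker.new" "$marker"
}
```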
+
+## <a id="estimatingthebestpractice"></a>Selecting a Backup Strategy/Utility 
+
+The table below summarizes the differences between the four approaches discussed above.
+
+<table>
+  <tr>
+    <th></th>
+    <th><code>gpfdist</code></th>
+    <th>PXF</th>
+    <th><code>pg_dump</code></th>
+    <th>Raw Data Backup</th>
+  </tr>
+  <tr>
+    <td><b>Parallel</b></td>
+    <td>Yes</td>
+    <td>Yes</td>
+    <td>No</td>
+    <td>No</td>
+  </tr>
+  <tr>
+    <td><b>Incremental Backup</b></td>
+    <td>No</td>
+    <td>No</td>
+    <td>No</td>
+    <td>Yes</td>
+  </tr>
+  <tr>
+    <td><b>Backup Location</b></td>
+    <td>Local FS</td>
+    <td>HDFS</td>
+    <td>Local FS</td>
+    <td>Local FS, HDFS</td>
+  </tr>
+  <tr>
+    <td><b>Format</b></td>
+    <td>Text, CSV</td>
+    <td>Text, CSV, Custom</td>
+    <td>Text, Tar, Custom</td>
+    <td>Depends on format of raw data</td>
+  </tr>
+  <tr>
+<td><b>Compression</b></td><td>No</td><td>Yes</td><td>Custom format only</td><td>Optional</td></tr>
+<tr><td><b>Scalability</b></td><td>Good</td><td>Good</td><td>---</td><td>Good</td></tr>
+<tr><td><b>Performance</b></td><td>Fast loading, Fast unloading</td><td>Fast loading, Normal unloading</td><td>---</td><td>Fast (Just file copy)</td></tr>
+</table>
+
+## <a id="estimatingspacerequirements"></a>Estimating Space Requirements 
+
+Before you back up your database, ensure that you have enough space to store backup files. This section describes how to get the database size and estimate space requirements.
+
+-   Use `hawq_toolkit` to query the size of the database you want to back up.
+
+    ```
+    mydb=# SELECT sodddatsize FROM hawq_toolkit.hawq_size_of_database WHERE sodddatname='mydb';
+    ```
+
+    If tables in your database are compressed, this query shows the compressed size of the database.
+
+-   Estimate the total size of the backup files.
+    -   If your database tables and backup files are both compressed, you can use the value `sodddatsize` as an estimate value.
+    -   If your database tables are compressed and backup files are not, you need to multiply `sodddatsize` by the compression ratio. Although this depends on the compression algorithms, you can use an empirical value such as 300%.
+    -   If your backup files are compressed and database tables are not, you need to divide `sodddatsize` by the compression ratio.
+-   Get space requirement.
+    -   If you use HDFS with PXF, the space requirement is `size_of_backup_files * replication_factor`.
+
+    -   If you use gpfdist, the space requirement for each gpfdist instance is `size_of_backup_files / num_gpfdist_instances` since table data will be evenly distributed to all `gpfdist` instances.
+
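A worked example of the estimate, using shell integer arithmetic. The input figures (a 1000 GB compressed database, HDFS replication factor 3, two `gpfdist` instances) are made up for illustration. Only the 300% ratio comes from the text above.

```shell
#!/bin/sh
# Illustrative inputs; replace them with values from your own system.
db_size_gb=1000        # sodddatsize, converted to GB (tables are compressed)
ratio_pct=300          # empirical compression ratio from the text
replication=3          # HDFS replication factor
gpfdist_instances=2    # number of gpfdist instances receiving data

# Compressed tables, uncompressed backup files: multiply by the ratio.
backup_gb=$(( db_size_gb * ratio_pct / 100 ))

# PXF on HDFS: every block is stored replication-factor times.
pxf_gb=$(( backup_gb * replication ))

# gpfdist: data is spread evenly, so each instance needs its share (rounded up).
per_instance_gb=$(( (backup_gb + gpfdist_instances - 1) / gpfdist_instances ))

echo "backup files:         ${backup_gb} GB"
echo "HDFS space (PXF):     ${pxf_gb} GB"
echo "per gpfdist instance: ${per_instance_gb} GB"
```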
+
+## <a id="usinggpfdist"></a>Using gpfdist 
+
+This section discusses `gpfdist` and shows an example of how to back up and restore a HAWQ database.
+
+`gpfdist` is HAWQ’s parallel file distribution program. It is used by readable external tables and `hawq load` to serve external table files to all HAWQ segments in parallel. It is used by writable external tables to accept output streams from HAWQ segments in parallel and write them out to a file.
+
+To use `gpfdist`, start the `gpfdist` server program on the host where you want to store backup files. You can start multiple `gpfdist` instances on the same host or on different hosts. For each `gpfdist` instance, you specify a directory from which `gpfdist` will serve files for readable external tables or create output files for writable external tables. For example, if you have a dedicated machine for backup with two disks, you can start two `gpfdist` instances, each using one disk:
+
+![](../mdimages/gpfdist_instances_backup.png "Deploying multiple gpfdist instances on a backup host")
+
+You can also run `gpfdist` instances on each segment host. During backup, table data will be evenly distributed to all `gpfdist` instances specified in the `LOCATION` clause in the `CREATE EXTERNAL TABLE` definition.
+
+![](../mdimages/gpfdist_instances.png "Deploying gpfdist instances on each segment host")
+
+### <a id="example"></a>Example 
+
+This example of using `gpfdist` backs up and restores a 1TB `tpch` database. To do so, start two `gpfdist` instances on the backup host `sdw1` with two 1TB disks \(One disk mounts at `/data1`, another disk mounts at `/data2`\).
+
+#### <a id="usinggpfdisttobackupthetpchdatabase"></a>Using gpfdist to Back Up the tpch Database 
+
+1.  Create backup locations and start the `gpfdist` instances.
+
+    In this example, issuing the first command creates two folders on two different disks with the same postfix `backup/tpch_20140627`. These folders are labeled as backups of the `tpch` database on 2014-06-27. In the next two commands, the example shows two `gpfdist` instances, one using port 8080, and another using port 8081:
+
+    ```shell
+    sdw1$ mkdir -p /data1/gpadmin/backup/tpch_20140627 /data2/gpadmin/backup/tpch_20140627
+    sdw1$ gpfdist -d /data1/gpadmin/backup/tpch_20140627 -p 8080 &
+    sdw1$ gpfdist -d /data2/gpadmin/backup/tpch_20140627 -p 8081 &
+    ```
+
+2.  Save the schema for the database:
+
+    ```shell
+    master_host$ pg_dump --schema-only -f tpch.schema tpch
+    master_host$ scp tpch.schema sdw1:/data1/gpadmin/backup/tpch_20140627
+    ```
+
+    On the HAWQ master host, use the `pg_dump` utility to save the schema of the tpch database to the file tpch.schema. Copy the schema file to the backup location to restore the database schema.
+
+3.  Create a writable external table for each table in the database:
+
+    ```shell
+    master_host$ psql tpch
+    ```
+    ```sql
+    tpch=# CREATE WRITABLE EXTERNAL TABLE wext_orders (LIKE orders)
+    tpch-# LOCATION('gpfdist://sdw1:8080/orders1.csv', 'gpfdist://sdw1:8081/orders2.csv') FORMAT 'CSV';
+    tpch=# CREATE WRITABLE EXTERNAL TABLE wext_lineitem (LIKE lineitem)
+    tpch-# LOCATION('gpfdist://sdw1:8080/lineitem1.csv', 'gpfdist://sdw1:8081/lineitem2.csv') FORMAT 'CSV';
+    ```
+
+    The sample shows two tables in the `tpch` database, `orders` and `lineitem`, and creates a corresponding writable external table for each. Specify a location for each `gpfdist` instance in the `LOCATION` clause. This sample uses the CSV text format, but you can also choose other delimited text formats. For more information, see the `CREATE EXTERNAL TABLE` SQL command.
+
+4.  Unload data to the external tables:
+
+    ```sql
+    tpch=# BEGIN;
+    tpch=# INSERT INTO wext_orders SELECT * FROM orders;
+    tpch=# INSERT INTO wext_lineitem SELECT * FROM lineitem;
+    tpch=# COMMIT;
+    ```
+
+5.  **\(Optional\)** Stop `gpfdist` servers to free ports for other processes:
+
+    Find the process IDs and kill the processes:
+
+    ```shell
+    sdw1$ ps -ef | grep gpfdist
+    sdw1$ kill 612368; kill 612369
+    ```
+
+
+#### <a id="torecoverusinggpfdist"></a>Recovering Using gpfdist 
+
+1.  Restart `gpfdist` instances if they aren’t running:
+
+    ```shell
+    sdw1$ gpfdist -d /data1/gpadmin/backup/tpch_20140627 -p 8080 &
+    sdw1$ gpfdist -d /data2/gpadmin/backup/tpch_20140627 -p 8081 &
+    ```
+
+2.  Create a new database and restore the schema:
+
+    ```shell
+    master_host$ createdb tpch2
+    master_host$ scp sdw1:/data1/gpadmin/backup/tpch_20140627/tpch.schema .
+    master_host$ psql -f tpch.schema -d tpch2
+    ```
+
+3.  Create a readable external table for each table:
+
+    ```shell
+    master_host$ psql tpch2
+    ```
+    
+    ```sql
+    tpch2=# CREATE EXTERNAL TABLE rext_orders (LIKE orders) LOCATION('gpfdist://sdw1:8080/orders1.csv', 'gpfdist://sdw1:8081/orders2.csv') FORMAT 'CSV';
+    tpch2=# CREATE EXTERNAL TABLE rext_lineitem (LIKE lineitem) LOCATION('gpfdist://sdw1:8080/lineitem1.csv', 'gpfdist://sdw1:8081/lineitem2.csv') FORMAT 'CSV';
+    ```
+
+    **Note:** The location clause is the same as the writable external table above.
+
+4.  Load data back from external tables:
+
+    ```sql
+    tpch2=# INSERT INTO orders SELECT * FROM rext_orders;
+    tpch2=# INSERT INTO lineitem SELECT * FROM rext_lineitem;
+    ```
+
+5.  Run the `ANALYZE` command after data loading:
+
+    ```sql
+    tpch2=# analyze;
+    ```
+
+
+### <a id="troubleshootinggpfdist"></a>Troubleshooting gpfdist 
+
+Keep in mind that `gpfdist` is accessed at runtime by the segment instances. Therefore, you must ensure that the HAWQ segment hosts have network access to `gpfdist`. Since the `gpfdist` program is a web server, you can test connectivity by running the following command from each host in your HAWQ array \(segments and master\):
+
+```shell
+$ wget http://gpfdist_hostname:port/filename
+```
+
+Also, make sure that your `CREATE EXTERNAL TABLE` definition has the correct host name, port, and file names for `gpfdist`. The file names and paths specified should be relative to the directory where gpfdist is serving files \(the directory path used when you started the `gpfdist` program\). See “Defining External Tables - Examples”.
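One way to check every location in a table definition is to turn its `gpfdist://` URIs into the `http://` URLs used by the `wget` test above. The sketch below does only that conversion; `gpfdist_to_http` is a hypothetical helper, and it assumes the URIs are enclosed in single quotes, as in the `LOCATION` clauses in this file.

```shell
#!/bin/sh
# gpfdist_to_http
# Reads text on stdin and prints one http:// URL for each gpfdist:// URI it finds.
gpfdist_to_http() {
    # Match each URI up to its closing single quote, then swap the scheme.
    grep -o "gpfdist://[^']*" | sed 's|^gpfdist://|http://|'
}

# Example (not run here): fetch each backup file and report the ones that fail.
#   echo "LOCATION('gpfdist://sdw1:8080/orders1.csv')" | gpfdist_to_http |
#   while read -r url; do
#       wget -q -O /dev/null "$url" || echo "unreachable: $url"
#   done
```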
+
+## <a id="usingpxf"></a>Using PXF 
+
+HAWQ Extension Framework \(PXF\) is an extensible framework that allows HAWQ to query external system data. The details of how to install and use PXF can be found in [Using PXF with Unmanaged Data](../pxf/HawqExtensionFrameworkPXF.html).
+
+### <a id="usingpxftobackupthetpchdatabase"></a>Using PXF to Back Up the tpch Database 
+
+1.  Create a folder on HDFS for this backup:
+
+    ```shell
+    master_host$ hdfs dfs -mkdir -p /backup/tpch-2014-06-27
+    ```
+
+2.  Dump the database schema using `pg_dump` and store the schema file in a backup folder:
+
+    ```shell
+    master_host$ pg_dump --schema-only -f tpch.schema tpch
+    master_host$ hdfs dfs -copyFromLocal tpch.schema /backup/tpch-2014-06-27
+    ```
+
+3.  Create a writable external table for each table in the database:
+
+    ```shell
+    master_host$ psql tpch
+    ```
+    
+    ```sql
+    tpch=# CREATE WRITABLE EXTERNAL TABLE wext_orders (LIKE orders)
+    tpch-# LOCATION('pxf://namenode_host:51200/backup/tpch-2014-06-27/orders'
+    tpch-#           '?Profile=HdfsTextSimple'
+    tpch-#           '&COMPRESSION_CODEC=org.apache.hadoop.io.compress.SnappyCodec'
+    tpch-#          )
+    tpch-# FORMAT 'TEXT';
+
+    tpch=# CREATE WRITABLE EXTERNAL TABLE wext_lineitem (LIKE lineitem)
+    tpch-# LOCATION('pxf://namenode_host:51200/backup/tpch-2014-06-27/lineitem'
+    tpch-#           '?Profile=HdfsTextSimple'
+    tpch-#           '&COMPRESSION_CODEC=org.apache.hadoop.io.compress.SnappyCodec')
+    tpch-# FORMAT 'TEXT';
+    ```
+
+    Here, all backup files for the `orders` table go in the /backup/tpch-2014-06-27/orders folder, and all backup files for the `lineitem` table go in the /backup/tpch-2014-06-27/lineitem folder. This example uses Snappy compression to save disk space.
+
+4.  Unload the data to external tables:
+
+    ```sql
+    tpch=# BEGIN;
+    tpch=# INSERT INTO wext_orders SELECT * FROM orders;
+    tpch=# INSERT INTO wext_lineitem SELECT * FROM lineitem;
+    tpch=# COMMIT;
+    ```
+
+5.  **\(Optional\)** Change the HDFS file replication factor for the backup folder. By default, HDFS stores three replicas of each block for reliability. You can decrease this number for your backup files if you need to:
+
+    ```shell
+    master_host$ hdfs dfs -setrep 2 /backup/tpch-2014-06-27
+    ```
+
+    **Note:** This only changes the replication factor for existing files; new files will still use the default replication factor.
+
+
+### <a id="torecoverfromapxfbackup"></a>Recovering a PXF Backup 
+
+1.  Create a new database and restore the schema:
+
+    ```shell
+    master_host$ createdb tpch2
+    master_host$ hdfs dfs -copyToLocal /backup/tpch-2014-06-27/tpch.schema .
+    master_host$ psql -f tpch.schema -d tpch2
+    ```
+
+2.  Create a readable external table for each table to restore:
+
+    ```shell
+    master_host$ psql tpch2
+    ```
+    
+    ```sql
+    tpch2=# CREATE EXTERNAL TABLE rext_orders (LIKE orders)
+    tpch2-# LOCATION('pxf://namenode_host:51200/backup/tpch-2014-06-27/orders?Profile=HdfsTextSimple')
+    tpch2-# FORMAT 'TEXT';
+    tpch2=# CREATE EXTERNAL TABLE rext_lineitem (LIKE lineitem)
+    tpch2-# LOCATION('pxf://namenode_host:51200/backup/tpch-2014-06-27/lineitem?Profile=HdfsTextSimple')
+    tpch2-# FORMAT 'TEXT';
+    ```
+
+    The location clause is almost the same as above, except you don’t have to specify the `COMPRESSION_CODEC` because PXF will automatically detect it.
+
+3.  Load data back from external tables:
+
+    ```sql
+    tpch2=# INSERT INTO ORDERS SELECT * FROM rext_orders;
+    tpch2=# INSERT INTO LINEITEM SELECT * FROM rext_lineitem;
+    ```
+
+4.  Run `ANALYZE` after data loading:
+
+    ```sql
+    tpch2=# ANALYZE;
+    ```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/ClusterExpansion.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/ClusterExpansion.html.md.erb b/markdown/admin/ClusterExpansion.html.md.erb
new file mode 100644
index 0000000..d3d921b
--- /dev/null
+++ b/markdown/admin/ClusterExpansion.html.md.erb
@@ -0,0 +1,226 @@
+---
+title: Expanding a Cluster
+---
+
+Apache HAWQ supports dynamic node expansion. You can add segment nodes while HAWQ is running without having to suspend or terminate cluster operations.
+
+**Note:** This topic describes how to expand a cluster using the command-line interface. If you are using Ambari to manage your HAWQ cluster, see [Expanding the HAWQ Cluster](../admin/ambari-admin.html#amb-expand) in [Managing HAWQ Using Ambari](../admin/ambari-admin.html).
+
+## <a id="topic_kkc_tgb_h5"></a>Guidelines for Cluster Expansion 
+
+This topic provides some guidelines around expanding your HAWQ cluster.
+
+There are several recommendations to keep in mind when modifying the size of your running HAWQ cluster:
+
+-   When you add a new node, install both a DataNode and a physical segment on the new node. If you are using YARN to manage HAWQ resources, you must also configure a YARN NodeManager on the new node.
+-   After adding a new node, you should always rebalance HDFS data to maintain cluster performance.
+-   Adding or removing a node also necessitates an update to the HDFS metadata cache. This update will happen eventually, but can take some time. To speed the update of the metadata cache, execute **`select gp_metadata_cache_clear();`**.
+-   Note that for hash distributed tables, expanding the cluster will not immediately improve performance since hash distributed tables use a fixed number of virtual segments. In order to obtain better performance with hash distributed tables, you must redistribute the table to the updated cluster by either the [ALTER TABLE](../reference/sql/ALTER-TABLE.html) or [CREATE TABLE AS](../reference/sql/CREATE-TABLE-AS.html) command.
+-   If you are using hash tables, consider updating the `default_hash_table_bucket_number` server configuration parameter to a larger value after expanding the cluster but before redistributing the hash tables.
+
+## <a id="task_hawq_expand"></a>Adding a New Node to an Existing HAWQ Cluster 
+
+The following procedure describes the steps required to add a node to an existing HAWQ cluster.  First ensure that the new node has been configured per the instructions found in [Apache HAWQ System Requirements](../requirements/system-requirements.html) and [Select HAWQ Host Machines](../install/select-hosts.html).
+
+For example purposes in this procedure, we are adding a new node named `sdw4`.
+
+1.  Prepare the target machine by checking operating system configurations and passwordless ssh. HAWQ requires passwordless ssh access to all cluster nodes. To set up passwordless ssh on the new node, perform the following steps:
+    1.  Log in to the master HAWQ node as gpadmin. If you are logged in as a different user, switch to the gpadmin user and source the `greenplum_path.sh` file.
+
+        ```shell
+        $ su - gpadmin
+        $ source /usr/local/hawq/greenplum_path.sh
+        ```
+
+    2.  On the HAWQ master node, change directories to /usr/local/hawq/etc. In this location, create a file called `new_hosts` and add the hostname\(s\) of the node\(s\) you wish to add to the existing HAWQ cluster, one per line. For example:
+
+        ```
+        sdw4
+        ```
+
+    3.  Log in to the master HAWQ node as root and source the `greenplum_path.sh` file.
+
+        ```shell
+        $ su - root
+        $ source /usr/local/hawq/greenplum_path.sh
+        ```
+
+    4.  Execute the following hawq command to set up passwordless ssh for root on the new host machine:
+
+        ```shell
+        $ hawq ssh-exkeys -e hawq_hosts -x new_hosts
+        ```
+
+    5.  Create the gpadmin user on the new host\(s\).
+
+        ```shell
+        $ hawq ssh -f new_hosts -e '/usr/sbin/useradd gpadmin'
+        $ hawq ssh -f new_hosts -e 'echo -e "changeme\nchangeme" | passwd gpadmin'
+        ```
+
+    6.  Switch to the gpadmin user and source the `greenplum_path.sh` file again.
+
+        ```shell
+        $ su - gpadmin
+        $ source /usr/local/hawq/greenplum_path.sh
+        ```
+
+    7.  Execute the following hawq command a second time to set up passwordless ssh for the gpadmin user:
+
+        ```shell
+        $ hawq ssh-exkeys -e hawq_hosts -x new_hosts
+        ```
+
+    8.  (Optional) If you enabled temporary password-based authentication while preparing/configuring your new HAWQ host system, turn off password-based authentication as described in [Apache HAWQ System Requirements](../requirements/system-requirements.html#topic_pwdlessssh).
+
+    9.  After setting up passwordless ssh, you can execute the following hawq command to check the target machine's configuration.
+
+        ```shell
+        $ hawq check -f new_hosts
+        ```
+
+        Configure operating system parameters as needed on the host machine. See the HAWQ installation documentation for a list of specific operating system parameters to configure.
+
+2.  Login to the target host machine `sdw4` as the root user. If you are logged in as a different user, switch to the root account:
+
+    ```shell
+    $ su - root
+    ```
+
+3.  If not already installed, install the target machine \(`sdw4`\) as an HDFS DataNode.
+4.  If you have any user-defined function (UDF) libraries installed in your existing HAWQ cluster, install them on the new node.
+4.  Download and install HAWQ on the target machine \(`sdw4`\) as described in the [software build instructions](https://cwiki.apache.org/confluence/display/HAWQ/Build+and+Install) or in the distribution installation documentation.
+5.  On the HAWQ master node, check current cluster and host information using `psql`.
+
+    ```shell
+    $ psql -d postgres
+    ```
+    
+    ```sql
+    postgres=# SELECT * FROM gp_segment_configuration;
+    ```
+    
+    ```
+     registration_order | role | status | port  | hostname |    address    
+    --------------------+------+--------+-------+----------+---------------
+                     -1 | s    | u      |  5432 | sdw1     | 192.0.2.0
+                      0 | m    | u      |  5432 | mdw      | rhel64-1
+                      1 | p    | u      | 40000 | sdw3     | 192.0.2.2
+                      2 | p    | u      | 40000 | sdw2     | 192.0.2.1
+    (4 rows)
+    ```
+
+    At this point the new node does not appear in the cluster.
+
+6.  Execute the following command to confirm that HAWQ was installed on the new host:
+
+    ```shell
+    $ hawq ssh -f new_hosts -e "ls -l $GPHOME"
+    ```
+
+7.  On the master node, use a text editor to add hostname `sdw4` to the `hawq_hosts` file you created during HAWQ installation. \(If you do not already have this file, create it first and list all the nodes in your cluster.\)
+
+    ```
+    mdw
+    smdw
+    sdw1
+    sdw2
+    sdw3
+    sdw4
+    ```
+
+8.  On the master node, use a text editor to add hostname `sdw4` to the `$GPHOME/etc/slaves` file. This file lists all the segment host names for your cluster. For example:
+
+    ```
+    sdw1
+    sdw2
+    sdw3
+    sdw4
+    ```
+
+9.  Sync the `hawq-site.xml` and `slaves` configuration files to all nodes in the cluster \(as listed in hawq\_hosts\).
+
+    ```shell
+    $ hawq scp -f hawq_hosts hawq-site.xml slaves =:$GPHOME/etc/
+    ```
+
+10. Make sure that the HDFS DataNode service has started on the new node.
+11. On `sdw4`, create directories based on the values assigned to the following properties in `hawq-site.xml`. These new directories must be owned by the same database user \(for example, `gpadmin`\) who will execute the `hawq init segment` command in the next step.
+    -   `hawq_segment_directory`
+    -   `hawq_segment_temp_directory`
+    **Note:** The `hawq_segment_directory` must be empty.
+
+12. On `sdw4`, switch to the database user \(for example, `gpadmin`\), and initialize the segment.
+
+    ```shell
+    $ su - gpadmin
+    $ hawq init segment
+    ```
+
+13. On the master node, check current cluster and host information using `psql` to verify that the new `sdw4` node has initialized successfully.
+
+    ```shell
+    $ psql -d postgres
+    ```
+    
+    ```sql
+    postgres=# SELECT * FROM gp_segment_configuration ;
+    ```
+    
+    ```
+     registration_order | role | status | port  | hostname |    address    
+    --------------------+------+--------+-------+----------+---------------
+                     -1 | s    | u      |  5432 | sdw1     | 192.0.2.0
+                      0 | m    | u      |  5432 | mdw      | rhel64-1
+                      1 | p    | u      | 40000 | sdw3     | 192.0.2.2
+                      2 | p    | u      | 40000 | sdw2     | 192.0.2.1
+                      3 | p    | u      | 40000 | sdw4     | 192.0.2.3
+    (5 rows)
+    ```
+
+14. To maintain optimal cluster performance, rebalance HDFS data by running the following command:
+
+    ```shell
+    $ sudo -u hdfs hdfs balancer -threshold threshold_value
+    ```
+    
+    where *threshold\_value* represents how much a DataNode's disk usage, in percentage, can differ from overall disk usage in the cluster. Adjust the threshold value according to the needs of your production data and disk. The smaller the value, the longer the rebalance time.
+
+    **Note:** If you do not specify a threshold, then a default value of 20 is used. The balancer does not move data on a DataNode whose disk usage is within 20 percentage points of the cluster's overall disk usage. For example, if disk usage across all DataNodes in the cluster is 40% of the cluster's total disk-storage capacity, then the balancer script ensures that each DataNode's disk usage is between 20% and 60% of that DataNode's disk-storage capacity. DataNodes whose disk usage falls within that range are not rebalanced.
+
+    Rebalance time is also affected by network bandwidth. You can adjust network bandwidth used by the balancer by using the following command:
+    
+    ```shell
+    $ sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth network_bandwidth
+    ```
+    
+    The default value is 1MB/s. Adjust the value according to your network.
+
+15. Speed up the clearing of the metadata cache by using the following command:
+
+    ```shell
+    $ psql -d postgres
+    ```
+    
+    ```sql
+    postgres=# SELECT gp_metadata_cache_clear();
+    ```
+
+16. After expansion, if the new size of your cluster is greater than or equal to 4 nodes \(#nodes >= 4\), change the value of the `output.replace-datanode-on-failure` HDFS parameter in `hdfs-client.xml` to `false`.
+
+17. (Optional) If you are using hash tables, adjust the `default_hash_table_bucket_number` server configuration property to reflect the cluster's new size. Update this configuration's value by multiplying the new number of nodes in the cluster by the appropriate amount indicated below.
+
+	|Number of Nodes After Expansion|Suggested default\_hash\_table\_bucket\_number value|
+	|---------------|------------------------------------------|
+	|<= 85|6 \* \#nodes|
+	|\> 85 and <= 102|5 \* \#nodes|
+	|\> 102 and <= 128|4 \* \#nodes|
+	|\> 128 and <= 170|3 \* \#nodes|
+	|\> 170 and <= 256|2 \* \#nodes|
+	|\> 256 and <= 512|1 \* \#nodes|
+	|\> 512|512| 
+   
+18. If you are using hash distributed tables and wish to take advantage of the performance benefits of using a larger cluster, redistribute the data in all hash-distributed tables by using either the [ALTER TABLE](../reference/sql/ALTER-TABLE.html) or [CREATE TABLE AS](../reference/sql/CREATE-TABLE-AS.html) command. You should redistribute the table data if you modified the `default_hash_table_bucket_number` configuration parameter. 
+
+
+	**Note:** The redistribution of table data can take a significant amount of time.
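Three of the numeric rules in the procedure above can be written as small shell functions: the suggested `default_hash_table_bucket_number` from the table, the disk-usage range that a given balancer threshold leaves untouched, and the conversion from MB/s to the bytes-per-second value that `hdfs dfsadmin -setBalancerBandwidth` expects. This is a sketch with hypothetical function names; the thresholds are copied from the table and the note.

```shell
#!/bin/sh
# suggested_bucket_number NODES
# Applies the table for default_hash_table_bucket_number to a node count.
suggested_bucket_number() {
    n=$1
    if   [ "$n" -le 85 ];  then echo $(( 6 * n ))
    elif [ "$n" -le 102 ]; then echo $(( 5 * n ))
    elif [ "$n" -le 128 ]; then echo $(( 4 * n ))
    elif [ "$n" -le 170 ]; then echo $(( 3 * n ))
    elif [ "$n" -le 256 ]; then echo $(( 2 * n ))
    elif [ "$n" -le 512 ]; then echo "$n"
    else echo 512
    fi
}

# balancer_band CLUSTER_USAGE_PCT THRESHOLD
# Prints the per-DataNode usage range (low-high, in percent) that is not rebalanced.
balancer_band() {
    echo "$(( $1 - $2 ))-$(( $1 + $2 ))"
}

# mb_to_bytes MB
# Converts MB/s to bytes per second for -setBalancerBandwidth.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Example (not run here):
#   sudo -u hdfs hdfs dfsadmin -setBalancerBandwidth "$(mb_to_bytes 10)"
```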

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/ClusterShrink.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/ClusterShrink.html.md.erb b/markdown/admin/ClusterShrink.html.md.erb
new file mode 100644
index 0000000..33c5cc2
--- /dev/null
+++ b/markdown/admin/ClusterShrink.html.md.erb
@@ -0,0 +1,55 @@
+---
+title: Removing a Node
+---
+
+This topic outlines the proper procedure for removing a node from a HAWQ cluster.
+
+In general, you should not need to remove nodes manually from running HAWQ clusters. HAWQ isolates any nodes that HAWQ detects as failing due to hardware or other types of errors.
+
+## <a id="topic_p53_ct3_kv"></a>Guidelines for Removing a Node 
+
+If you do need to remove a node from a HAWQ cluster, keep the following guidelines in mind:
+
+-   Never remove more than two nodes at a time since the risk of data loss is high.
+-   Only remove nodes during system maintenance windows when the cluster is not busy or running queries.
+
+## <a id="task_oy5_ct3_kv"></a>Removing a Node from a Running HAWQ Cluster 
+
+The following is a typical procedure to remove a node from a running HAWQ cluster:
+
+1.  Log in as gpadmin to the node that you wish to remove and source `greenplum_path.sh`.
+
+    ```shell
+    $ su - gpadmin
+    $ source /usr/local/hawq/greenplum_path.sh
+    ```
+
+2.  Make sure that there are no running query executors \(QEs\) on the segment. Execute the following command to check for running QE processes:
+
+    ```shell
+    $ ps -ef | grep postgres
+    ```
+
+    In the output, look for processes that contain SQL commands such as INSERT or SELECT. For example:
+
+    ```shell
+    [gpadmin@rhel64-3 ~]$ ps -ef | grep postgres
+    gpadmin 3000 2999 0 Mar21 ? 00:00:08 postgres: port 40000, logger process
+    gpadmin 3003 2999 0 Mar21 ? 00:00:03 postgres: port 40000, stats collector process
+    gpadmin 3004 2999 0 Mar21 ? 00:00:50 postgres: port 40000, writer process
+    gpadmin 3005 2999 0 Mar21 ? 00:00:06 postgres: port 40000, checkpoint process
+    gpadmin 3006 2999 0 Mar21 ? 00:01:25 postgres: port 40000, segment resource manager
+    gpadmin 7880 2999 0 02:08 ? 00:00:00 postgres: port 40000, gpadmin postgres 192.0.2.0(33874) con11 seg0 cmd18 MPPEXEC INSERT
+    ```
+
+3.  Stop HAWQ on this segment by executing the following command:
+
+    ```shell
+    $ hawq stop segment
+    ```
+
+4.  On the HAWQ master, remove the hostname of the segment from the `slaves` file. Then sync the `slaves` file to all nodes in the cluster by executing the following command:
+
+    ```shell
+    $ hawq scp -f hostfile slaves =:$GPHOME/etc/slaves
+    ```
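+
+The check in step 2 can also be scripted. The following is a minimal sketch and not part of HAWQ: the `count_qe_processes` helper name is hypothetical, and the `con<N> seg<N>` pattern it matches is inferred from the sample `ps` output above, so verify it against the process titles on your own cluster before relying on it.
+
+```shell
+# Hypothetical helper: count query executor (QE) processes in `ps -ef` output read from stdin.
+# Assumption: QE process titles contain "con<N> seg<N>", as in the sample output above.
+count_qe_processes() {
+    # grep -c prints 0 and exits non-zero when nothing matches; "|| true" keeps the exit status 0.
+    grep -c -E 'postgres: .* con[0-9]+ seg[0-9]+ ' || true
+}
+
+# Usage on the segment host:
+#   ps -ef | count_qe_processes
+```
+
+A result of `0` suggests that no QEs are running on the segment. Any other value means that queries are still being executed there.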

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/FaultTolerance.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/FaultTolerance.html.md.erb b/markdown/admin/FaultTolerance.html.md.erb
new file mode 100644
index 0000000..fc9de93
--- /dev/null
+++ b/markdown/admin/FaultTolerance.html.md.erb
@@ -0,0 +1,52 @@
+---
+title: Understanding the Fault Tolerance Service
+---
+
+The fault tolerance service (FTS) enables HAWQ to continue operating in the event that a segment node fails. The fault tolerance service runs automatically and requires no additional configuration.
+
+Each segment runs a resource manager process that periodically sends (by default, every 30 seconds) the segment’s status to the master's resource manager process. This interval is controlled by the `hawq_rm_segment_heartbeat_interval` server configuration parameter.
+
+When a segment encounters a critical error (for example, a temporary directory on the segment fails due to a hardware error), the segment reports the temporary directory failure to the HAWQ master through a heartbeat report. When the master receives the report, it marks the segment as DOWN in the `gp_segment_configuration` table. All changes to a segment's status are recorded in the `gp_configuration_history` catalog table, including the reason why the segment is marked as DOWN. While the segment is marked as DOWN, the master does not run query executors on it. The failed segment is fault-isolated from the rest of the cluster.
+
+Besides disk failure, there are other reasons why a segment can be marked as DOWN. For example, if HAWQ is running in YARN mode, every segment should have a NodeManager (Hadoop’s YARN service) running on it, so that the segment can be considered a resource to HAWQ. However, if the NodeManager on a segment is not operating properly, this segment will also be marked as DOWN in the `gp_segment_configuration` table. The corresponding reason for the failure is recorded in `gp_configuration_history`.
+
+**Note:** If a disk fails in a particular segment, the failure may cause either an HDFS error or a temporary directory error in HAWQ. HDFS errors are handled by the Hadoop HDFS service.
+
+## Viewing the Current Status of a Segment <a id="view_segment_status"></a>
+
+To view the current status of the segment, query the `gp_segment_configuration` table.
+
+If the status of a segment is DOWN, the `description` column displays the reason. The column can contain a single reason or several reasons separated by semicolons (";"). The possible reasons are described below.
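+
+As a rough illustration, the sketch below splits a combined `description` value into one reason per line. It rests on assumptions: the `split_reasons` helper is hypothetical, the `hostname` column and the `'u'` status value in the commented query are assumed from the usual `gp_segment_configuration` layout, and `postgres` is only an example database name.
+
+```shell
+# Hypothetical helper: print each reason from a semicolon-separated description on its own line.
+split_reasons() {
+    printf '%s\n' "$1" | tr ';' '\n' | sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d'
+}
+
+# Example query to run on the master (column names other than "description" are assumptions):
+#   psql -d postgres -At -c "SELECT hostname, description FROM gp_segment_configuration WHERE status <> 'u'"
+
+split_reasons "heartbeat timeout;communication error"
+# prints:
+#   heartbeat timeout
+#   communication error
+```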
+
+**Reason: heartbeat timeout**
+
+Master has not received a heartbeat from the segment. If you see this reason, make sure that HAWQ is running on the segment.
+
+If the segment reports a heartbeat at a later time, the segment is marked as UP.
+
+**Reason: failed probing segment**
+
+Master has probed the segment to verify that it is operating normally, and the segment response is NO.
+
+This reason is reported when, while a HAWQ instance is running, the query dispatcher finds that some query executors on the segment are not working normally. The resource manager process on the master then sends a message to this segment. When the segment resource manager receives the message, it checks whether its PostgreSQL postmaster process is working normally and sends a reply to the master. If the reply indicates that the segment's postmaster process is not working normally, the master marks the segment as DOWN with the reason "failed probing segment."
+
+Check the logs of the failed segment and try to restart the HAWQ instance.
+
+**Reason: communication error**
+
+Master cannot connect to the segment.
+
+Check the network connection between the master and the segment.
+
+**Reason: resource manager process was reset**
+
+If the timestamp of the segment resource manager process does not match the previous timestamp, the resource manager process on the segment has been restarted. In this case, the HAWQ master returns the resources on this segment and marks the segment as DOWN. If the master receives a new heartbeat from this segment, it marks the segment as UP again.
+
+**Reason: no global node report**
+
+HAWQ is using YARN for resource management. No cluster report has been received for this segment. 
+
+Check that NodeManager is operating normally on this segment. 
+
+If not, try to start NodeManager on the segment. 
+After NodeManager is started, run `yarn node -list` to confirm that the node appears in the list. If it does, the segment is set to UP.

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
new file mode 100644
index 0000000..b4284be
--- /dev/null
+++ b/markdown/admin/HAWQFilespacesandHighAvailabilityEnabledHDFS.html.md.erb
@@ -0,0 +1,223 @@
+---
+title: HAWQ Filespaces and High Availability Enabled HDFS
+---
+
+If you initialized HAWQ without the HDFS High Availability \(HA\) feature, you can enable it by using the following procedure.
+
+## <a id="enablingthehdfsnamenodehafeature"></a>Enabling the HDFS NameNode HA Feature 
+
+To enable the HDFS NameNode HA feature for use with HAWQ, you need to perform the following tasks:
+
+1. Enable high availability in your HDFS cluster.
+1. Collect information about the target filespace.
+1. Stop the HAWQ cluster and back up the catalog (**Note:** Ambari users must perform this manual step.)
+1. Move the filespace location using the command line tool (**Note:** Ambari users must perform this manual step.)
+1. Reconfigure `${GPHOME}/etc/hdfs-client.xml` and `${GPHOME}/etc/hawq-site.xml` files. Then, synchronize updated configuration files to all HAWQ nodes.
+1. Start the HAWQ cluster and resynchronize the standby master after moving the filespace.
+
+
+### <a id="enablehahdfs"></a>Step 1: Enable High Availability in Your HDFS Cluster 
+
+Enable high availability for NameNodes in your HDFS cluster. See the documentation for your Hadoop distribution for instructions on how to do this. 
+
+**Note:** If you're using Ambari to manage your HDFS cluster, you can use the Enable NameNode HA Wizard. For example, [this Hortonworks HDP procedure](https://docs.hortonworks.com/HDPDocuments/Ambari-2.4.1.0/bk_ambari-user-guide/content/how_to_configure_namenode_high_availability.html) outlines how to do this in Ambari for HDP.
+
+### <a id="collectinginformationaboutthetargetfilespace"></a>Step 2: Collect Information about the Target Filespace 
+
+A default filespace named `dfs_system` exists in the `pg_filespace` catalog, and the `pg_filespace_entry` catalog table contains detailed information for each filespace.
+
+To move the filespace location to an HA-enabled HDFS location, you must move the data to a new path on your HA-enabled HDFS cluster.
+
+1.  Use the following SQL query to gather information about the filespace located on HDFS:
+
+    ```sql
+    SELECT
+        fsname, fsedbid, fselocation
+    FROM
+        pg_filespace AS sp, pg_filespace_entry AS entry, pg_filesystem AS fs
+    WHERE
+        sp.fsfsys = fs.oid AND fs.fsysname = 'hdfs' AND sp.oid = entry.fsefsoid
+    ORDER BY
+        entry.fsedbid;
+    ```
+
+    The sample output is as follows:
+
+    ```
+    fsname       | fsedbid | fselocation
+    -------------+---------+-------------------------------------------------
+    cdbfast_fs_c | 0       | hdfs://hdfs-cluster/hawq//cdbfast_fs_c
+    cdbfast_fs_b | 0       | hdfs://hdfs-cluster/hawq//cdbfast_fs_b
+    cdbfast_fs_a | 0       | hdfs://hdfs-cluster/hawq//cdbfast_fs_a
+    dfs_system   | 0       | hdfs://test5:9000/hawq/hawq-1459499690
+    (4 rows)
+    ```
+
+    The output contains the following:
+    - HDFS paths that share the same prefix
+    - Current filespace location
+
+    **Note:** If you see `{replica=3}` in the filespace location, ignore this part of the prefix. This is a known issue.
+
+2.  To enable HA HDFS, you need the filespace name and the common prefix of your HDFS paths. The filespace location is formatted like a URL.
+
+	If the previous filespace location is 'hdfs://test5:9000/hawq/hawq-1459499690' and the HA HDFS common prefix is 'hdfs://hdfs-cluster', then the new filespace location should be 'hdfs://hdfs-cluster/hawq/hawq-1459499690'.
+
+    ```
+    Filespace Name: dfs_system
+    Old location: hdfs://test5:9000/hawq/hawq-1459499690
+    New location: hdfs://hdfs-cluster/hawq/hawq-1459499690
+    ```
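+
+    The mapping from the old location to the new one can be sketched in shell. This is only an illustration: the `new_filespace_location` helper is hypothetical, and it assumes the old location has the form `hdfs://host:port/path` and that the HA prefix has no trailing slash.
+
+    ```shell
+    # Hypothetical helper: replace the scheme and authority of an HDFS URL with the HA prefix.
+    #   $1 = old filespace location, $2 = HA HDFS common prefix (no trailing slash)
+    new_filespace_location() {
+        old=${1#hdfs://}      # strip the scheme:   test5:9000/hawq/hawq-1459499690
+        path=${old#*/}        # strip host:port/:   hawq/hawq-1459499690
+        printf '%s/%s\n' "$2" "$path"
+    }
+
+    new_filespace_location 'hdfs://test5:9000/hawq/hawq-1459499690' 'hdfs://hdfs-cluster'
+    # prints hdfs://hdfs-cluster/hawq/hawq-1459499690
+    ```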
+
+### <a id="stoppinghawqclusterandbackupcatalog"></a>Step 3: Stop the HAWQ Cluster and Back Up the Catalog 
+
+**Note:** Ambari users must perform this manual step.
+
+When you enable HA HDFS, you are changing the HAWQ catalog and persistent tables. You cannot perform transactions while persistent tables are being updated. Therefore, before you move the filespace location, back up the catalog. This is to ensure that you do not lose data due to a hardware failure or during an operation \(such as killing the HAWQ process\). 
+
+
+1. If you defined a custom port for HAWQ master, export the `PGPORT` environment variable. For example:
+
+	```shell
+	export PGPORT=9000
+	```
+
+1. Save the HAWQ master data directory, found in the `hawq_master_directory` property of `hawq-site.xml`, to an environment variable:
+ 
+	```bash
+	export MDATA_DIR=/path/to/hawq_master_directory
+	```
+
+1.  Disconnect all workload connections. Check for active connections with:
+
+    ```shell
+    $ psql -p ${PGPORT} -c "SELECT * FROM pg_catalog.pg_stat_activity" -d template1
+    ```
+    where `${PGPORT}` corresponds to the port number you optionally customized for HAWQ master. 
+    
+
+2.  Issue a checkpoint: 
+
+    ```shell
+    $ psql -p ${PGPORT} -c "CHECKPOINT" -d template1
+    ```
+
+3.  Shut down the HAWQ cluster: 
+
+    ```shell
+    $ hawq stop cluster -a -M fast
+    ```
+
+4.  Copy the master data directory to a backup location:
+
+    ```shell
+    $ cp -r ${MDATA_DIR} /catalog/backup/location
+    ```
+	The master data directory contains the catalog. Fatal errors can occur due to hardware failure or if you fail to kill a HAWQ process before attempting a filespace location change. Make sure you back this directory up.
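+
+    Because a restore depends entirely on this copy, it can be worth confirming that the copy is complete. The following is a minimal sketch under assumptions: the `backup_master_dir` helper is hypothetical, the timestamped destination in the usage comment is only an example, and the parent of the destination directory must already exist.
+
+    ```shell
+    # Hypothetical helper: copy the master data directory and confirm that the copy is identical.
+    #   $1 = source directory, $2 = backup destination (must not exist yet)
+    backup_master_dir() {
+        if [ ! -d "$1" ]; then
+            echo "source directory not found: $1" >&2
+            return 1
+        fi
+        if [ -e "$2" ]; then
+            echo "backup destination already exists: $2" >&2
+            return 1
+        fi
+        # Copy recursively, preserving modes and timestamps, then compare the two trees.
+        cp -rp "$1" "$2" && diff -r "$1" "$2" >/dev/null
+    }
+
+    # Usage, after the cluster is stopped (the destination path is only an example):
+    #   backup_master_dir "${MDATA_DIR}" "/catalog/backup/location/master-$(date +%Y%m%d%H%M%S)"
+    ```
+
+    Refusing to overwrite an existing destination keeps an earlier backup from being silently merged with a newer one.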
+
+### <a id="movingthefilespacelocation"></a>Step 4: Move the Filespace Location 
+
+**Note:** Ambari users must perform this manual step.
+
+HAWQ provides the command line tool, `hawq filespace`, to move the location of the filespace.
+
+1. If you defined a custom port for HAWQ master, export the `PGPORT` environment variable. For example:
+
+	```shell
+	export PGPORT=9000
+	```
+1. Run the following command to move a filespace location:
+
+	```shell
+	$ hawq filespace --movefilespace default --location=hdfs://hdfs-cluster/hawq_new_filespace
+	```
+	Specify `default` as the value of the `--movefilespace` option. Replace `hdfs://hdfs-cluster/hawq_new_filespace` with the new filespace location.
+
+#### **Important:** Potential Errors During Filespace Move
+
+Non-fatal errors can occur if you provide invalid input or if you have not stopped HAWQ before attempting a filespace location change. Check that you have followed the instructions from the beginning, or correct the input error before you re-run `hawq filespace`.
+
+Fatal errors can occur due to hardware failure or if you fail to kill a HAWQ process before attempting a filespace location change. When a fatal error occurs, you will see the message, "PLEASE RESTORE MASTER DATA DIRECTORY" in the output. If this occurs, shut down the database and restore the `${MDATA_DIR}` that you backed up in Step 3.
+
+### <a id="configuregphomeetchdfsclientxml"></a>Step 5: Update HAWQ to Use NameNode HA by Reconfiguring hdfs-client.xml and hawq-site.xml 
+
+If you install and manage your cluster using command-line utilities, follow these steps to modify your HAWQ configuration to use the NameNode HA service.
+
+**Note:** These steps are not required if you use Ambari to manage HDFS and HAWQ, because Ambari makes these changes automatically after you enable NameNode HA.
+
+For command-line administrators:
+
+1. Edit the `${GPHOME}/etc/hdfs-client.xml` file on each segment and add the following NameNode properties:
+
+    ```xml
+    <property>
+     <name>dfs.ha.namenodes.hdpcluster</name>
+     <value>nn1,nn2</value>
+    </property>
+
+    <property>
+     <name>dfs.namenode.http-address.hdpcluster.nn1</name>
+     <value>ip-address-1.mycompany.com:50070</value>
+    </property>
+
+    <property>
+     <name>dfs.namenode.http-address.hdpcluster.nn2</name>
+     <value>ip-address-2.mycompany.com:50070</value>
+    </property>
+
+    <property>
+     <name>dfs.namenode.rpc-address.hdpcluster.nn1</name>
+     <value>ip-address-1.mycompany.com:8020</value>
+    </property>
+
+    <property>
+     <name>dfs.namenode.rpc-address.hdpcluster.nn2</name>
+     <value>ip-address-2.mycompany.com:8020</value>
+    </property>
+
+    <property>
+     <name>dfs.nameservices</name>
+     <value>hdpcluster</value>
+    </property>
+    ```
+
+    In the listing above:
+    * Replace `hdpcluster` with the actual service ID that is configured in HDFS.
+    * Replace `ip-address-1.mycompany.com:50070` and `ip-address-2.mycompany.com:50070` with the actual NameNode HTTP hosts and port numbers that are configured in HDFS.
+    * Replace `ip-address-1.mycompany.com:8020` and `ip-address-2.mycompany.com:8020` with the actual NameNode RPC hosts and port numbers that are configured in HDFS.
+    * The order of the NameNodes listed in `dfs.ha.namenodes.hdpcluster` is important for performance, especially when running secure HDFS. The first entry (`nn1` in the example above) should correspond to the active NameNode.
+
+2.  Change the following parameter in the `$GPHOME/etc/hawq-site.xml` file:
+
+    ```xml
+    <property>
+        <name>hawq_dfs_url</name>
+        <value>hdpcluster/hawq_default</value>
+        <description>URL for accessing HDFS.</description>
+    </property>
+    ```
+
+    In the listing above:
+    * Replace `hdpcluster` with the actual service ID that is configured in HDFS.
+    * Replace `/hawq_default` with the directory you want to use for storing data on HDFS. Make sure this directory exists and is writable.
+
+3. Copy the updated configuration files to all nodes in the cluster (as listed in `hawq_hosts`).
+
+	```shell
+	$ hawq scp -f hawq_hosts hdfs-client.xml hawq-site.xml =:$GPHOME/etc/
+	```
+
+### <a id="reinitializethestandbymaster"></a>Step 6: Restart the HAWQ Cluster and Resynchronize the Standby Master 
+
+1. Restart the HAWQ cluster:
+
+	```shell
+	$ hawq start cluster -a
+	```
+
+1. Moving the filespace to a new location renders the standby master catalog invalid. To bring the standby up to date, run the following command on the active master to resynchronize the standby master's catalog with the active master:
+
+	```shell
+	$ hawq init standby -n -M fast
+	```

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/HighAvailability.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/HighAvailability.html.md.erb b/markdown/admin/HighAvailability.html.md.erb
new file mode 100644
index 0000000..0c2e32b
--- /dev/null
+++ b/markdown/admin/HighAvailability.html.md.erb
@@ -0,0 +1,37 @@
+---
+title: High Availability in HAWQ
+---
+
+A HAWQ cluster can be made highly available by providing fault-tolerant hardware, by enabling HAWQ or HDFS high-availability features, and by performing regular monitoring and maintenance procedures to ensure the health of all system components.
+
+Hardware components eventually fail, either due to normal wear or to unexpected circumstances. Loss of power can lead to temporarily unavailable components. You can make a system highly available by providing redundant standbys for components that can fail, so that services continue uninterrupted when a failure does occur. In some cases, the cost of redundancy is higher than a user’s tolerance for interruption in service. When this is the case, the goal is to ensure that full service can be restored within an expected timeframe.
+
+With HAWQ, fault tolerance and data availability are achieved with:
+
+* [Hardware Level Redundancy (RAID and JBOD)](#ha_raid)
+* [Master Mirroring](#ha_master_mirroring)
+* [Dual Clusters](#ha_dual_clusters)
+
+## <a id="ha_raid"></a>Hardware Level Redundancy (RAID and JBOD) 
+
+As a best practice, HAWQ deployments should use RAID for master nodes and JBOD for segment nodes. These hardware-level systems provide high-performance redundancy against single-disk failure without relying on database-level fault tolerance. RAID and JBOD provide a lower level of redundancy at the disk level.
+
+## <a id="ha_master_mirroring"></a>Master Mirroring 
+
+There are two masters in a highly available cluster, a primary and a standby. As with segments, the master and standby should be deployed on different hosts so that the cluster can tolerate a single host failure. Clients connect to the primary master and queries can be executed only on the primary master. The standby master is kept up to date by replicating the write-ahead log (WAL) from the primary to the standby.
+
+## <a id="ha_dual_clusters"></a>Dual Clusters 
+
+You can add another level of redundancy to your deployment by maintaining two HAWQ clusters, both storing the same data.
+
+The two main methods for keeping data synchronized on dual clusters are "dual ETL" and "backup/restore."
+
+Dual ETL provides a complete standby cluster with the same data as the primary cluster. ETL (extract, transform, and load) refers to the process of cleansing, transforming, validating, and loading incoming data into a data warehouse. With dual ETL, this process is executed twice in parallel, once on each cluster, and is validated each time. It also allows data to be queried on both clusters, doubling the query throughput.
+
+Applications can take advantage of both clusters and also ensure that the ETL is successful and validated on both clusters.
+
+To maintain a dual cluster with the backup/restore method, create backups of the primary cluster and restore them on the secondary cluster. This method takes longer to synchronize data on the secondary cluster than the dual ETL strategy, but requires less application logic to be developed. Populating a second cluster with backups is ideal in use cases where data modifications and ETL are performed daily or less frequently.
+
+See [Backing Up and Restoring HAWQ](BackingUpandRestoringHAWQDatabases.html) for instructions on how to back up and restore HAWQ.
+
+

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/MasterMirroring.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/MasterMirroring.html.md.erb b/markdown/admin/MasterMirroring.html.md.erb
new file mode 100644
index 0000000..b9352f0
--- /dev/null
+++ b/markdown/admin/MasterMirroring.html.md.erb
@@ -0,0 +1,144 @@
+---
+title: Using Master Mirroring
+---
+
+There are two masters in a HAWQ cluster: a primary master and a standby master. Clients connect to the primary master and queries can be executed only on the primary master.
+
+You deploy a backup or mirror of the master instance on a separate host machine from the primary master so that the cluster can tolerate a single host failure. A backup master or standby master serves as a warm standby if the primary master becomes non-operational. You create a standby master from the primary master while the primary is online.
+
+The primary master continues to provide services to users while HAWQ takes a transactional snapshot of the primary master instance. In addition to taking a transactional snapshot and deploying it to the standby master, HAWQ also records changes to the primary master. After HAWQ deploys the snapshot to the standby master, HAWQ deploys the updates to synchronize the standby master with the primary master.
+
+After the primary master and standby master are synchronized, HAWQ keeps the standby master up to date using walsender and walreceiver, write-ahead log (WAL)-based replication processes. The walreceiver is a standby master process. The walsender process is a primary master process. The two processes use WAL-based streaming replication to keep the primary and standby masters synchronized.
+
+Since the master does not house user data, only system catalog tables are synchronized between the primary and standby masters. When these tables are updated, changes are automatically copied to the standby master to keep it current with the primary.
+
+*Figure 1: Master Mirroring in HAWQ*
+
+![](../mdimages/standby_master.jpg)
+
+
+If the primary master fails, the replication process stops, and an administrator can activate the standby master. Upon activation of the standby master, the replicated logs reconstruct the state of the primary master at the time of the last successfully committed transaction. The activated standby then functions as the HAWQ master, accepting connections on the port specified when the standby master was initialized.
+
+If the master fails, the administrator uses command line tools or Ambari to instruct the standby master to take over as the new primary master. 
+
+**Tip:** You can configure a virtual IP address for the master and standby so that client programs do not have to switch to a different network address when the ‘active’ master changes. If the master host fails, the virtual IP address can be swapped to the actual acting master.
+
+## Configuring Master Mirroring <a id="standby_master_configure"></a>
+
+You can configure a new HAWQ system with a standby master during HAWQ’s installation process, or you can add a standby master later. This topic assumes you are adding a standby master to an existing node in your HAWQ cluster.
+
+### Add a standby master to an existing system
+
+1. Ensure the host machine for the standby master has been installed with HAWQ and configured accordingly:
+    * The gpadmin system user has been created.
+    * HAWQ binaries are installed.
+    * HAWQ environment variables are set.
+    * SSH keys have been exchanged.
+    * HAWQ Master Data directory has been created.
+
+2. Initialize the HAWQ master standby:
+
+    a. If you use Ambari to manage your cluster, follow the instructions in [Adding a HAWQ Standby Master](ambari-admin.html#amb-add-standby).
+
+    b. If you do not use Ambari, log in to the HAWQ master and re-initialize the HAWQ master standby node:
+ 
+    ``` shell
+    $ ssh gpadmin@<hawq_master>
+    hawq_master$ . /usr/local/hawq/greenplum_path.sh
+    hawq_master$ hawq init standby -s <new_standby_master>
+    ```
+
+    where \<new\_standby\_master\> identifies the hostname of the standby master.
+
+3. Check the status of master mirroring by querying the `gp_master_mirroring` system view. See [Checking on the State of Master Mirroring](#standby_check) for instructions.
+
+4. To activate or fail over to the standby master, see [Failing Over to a Standby Master](#standby_failover).
+
+## Failing Over to a Standby Master <a id="standby_failover"></a>
+
+If the primary master fails, log replication stops. You must explicitly activate the standby master in this circumstance.
+
+Upon activation of the standby master, HAWQ reconstructs the state of the master at the time of the last successfully committed transaction.
+
+### To activate the standby master
+
+1. Ensure that a standby master host has been configured for the system.
+
+2. Activate the standby master:
+
+    a. If you use Ambari to manage your cluster, follow the instructions in [Activating the HAWQ Standby Master](ambari-admin.html#amb-activate-standby).
+
+    b. If you do not use Ambari, log in to the HAWQ master and activate the HAWQ master standby node:
+
+    ``` shell
+    hawq_master$ hawq activate standby
+    ```
+    After you activate the standby master, it becomes the active or primary master for the HAWQ cluster.
+
+4. (Optional, but recommended.) Configure a new standby master. See [Add a standby master to an existing system](#standby_master_configure) for instructions.
+	
+5. Check the status of the HAWQ cluster by executing the following command on the master:
+
+	```shell
+	hawq_master$ hawq state
+	```
+	
+	The newly-activated master's status should be **Active**. If you configured a new standby master, its status is **Passive**. When a standby master is not configured, the command displays `-No entries found`, indicating that no standby master instance is configured.
+
+6. Query the `gp_segment_configuration` table to verify that segments have registered themselves to the new master:
+
+    ``` shell
+    hawq_master$ psql dbname -c 'SELECT * FROM gp_segment_configuration;'
+    ```
+	
+7. Finally, check the status of master mirroring by querying the `gp_master_mirroring` system view. See [Checking on the State of Master Mirroring](#standby_check) for instructions.
+
+
+## Checking on the State of Master Mirroring <a id="standby_check"></a>
+
+To check on the status of master mirroring, query the `gp_master_mirroring` system view. This view provides information about the walsender process used for HAWQ master mirroring. 
+
+```shell
+hawq_master$ psql dbname -c 'SELECT * FROM gp_master_mirroring;'
+```
+
+If a standby master has not been set up for the cluster, you will see the following output:
+
+```
+ summary_state  | detail_state | log_time | error_message
+----------------+--------------+----------+---------------
+ Not Configured |              |          | 
+(1 row)
+```
+
+If the standby is configured and in sync with the master, you will see output similar to the following:
+
+```
+ summary_state | detail_state | log_time               | error_message
+---------------+--------------+------------------------+---------------
+ Synchronized  |              | 2016-01-22 21:53:47+00 |
+(1 row)
+```
+
+## Resynchronizing Standby with the Master <a id="resync_master"></a>
+
+The standby can become out-of-date if the log synchronization process between the master and standby has stopped or has fallen behind. If this occurs, you will observe output similar to the following after querying the `gp_master_mirroring` view:
+
+```
+   summary_state  | detail_state | log_time               | error_message
+------------------+--------------+------------------------+---------------
+ Not Synchronized |              |                        |
+(1 row)
+```
+
+To resynchronize the standby with the master:
+
+1. If you use Ambari to manage your cluster, follow the instructions in [Removing the HAWQ Standby Master](ambari-admin.html#amb-remove-standby).
+
+2. If you do not use Ambari, execute the following command on the HAWQ master:
+
+    ```shell
+    hawq_master$ hawq init standby -n
+    ```
+
+    This command stops and restarts the master and then synchronizes the standby.
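+
+The three `summary_state` values shown in this topic can drive a simple check script. The sketch below is an assumption-laden illustration rather than a HAWQ utility: the `mirroring_action` helper is hypothetical, and it recognizes only the states that appear in the sample output above, treating anything else as unknown.
+
+```shell
+# Hypothetical helper: map a summary_state value from gp_master_mirroring to a suggested action.
+mirroring_action() {
+    case "$1" in
+        "Synchronized")     echo "ok: standby is in sync" ;;
+        "Not Synchronized") echo "action: resynchronize the standby (hawq init standby -n)" ;;
+        "Not Configured")   echo "info: no standby master is configured" ;;
+        *)                  echo "unknown state: $1"; return 1 ;;
+    esac
+}
+
+# Usage on the master (dbname is a placeholder, as in the examples above):
+#   mirroring_action "$(psql dbname -At -c 'SELECT summary_state FROM gp_master_mirroring;')"
+```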

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/RecommendedMonitoringTasks.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/RecommendedMonitoringTasks.html.md.erb b/markdown/admin/RecommendedMonitoringTasks.html.md.erb
new file mode 100644
index 0000000..5083b44
--- /dev/null
+++ b/markdown/admin/RecommendedMonitoringTasks.html.md.erb
@@ -0,0 +1,259 @@
+---
+title: Recommended Monitoring and Maintenance Tasks
+---
+
+This section lists monitoring and maintenance activities recommended to ensure high availability and consistent performance of your HAWQ cluster.
+
+The tables in the following sections suggest activities that a HAWQ System Administrator can perform periodically to ensure that all components of the system are operating optimally. Monitoring activities help you to detect and diagnose problems early. Maintenance activities help you to keep the system up-to-date and avoid deteriorating performance, for example, from bloated system tables or diminishing free disk space.
+
+It is not necessary to implement all of these suggestions in every cluster; use the frequency and severity recommendations as a guide to implement measures according to your service requirements.
+
+## <a id="drr_5bg_rp"></a>Database State Monitoring Activities 
+
+<table>
+  <tr>
+    <th>Activity</th>
+    <th>Procedure</th>
+    <th>Corrective Actions</th>
+  </tr>
+  <tr>
+    <td><p>List segments that are currently down. If any rows are returned, this should generate a warning or alert.</p>
+    <p>Recommended frequency: run every 5 to 10 minutes</p><p>Severity: IMPORTANT</p></td>
+    <td>Run the following query in the <code>postgres</code> database:
+    <pre><code>SELECT * FROM gp_segment_configuration
+WHERE status &lt;&gt; 'u';
+</code></pre>
+  </td>
+  <td>If the query returns any rows, follow these steps to correct the problem:
+  <ol>
+    <li>Verify that the hosts with down segments are responsive.</li>
+    <li>If hosts are OK, check the pg_log files for the down segments to discover the root cause of the segments going down.</li>
+    </ol>
+    </td>
+    </tr>
+  <tr>
+    <td>
+      <p>Run a distributed query to test that it runs on all segments. One row should be returned for each segment.</p>
+      <p>Recommended frequency: run every 5 to 10 minutes</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Execute the following query in the <code>postgres</code> database:</p>
+      <pre><code>SELECT gp_segment_id, count(&#42;)
+FROM gp_dist_random('pg_class')
+GROUP BY 1;
+</code></pre>
+  </td>
+  <td>If this query fails, there is an issue dispatching to some segments in the cluster. This is a rare event. Check the hosts that could not be dispatched to and verify that there is no hardware or networking issue.</td>
+  </tr>
+  <tr>
+    <td>
+      <p>Perform a basic check to see if the master is up and functioning.</p>
+      <p>Recommended frequency: run every 5 to 10 minutes</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Run the following query in the <code>postgres</code> database:</p>
+      <pre><code>SELECT count(&#42;) FROM gp_segment_configuration;</code></pre>
+    </td>
+    <td>
+      <p>If this query fails, the active master may be down. Try again several times and then inspect the active master manually. If the active master is down, reboot or power cycle the active master to ensure no processes remain on the active master and then trigger the activation of the standby master.</p>
+    </td>
+  </tr>
+</table>
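+
+The first check in the table above lends itself to automation. The following is a minimal sketch, assuming the query output is produced with `psql -At` (one row per line, no headers or footers). The `alert_on_down_segments` helper name and its message format are hypothetical.
+
+```shell
+# Hypothetical helper: read unaligned psql output (one row per line) from stdin and
+# warn if any rows are present. Returns 1 when at least one row is found.
+alert_on_down_segments() {
+    # Count non-empty lines; grep -c exits non-zero on a count of 0, so ignore its status.
+    rows=$(grep -c . || true)
+    if [ "$rows" -gt 0 ]; then
+        echo "WARNING: $rows segment(s) not up"
+        return 1
+    fi
+    echo "OK: all segments up"
+}
+
+# Usage (for example, from cron every 5 to 10 minutes):
+#   psql -d postgres -At -c "SELECT * FROM gp_segment_configuration WHERE status <> 'u';" | alert_on_down_segments
+```
+
+The non-zero return status lets a scheduler or monitoring agent raise the alert without parsing the message text.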
+
+## <a id="topic_y4c_4gg_rp"></a>Hardware and Operating System Monitoring 
+
+<table>
+  <tr>
+    <th>Activity</th>
+    <th>Procedure</th>
+    <th>Corrective Actions</th>
+  </tr>
+  <tr>
+    <td>
+      <p>Check the underlying platform for required maintenance or hardware that is down.</p>
+      <p>Recommended frequency: real-time, if possible, or every 15 minutes</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Set up system check for hardware and OS errors.</p>
+    </td>
+    <td>
+      <p>If required, remove a machine from the HAWQ cluster to resolve hardware and OS issues, then add it back to the cluster.</p>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <p>Check disk space usage on volumes used for HAWQ data storage and the OS.</p>
+      <p>Recommended frequency: every 5 to 30 minutes</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Set up a disk space check.</p>
+      <ul>
+        <li>Set a threshold to raise an alert when a disk reaches a percentage of capacity. The recommended threshold is 75% full.</li>
+        <li>Avoid running the system with disk usage approaching 100%.</li>
+      </ul>
+    </td>
+    <td>
+      <p>Free space on the system by removing some data or files.</p>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <p>Check for errors or dropped packets on the network interfaces.</p>
+      <p>Recommended frequency: hourly</p>
+      <p>Severity: IMPORTANT</p>
+    </td>
+    <td>
+      <p>Set up network interface checks.</p>
+    </td>
+    <td>
+      <p>Work with network and OS teams to resolve errors.</p>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <p>Check for RAID errors or degraded RAID performance.</p>
+      <p>Recommended frequency: every 5 minutes</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Set up a RAID check.</p>
+    </td>
+    <td>
+      <ul>
+        <li>Replace failed disks as soon as possible.</li>
+        <li>Work with system administration team to resolve other RAID or controller errors as soon as possible.</li>
+      </ul>
+    </td>
+  </tr>
+  <tr>
+    <td>
+      <p>Check for adequate I/O bandwidth and I/O skew.</p>
+      <p>Recommended frequency: when creating a cluster or when hardware issues are suspected</p>
+    </td>
+    <td>
+      <p>Run the <code>hawq checkperf</code> utility.</p>
+    </td>
+    <td>
+      <p>The cluster may be under-specified if data transfer rates are not similar to the following:</p>
+      <ul>
+        <li>2 GB per second disk read</li>
+        <li>1 GB per second disk write</li>
+        <li>10 gigabits per second network read and write</li>
+      </ul>
+      <p>If transfer rates are lower than expected, consult with your data architect regarding performance expectations.</p>
+      <p>If the machines on the cluster display an uneven performance profile, work with the system administration team to fix faulty machines.</p>
+    </td>
+  </tr>
+</table>
+
+## <a id="maintentenance_check_scripts"></a>Data Maintenance 
+
+<table>
+  <tr>
+    <th>Activity</th>
+    <th>Procedure</th>
+    <th>Corrective Actions</th>
+  </tr>
+  <tr>
+    <td>Check for missing statistics on tables.</td>
+    <td>Check the <code>hawq_stats_missing</code> view in each database:
+    <pre><code>SELECT * FROM hawq_toolkit.hawq_stats_missing;</code></pre>
+    </td>
+    <td>Run <code>ANALYZE</code> on tables that are missing statistics.</td>
+  </tr>
+</table>
+
+## <a id="topic_dld_23h_rp"></a>Database Maintenance 
+
+<table>
+  <tr>
+    <th>Activity</th>
+    <th>Procedure</th>
+    <th>Corrective Actions</th>
+  </tr>
+  <tr>
+    <td>
+      <p>Mark deleted rows in HAWQ system catalogs (tables in the <code>pg_catalog</code> schema) so that the space they occupy can be reused.</p>
+      <p>Recommended frequency: daily</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Vacuum each system catalog:</p>
+      <pre><code>VACUUM &lt;<i>table</i>&gt;;</code></pre>
+    </td>
+    <td>Vacuum system catalogs regularly to prevent bloating.</td>
+  </tr>
+  <tr>
+    <td>
+    <p>Vacuum all system catalogs (tables in the <code>pg_catalog</code> schema) that are approaching <a href="../reference/guc/parameter_definitions.html">vacuum_freeze_min_age</a>.</p>
+    <p>Recommended frequency: daily</p>
+    <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Vacuum an individual system catalog table:</p>
+      <pre><code>VACUUM &lt;<i>table</i>&gt;;</code></pre>
+    </td>
+    <td>After the <a href="../reference/guc/parameter_definitions.html">vacuum_freeze_min_age</a> value is reached, VACUUM will no longer replace transaction IDs with <code>FrozenXID</code> while scanning a table. Perform vacuum on these tables before the limit is reached.</td>
+  </tr>
+  <tr>
+    <td>
+      <p>Update table statistics.</p>
+      <p>Recommended frequency: after loading data and before executing queries</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>
+      <p>Analyze user tables:</p>
+      <pre><code>analyzedb -d &lt;<i>database</i>&gt; -a</code></pre>
+    </td>
+    <td>Analyze updated tables regularly so that the optimizer can produce efficient query execution plans.</td>
+  </tr>
+  <tr>
+    <td>
+      <p>Back up the database data.</p>
+      <p>Recommended frequency: daily, or as required by your backup plan</p>
+      <p>Severity: CRITICAL</p>
+    </td>
+    <td>See <a href="BackingUpandRestoringHAWQDatabases.html">Backing Up and Restoring HAWQ</a> for a discussion of backup procedures.</td>
+    <td>Best practice is to have a current backup ready in case the database must be restored.</td>
+  </tr>
+  <tr>
+    <td>
+      <p>Vacuum system catalogs (tables in the <code>pg_catalog</code> schema) to maintain an efficient catalog.</p>
+      <p>Recommended frequency: weekly, or more often if database objects are created and dropped frequently</p>
+    </td>
+    <td>
+      <p><code>VACUUM</code> the system tables in each database.</p>
+    </td>
+    <td>The optimizer retrieves information from the system tables to create query plans. If system tables and indexes are allowed to become bloated over time, scanning the system tables increases query execution time.</td>
+  </tr>
+</table>
+
+## <a id="topic_idx_smh_rp"></a>Patching and Upgrading 
+
+<table>
+  <tr>
+    <th>Activity</th>
+    <th>Procedure</th>
+    <th>Corrective Actions</th>
+  </tr>
+  <tr>
+    <td>
+      <p>Ensure any bug fixes or enhancements are applied to the kernel.</p>
+      <p>Recommended frequency: at least every 6 months</p>
+      <p>Severity: IMPORTANT</p>
+    </td>
+    <td>Follow the vendor's instructions to update the Linux kernel.</td>
+    <td>Keep the kernel current to include bug fixes and security fixes, and to avoid difficult future upgrades.</td>
+  </tr>
+  <tr>
+    <td>
+      <p>Install HAWQ minor releases.</p>
+      <p>Recommended frequency: quarterly</p>
+      <p>Severity: IMPORTANT</p>
+    </td>
+    <td>Always upgrade to the latest in the series.</td>
+    <td>Keep the HAWQ software current to incorporate bug fixes, performance enhancements, and feature enhancements into your HAWQ cluster.</td>
+  </tr>
+</table>
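
The disk space check in the Hardware and Operating System Monitoring table above (alert when a volume reaches 75% of capacity) could be scripted roughly as follows. This is a minimal sketch, not a HAWQ utility: `check_disk_usage` is a hypothetical helper name, and it assumes POSIX `df -P` output on stdin with mount points that contain no spaces.

```shell
# Sketch: flag filesystems at or above a usage threshold.
# Reads `df -P` formatted text on stdin so it can be exercised with sample input.
# check_disk_usage is a hypothetical helper, not part of HAWQ.
check_disk_usage() {
    # $1 is the alert threshold as a whole-number percentage (for example 75).
    # Skip the header line; field 5 is Capacity (e.g. "81%"), field 6 is the mount point.
    awk -v limit="$1" 'NR > 1 {
        used = $5
        sub(/%/, "", used)
        if (used + 0 >= limit) printf "ALERT %s %s%%\n", $6, used
    }'
}

# Typical use from cron or a monitoring agent:
#   df -P | check_disk_usage 75
```

The function prints one `ALERT` line per volume at or above the threshold and prints nothing otherwise, so a scheduler can treat any output as a reason to notify an operator.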

http://git-wip-us.apache.org/repos/asf/incubator-hawq-docs/blob/de1e2e07/markdown/admin/RunningHAWQ.html.md.erb
----------------------------------------------------------------------
diff --git a/markdown/admin/RunningHAWQ.html.md.erb b/markdown/admin/RunningHAWQ.html.md.erb
new file mode 100644
index 0000000..c7de1d5
--- /dev/null
+++ b/markdown/admin/RunningHAWQ.html.md.erb
@@ -0,0 +1,37 @@
+---
+title: Running a HAWQ Cluster
+---
+
+This section provides information for system administrators responsible for administering a HAWQ deployment.
+
+You should have some knowledge of Linux/UNIX system administration, database management systems, database administration, and structured query language \(SQL\) to administer a HAWQ cluster. Because HAWQ is based on PostgreSQL, you should also have some familiarity with PostgreSQL. The HAWQ documentation calls out similarities between HAWQ and PostgreSQL features throughout.
+
+## <a id="hawq_users"></a>HAWQ Users
+
+HAWQ supports users with both administrative and operating privileges. The HAWQ administrator may choose to manage the HAWQ cluster using either Ambari or the command line. [Managing HAWQ Using Ambari](../admin/ambari-admin.html) provides Ambari-specific HAWQ cluster administration procedures. [Starting and Stopping HAWQ](startstop.html), [Expanding a Cluster](ClusterExpansion.html), and [Removing a Node](ClusterShrink.html) describe specific command-line-managed HAWQ cluster administration procedures. Other topics in this guide are applicable to both Ambari- and command-line-managed HAWQ clusters.
+
+The default HAWQ administrator user is named `gpadmin`. The HAWQ administrator may choose to assign administrative and/or operating HAWQ privileges to additional users. Refer to [Configuring Client Authentication](../clientaccess/client_auth.html) and [Managing Roles and Privileges](../clientaccess/roles_privs.html) for additional information about HAWQ user configuration.
+
+## <a id="hawq_systems"></a>HAWQ Deployment Systems
+
+A typical HAWQ deployment includes one master node and one standby node each for HDFS and HAWQ, plus multiple HAWQ segment and HDFS data nodes. The HAWQ cluster may also include systems running the HAWQ Extension Framework (PXF) and other Hadoop services. Refer to [HAWQ Architecture](../overview/HAWQArchitecture.html) and [Select HAWQ Host Machines](../install/select-hosts.html) for information about the different systems in a HAWQ deployment and how they are configured.
+
+
+## <a id="hawq_env_databases"></a>HAWQ Databases
+
+[Creating and Managing Databases](../ddl/ddl-database.html) and [Creating and Managing Tables](../ddl/ddl-table.html) describe HAWQ database and table creation commands.
+
+You manage HAWQ databases at the command line using the [psql](../reference/cli/client_utilities/psql.html) utility, an interactive front-end to the HAWQ database. Configuring client access to HAWQ databases and tables may require information related to [Establishing a Database Session](../clientaccess/g-establishing-a-database-session.html).
+
+[HAWQ Database Drivers and APIs](../clientaccess/g-database-application-interfaces.html) identifies supported HAWQ database drivers and APIs for additional client access methods.
+
+## <a id="hawq_env_data"></a>HAWQ Data
+
+HAWQ internal data resides in HDFS. You may also require access to data in different formats and locations in your data lake. You can use HAWQ and the HAWQ Extension Framework (PXF) to access and manage both internal and external data:
+
+- [Managing Data with HAWQ](../datamgmt/dml.html) discusses the basic data operations and details regarding the loading and unloading semantics for HAWQ internal tables.
+- [Using PXF with Unmanaged Data](../pxf/HawqExtensionFrameworkPXF.html) describes PXF, an extensible framework you may use to query data external to HAWQ.
+
+## <a id="hawq_env_setup"></a>HAWQ Operating Environment
+
+Refer to [Introducing the HAWQ Operating Environment](setuphawqopenv.html) for a discussion of the HAWQ operating environment, including a procedure to set up the HAWQ environment. This section also provides an introduction to the important files and directories in a HAWQ installation.
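
Because `psql` can also run non-interactively, it can drive the master health check from the monitoring checklist earlier in this commit, which advises retrying several times before inspecting the master manually. The following is a minimal sketch under stated assumptions: `retry_check` is a hypothetical helper (not a HAWQ utility), and the usage comment assumes `psql` can reach the master as the current user.

```shell
# Sketch: retry a health-check command a few times before declaring failure.
# retry_check is a hypothetical helper, not part of HAWQ.
retry_check() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Run the remaining arguments as the check command; discard its output.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
    done
    # Every attempt failed.
    return 1
}

# Typical use:
#   retry_check 3 psql -d postgres -At -c 'SELECT count(*) FROM gp_segment_configuration;' \
#       || echo "master check failed after 3 attempts"
```

The sketch retries immediately. A real monitor would probably pause between attempts so that a brief network interruption is not reported as a master failure.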

