From: yozie@apache.org
To: commits@hawq.incubator.apache.org
Reply-To: dev@hawq.incubator.apache.org
Date: Fri, 20 Jan 2017 16:51:51 -0000
Message-Id: <073b028ce5ff407fa34f341109df4058@git.apache.org>
Subject: [15/22] incubator-hawq-site git commit: rebuilding html after merge of following fixes: HAWQ-1119, HAWQ-1263, HAWQ-1252, HAWQ-1251, HAWQ-1272, HAWQ-1263

http://git-wip-us.apache.org/repos/asf/incubator-hawq-site/blob/cce3ea48/docs/userguide/2.1.0.0-incubating/install/aws-config.html
----------------------------------------------------------------------
diff --git a/docs/userguide/2.1.0.0-incubating/install/aws-config.html b/docs/userguide/2.1.0.0-incubating/install/aws-config.html
index 5994d65..33abcf9 100644
--- a/docs/userguide/2.1.0.0-incubating/install/aws-config.html
+++ b/docs/userguide/2.1.0.0-incubating/install/aws-config.html
@@ -129,11 +129,13 @@
  • Introducing the HAWQ Operating Environment
  • -
  • +
  • Managing HAWQ Using Ambari -
  • -
  • - Using the Ambari REST API +
  • Starting and Stopping HAWQ
@@ -413,6 +415,7 @@
  • Accessing Hive Data
  • Accessing HBase Data
  • Accessing JSON Data
  • +
  • Writing Data to HDFS
  • Using Profiles to Read and Write Data
  • PXF External Tables and API
  • Troubleshooting PXF
  •
@@ -740,6 +743,7 @@
@@ -988,31 +996,41 @@

    Create and Launch HAWQ Instances

    -

    Use the Amazon EC2 Console to launch instances and configure, start, stop, and terminate (delete) virtual servers. When you launch a HAWQ instance, you select and configure key attributes via the EC2 Console.

    +

    Use the Amazon EC2 Console to launch instances and configure, start, stop, and terminate (delete) virtual servers. When you launch a HAWQ instance, you select and configure key attributes via the EC2 Console.

    Choose AMI Type

    -

    An Amazon Machine Image (AMI) is a template that contains a software configuration including the operating system, application server, and applications that best suit your purpose. When configuring a HAWQ virtual instance, we recommend you use a hardware virtualized AMI running 64-bit Red Hat Enterprise Linux version 6.4 or 6.5 or 64-bit CentOS 6.4 or 6.5. Obtain the licenses and instances directly from the OS provider.

    +

    An Amazon Machine Image (AMI) is a template that contains a specific software configuration including the operating system, application server, and applications that best suit your purpose. When configuring a HAWQ virtual instance, use a hardware virtualized (HVM) AMI supporting enhanced 10 Gbps networking. Ensure the AMI is running 64-bit Red Hat Enterprise Linux version 6.4 or 6.5 or 64-bit CentOS 6.4 or 6.5. Obtain the licenses and instances directly from the OS provider.

    Consider Storage

    -

    EC2 instances can be launched as either Elastic Block Store (EBS)-backed or instance store-backed.

    +

    You can launch EC2 instances as either Elastic Block Store (EBS)-backed or instance store-backed. Choose the storage type based on the expected lifetime of your cluster and data.

    + +

    Instance Store-Backed

    + +

    Use instance store-backed storage for short-lived or transient clusters that do not require long-term persistence of data. While instance store-backed storage generally performs better than EBS, it is not recommended for use in a production environment.

    + +

    Warning: EC2 instance store-backed storage provides temporary block-level storage. This storage is located on disks that are physically attached to the host computer. You will lose all instance store data when the AMI instance is powered off.

    -

    Instance store-backed storage is generally better performing than EBS and recommended for HAWQ’s large data workloads. SSD (solid state) instance store is preferred over magnetic drives.

    +

    EBS-Backed

    -

    Note EC2 instance store provides temporary block-level storage. This storage is located on disks that are physically attached to the host computer. While instance store provides high performance, powering off the instance causes data loss. Soft reboots preserve instance store data.

    +

    EBS volumes are reliable and highly available. Use EBS-backed storage for longer running clusters where data must be quickly accessible and must remain available for a long period of time.

    -

    Virtual devices for instance store volumes for HAWQ EC2 instance store instances are named ephemeralN (where N varies based on instance type). CentOS instance store block devices are named /dev/xvdletter (where letter is a lower case letter of the alphabet).

    +

    Volume Types

    + +

    When selecting between HDD and SSD volume types, the trade-offs are between speed, capacity, and cost. HDD volumes are less expensive and have greater disk capacity, but may be less performant. SSD (solid state drive) volumes are more performant, but costlier and typically have less disk capacity.

    Configure Placement Group

    -

    A placement group is a logical grouping of instances within a single availability zone that together participate in a low-latency, 10 Gbps network. Your HAWQ master and segment cluster instances should support enhanced networking and reside in a single placement group (and subnet) for optimal network performance.

    +

    A placement group is a logical grouping of instances within a single availability zone that participate in a low-latency, 10 Gbps network. Your HAWQ master and segment cluster instances should support enhanced networking and reside in a single placement group (and subnet) for optimal network performance.

    If your Ambari node is not a DataNode, locating the Ambari node instance in a subnet separate from the HAWQ master/segment placement group enables you to manage multiple HAWQ clusters from the single Ambari instance.

    Amazon recommends that you use the same instance type for all instances in the placement group and that you launch all instances within the placement group at the same time.

    -

    Membership in a placement group has some implications on your HAWQ cluster. Specifically, growing the cluster over capacity may require shutting down all HAWQ instances in the current placement group and restarting the instances to a new placement group. Instance store volumes are lost in this scenario.

    +

    Membership in a placement group has some implications for your HAWQ cluster. Specifically, growing the cluster beyond the placement group capacity may require shutting down all HAWQ instances in the current placement group and restarting the instances in a new placement group. Warning: Instance store volumes are lost in this scenario.

    + +

    Note: If cluster downtime during expansion is not acceptable for your HAWQ deployment, do not use placement groups.

    Select EC2 Instance Type

    @@ -1028,23 +1046,17 @@
     Memory (GB)
     Disk Capacity (GB)
     Storage Type
    +Network Speed

    -cc2.8xlarge
    -Dev
    -32
    -60.5
    -4 x 840
    -HDD
    -
    -
     d2.2xlarge
     Dev
     8
     60
     6 x 2000
     HDD
    +High

     d2.4xlarge
    @@ -1053,22 +1065,34 @@
     122
     12 x 2000
     HDD
    +High

    -i2.8xlarge
    -Prod
    +c3.8xlarge
    +Dev/QA
    +32
    +60
    +2 x 320
    +SSD
    +10 Gigabit
    +
    +
    +r3.8xlarge
    +Dev/QA
     32
     244
    -8 x 800
    +2 x 320
     SSD
    +10 Gigabit

    -hs1.8xlarge
    +i2.8xlarge
     Prod
    -16
    -117
    -24 x 2000
    -HDD
    +32
    +244
    +8 x 800
    +SSD
    +10 Gigabit

     d2.8xlarge
    @@ -1077,54 +1101,90 @@
     244
     24 x 2000
     HDD
    +10 Gigabit

    -

    For optimal network performance, the chosen HAWQ instance type should support EC2 enhanced networking. Enhanced networking results in higher performance, lower latency, and lower jitter. Refer to Enhanced Networking on Linux Instances for detailed information on enabling enhanced networking in your instances.

    +

    Note: This list is not exhaustive. You may find other instance types with similar specifications suitable for your HAWQ deployment.

    -

    All instance types identified in the table above support enhanced networking.

    +

    For optimal network performance, the chosen HAWQ instance type should support EC2 enhanced networking. Enhanced networking results in higher performance, lower latency, and lower jitter. Refer to Enhanced Networking on Linux Instances for detailed information on enabling enhanced networking in supported EC2 instance types.
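
    The instance type table in this hunk can be treated as a small lookup catalog. The following is a minimal Python sketch, not part of the HAWQ documentation: it includes only the rows whose cells are all visible in this diff (d2.4xlarge and d2.8xlarge are partly elided by the hunk boundaries and are left out), and the field name cpus is an assumption because that column header does not appear in the excerpt.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    env: str
    cpus: int          # assumed to be the vCPU count; header not shown in the diff
    memory_gb: int
    disk_count: int
    disk_gb: int
    storage: str       # "HDD" or "SSD"
    network: str       # "High" or "10 Gigabit"

    @property
    def total_disk_gb(self) -> int:
        return self.disk_count * self.disk_gb


# Rows copied from the updated table; partly elided rows are omitted.
CATALOG = [
    InstanceType("d2.2xlarge", "Dev", 8, 60, 6, 2000, "HDD", "High"),
    InstanceType("c3.8xlarge", "Dev/QA", 32, 60, 2, 320, "SSD", "10 Gigabit"),
    InstanceType("r3.8xlarge", "Dev/QA", 32, 244, 2, 320, "SSD", "10 Gigabit"),
    InstanceType("i2.8xlarge", "Prod", 32, 244, 8, 800, "SSD", "10 Gigabit"),
]


def candidates(min_memory_gb: int,
               storage: Optional[str] = None,
               network: Optional[str] = None) -> List[InstanceType]:
    """Return catalog entries that meet the memory, storage, and network filters."""
    return [
        t for t in CATALOG
        if t.memory_gb >= min_memory_gb
        and (storage is None or t.storage == storage)
        and (network is None or t.network == network)
    ]
```

    For example, candidates(200, storage="SSD", network="10 Gigabit") narrows the list to the two 244 GB SSD types, and total_disk_gb makes the capacity difference between them explicit.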

    -

    Configure Networking

    +

    Configure Networking and VPC

    -

    Your HAWQ cluster instances should be in a single VPC and on the same subnet. Instances are always assigned a VPC internal IP address. This internal IP address should be used for HAWQ communication between hosts. You can also use the internal IP address to access an instance from another instance within the HAWQ VPC.

    +

    Place your HAWQ cluster instances in a single VPC and on the same subnet. Instances are always assigned a VPC internal IP address. Use this internal IP address for HAWQ communication between hosts. You can also use the internal IP address to access an instance from another instance within the HAWQ VPC.

    You may choose to locate your Ambari node on a separate subnet in the VPC. Both a public IP address for the instance and an Internet gateway configured for the EC2 VPC are required to access the Ambari instance from an external source and for the instance to access the Internet.

    -

    Ensure your Ambari and HAWQ master instances are each assigned a public IP address for external and internet access. We recommend you also assign an Elastic IP Address to the HAWQ master instance.

    +

    Ensure your Ambari and HAWQ master instances are each assigned a public IP address for external and internet access. Also assign an Elastic IP Address to the HAWQ master instance.

    Configure Security Groups

    A security group is a set of rules that control network traffic to and from your HAWQ instance. One or more rules may be associated with a security group, and one or more security groups may be associated with an instance.

    -

    To configure HAWQ communication between nodes in the HAWQ cluster, include and open the following ports in the appropriate security group for the HAWQ master and segment nodes:

    +

    To configure HAWQ communication between nodes in a HAWQ cluster, include and open appropriate ports in the security group for the HAWQ master and segment nodes. For example, if you have a single VPC, you may create a single security group named hawq-cluster-sg for your cluster, and configure this security group to open the following ports:

    -
    22        ssh - secure connect to other hosts
    +
    Port      Protocol  Application                      Source
    -1        ICMP      ping                             hawq-cluster-sg
    5432      TCP       HAWQ/Postgres                    cidr 0.0.0.0/0
    50700     TCP       HDFS NameNode                    cidr 0.0.0.0/0
    0-65535   TCP       ssh, HAWQ segment communication  hawq-cluster-sg
    0-65535   UDP       HAWQ segment communication       hawq-cluster-sg
    -

    To allow access to/from a source external to the Ambari management node, include and open the following ports in an appropriate security group for your Ambari node:

    +

    This configuration allows communication between the HAWQ segment nodes. It also allows ping and ssh between all instances in the hawq-cluster-sg security group. The HAWQ master and HDFS NameNode ports are open with the above settings.

    + +

    Open and/or restrict these and any additional ports required for your HAWQ deployment environment.
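
    To make the effect of the hawq-cluster-sg rules concrete, the following Python sketch models them as data and checks whether a given connection would match a rule. It is a simplified illustration, not how EC2 evaluates security groups: the Rule class and is_allowed function are hypothetical names, and a source is modeled as either a CIDR block or a security group name.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    protocol: str                     # "ICMP", "TCP", or "UDP"
    ports: Optional[Tuple[int, int]]  # inclusive range; None for ICMP
    source: str                       # CIDR block or security group name


# Inbound rules as listed for the hawq-cluster-sg security group.
RULES = [
    Rule("ICMP", None, "hawq-cluster-sg"),
    Rule("TCP", (5432, 5432), "0.0.0.0/0"),
    Rule("TCP", (50700, 50700), "0.0.0.0/0"),
    Rule("TCP", (0, 65535), "hawq-cluster-sg"),
    Rule("UDP", (0, 65535), "hawq-cluster-sg"),
]


def _matches(rule: Rule, protocol: str, port: Optional[int],
             source_ip: Optional[str], source_group: Optional[str]) -> bool:
    if rule.protocol != protocol:
        return False
    if rule.ports is not None:
        if port is None or not (rule.ports[0] <= port <= rule.ports[1]):
            return False
    if "/" in rule.source:
        # CIDR source: the caller must supply an address inside the block.
        if source_ip is None:
            return False
        return ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule.source)
    # Group source: the caller must be a member of the named group.
    return source_group == rule.source


def is_allowed(protocol: str, port: Optional[int] = None,
               source_ip: Optional[str] = None,
               source_group: Optional[str] = None) -> bool:
    """Return True if any rule in RULES admits the described inbound traffic."""
    return any(_matches(r, protocol, port, source_ip, source_group) for r in RULES)
```

    Under this model an external client reaches only ports 5432 and 50700, while ssh on port 22 is admitted only when the source is itself a member of hawq-cluster-sg.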

    + +

    Include and open the following ports in an appropriate security group for your Ambari node to allow access to/from a source external to this node:

    -
    22    ssh - secure connect to other hosts
    8080  Ambari - HAWQ admin/config web console
    +
    Port  Protocol  Application
    22    TCP       ssh - secure shell to/from other hosts
    8080  TCP       Ambari - HAWQ administration web console
    @@ -1134,19 +1194,21 @@

    A key pair for an EC2 instance consists of a public key that AWS stores, and a private key file that you maintain. Together, they allow you to connect to your instance securely. The private key file name typically has a .pem suffix.

    -

    This example logs into an EC2 instance from an external location with the private key file my-test.pem as user user1. In this example, the instance is configured with the public IP address 192.0.2.0 and the private key file resides in the current directory.

    -
    $ ssh -i my-test.pem user1@192.0.2.0
    +

    The following example logs into an EC2 instance from an external location with the private key file my-test.pem as user user1. In this example, the instance is configured with the public IP address 192.0.2.0 and the private key file resides in the current directory.

    +
    $ ssh -i ./my-test.pem user1@192.0.2.0
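
    OpenSSH refuses to use a private key file that other users can read, so it can help to check the file mode before connecting. The sketch below is an illustration only: key_is_private and ssh_command are hypothetical helper names, and the check assumes a POSIX file system.

```python
import os
import shlex
import stat
from typing import List


def key_is_private(path: str) -> bool:
    """Return True if no group or other permission bits are set on the key file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & 0o077 == 0


def ssh_command(key_path: str, user: str, host: str) -> List[str]:
    """Build the argument list for the ssh invocation shown in the example."""
    return ["ssh", "-i", key_path, f"{user}@{host}"]


if __name__ == "__main__":
    key = "./my-test.pem"
    if os.path.exists(key) and not key_is_private(key):
        # Restrict the key to owner read-only, equivalent to chmod 400.
        os.chmod(key, 0o400)
    print(shlex.join(ssh_command(key, "user1", "192.0.2.0")))
```

    Building the command as a list rather than a string keeps it safe to pass to subprocess.run without a shell.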
     

    Additional HAWQ Considerations

    -

    After launching your HAWQ instance, you will connect to and configure the instance. The Instances page of the EC2 Console lists the running instances and their associated network access information.

    +

    After launching your HAWQ instance, you will connect to and configure the instance. The Instances page of the EC2 Console lists the running instances and their associated network access information.

    -

    Before installing HAWQ, set up the EC2 instances as you would local host server machines. Configure the host operating system, configure host network information (for example, update the /etc/hosts file), set operating system parameters, and install operating system packages. For information about how to prepare your operating system environment for HAWQ, see Apache HAWQ System Requirements and Select HAWQ Host Machines.

    +

    Before installing HAWQ, set up the EC2 instances as you would local host server machines. Configure the host operating system, configure host network information (for example, update the /etc/hosts file), set operating system parameters, and install operating system packages. Apache HAWQ System Requirements and Select HAWQ Host Machines provide the information necessary to prepare your operating system environment for HAWQ.
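
    As one way to prepare the /etc/hosts update mentioned above, the following Python sketch renders host entries from the VPC internal IP address of each instance. The host names and addresses are hypothetical placeholders, and render_hosts is an illustrative helper, not a HAWQ or Ambari tool.

```python
import ipaddress
from typing import Dict, List


def render_hosts(nodes: Dict[str, str]) -> List[str]:
    """Return /etc/hosts lines for a mapping of host name to internal IP.

    Raises ValueError if an address is not in a private range, since
    HAWQ hosts should communicate over VPC internal addresses.
    """
    lines = []
    for name, ip in nodes.items():
        if not ipaddress.ip_address(ip).is_private:
            raise ValueError(f"{name}: {ip} is not an internal address")
        lines.append(f"{ip}\t{name}")
    return lines


if __name__ == "__main__":
    # Hypothetical cluster layout; substitute the values from the EC2 Console.
    cluster = {
        "hawq-master": "10.0.0.10",
        "hawq-seg1": "10.0.0.11",
        "hawq-seg2": "10.0.0.12",
    }
    print("\n".join(render_hosts(cluster)))
```

    The same generated lines would be appended to /etc/hosts on every instance so that all hosts resolve each other identically.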

    Passwordless SSH Configuration

    -

    HAWQ hosts will be configured during the installation process to use passwordless SSH for intra-cluster communications. Temporary password-based authentication must be enabled on each HAWQ host in preparation for this configuration. Password authentication is typically disabled by default in cloud images. Update the cloud configuration in /etc/cloud/cloud.cfg to enable password authentication in your AMI(s). Set ssh_pwauth: True in this file. If desired, disable password authentication after HAWQ installation by setting the property back to False.

    +

    HAWQ hosts are configured during the installation process to use passwordless SSH for intra-cluster communications. Temporary password-based authentication must be enabled on each HAWQ host in preparation for this configuration.

    + +

    Password authentication is typically disabled by default in cloud images. Update the cloud configuration in /etc/cloud/cloud.cfg to enable password authentication in your AMI(s). Set ssh_pwauth: True in this configuration file. If desired, set the property back to False after HAWQ installation to disable password authentication.
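
    The ssh_pwauth toggle can be scripted so that it is easy to enable before installation and revert afterwards. The sketch below operates on the text of the configuration file only; set_ssh_pwauth is a hypothetical helper, and it assumes the setting appears as a top-level "ssh_pwauth:" line, as in typical cloud.cfg files.

```python
import re

_PWAUTH = re.compile(r"^([ \t]*)ssh_pwauth[ \t]*:.*$", re.MULTILINE)


def set_ssh_pwauth(cfg_text: str, enabled: bool) -> str:
    """Return cfg_text with ssh_pwauth set to True or False.

    An existing ssh_pwauth line is rewritten in place; if none exists,
    the setting is appended as a new line.
    """
    value = "True" if enabled else "False"
    if _PWAUTH.search(cfg_text):
        return _PWAUTH.sub(lambda m: f"{m.group(1)}ssh_pwauth: {value}", cfg_text)
    sep = "" if (not cfg_text or cfg_text.endswith("\n")) else "\n"
    return f"{cfg_text}{sep}ssh_pwauth: {value}\n"


if __name__ == "__main__":
    path = "/etc/cloud/cloud.cfg"
    with open(path) as f:
        original = f.read()
    # Print the proposed content rather than writing it; review before applying.
    print(set_ssh_pwauth(original, True))
```

    Calling set_ssh_pwauth(text, False) after HAWQ installation produces the reverted configuration described above.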

    References

http://git-wip-us.apache.org/repos/asf/incubator-hawq-site/blob/cce3ea48/docs/userguide/2.1.0.0-incubating/install/select-hosts.html
----------------------------------------------------------------------
diff --git a/docs/userguide/2.1.0.0-incubating/install/select-hosts.html b/docs/userguide/2.1.0.0-incubating/install/select-hosts.html
index 91d6fdf..317f7fd 100644
--- a/docs/userguide/2.1.0.0-incubating/install/select-hosts.html
+++ b/docs/userguide/2.1.0.0-incubating/install/select-hosts.html
@@ -129,11 +129,13 @@
  • Introducing the HAWQ Operating Environment
  • -
  • +
  • Managing HAWQ Using Ambari -
  • -
  • - Using the Ambari REST API +
  • Starting and Stopping HAWQ
@@ -413,6 +415,7 @@
  • Accessing Hive Data
  • Accessing HBase Data
  • Accessing JSON Data
  • +
  • Writing Data to HDFS
  • Using Profiles to Read and Write Data
  • PXF External Tables and API
  • Troubleshooting PXF
  •
@@ -740,6 +743,7 @@