Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id C134C200BCC for ; Tue, 15 Nov 2016 01:17:33 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id BFBD5160B0D; Tue, 15 Nov 2016 00:17:33 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 14615160B06 for ; Tue, 15 Nov 2016 01:17:32 +0100 (CET) Received: (qmail 31905 invoked by uid 500); 15 Nov 2016 00:17:32 -0000 Mailing-List: contact dev-help@hawq.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@hawq.incubator.apache.org Delivered-To: mailing list dev@hawq.incubator.apache.org Received: (qmail 31894 invoked by uid 99); 15 Nov 2016 00:17:32 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 15 Nov 2016 00:17:32 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 6EA40C028B for ; Tue, 15 Nov 2016 00:17:31 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -7.019 X-Spam-Level: X-Spam-Status: No, score=-7.019 tagged_above=-999 required=6.31 tests=[KAM_LAZY_DOMAIN_SECURITY=1, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RP_MATCHES_RCVD=-2.999] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id nxqDyycE-y4T for ; Tue, 15 Nov 2016 00:17:30 +0000 (UTC) Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with SMTP id A96585FAC4 for ; Tue, 15 Nov 2016 00:17:29 +0000 (UTC) Received: (qmail 31726 invoked by uid 99); 15 Nov 2016 00:17:15 -0000 Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 15 Nov 2016 00:17:15 +0000 Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33) id 2AD5AE09B6; Tue, 15 Nov 2016 00:17:15 +0000 (UTC) From: dyozie To: dev@hawq.incubator.apache.org Reply-To: dev@hawq.incubator.apache.org References: In-Reply-To: Subject: [GitHub] incubator-hawq-docs pull request #60: HAWQ-1151 - add ambari procedures Content-Type: text/plain Message-Id: <20161115001715.2AD5AE09B6@git1-us-west.apache.org> Date: Tue, 15 Nov 2016 00:17:15 +0000 (UTC) archived-at: Tue, 15 Nov 2016 00:17:33 -0000 Github user dyozie commented on a diff in the pull request: https://github.com/apache/incubator-hawq-docs/pull/60#discussion_r87922081 --- Diff: ddl/ddl-table.html.md.erb --- @@ -93,14 +93,14 @@ For any specific query, the first four factors are fixed values, while the confi The `bucketnum` for a hash table specifies the number of hash buckets to be used in creating virtual segments. A HASH distributed table is created with `default_hash_table_bucket_number` buckets. The default bucket value can be changed in session level or in the `CREATE TABLE` DDL by using the `bucketnum` storage parameter. -When initializing a cluster, you can use the `hawq init --bucket_number` parameter to explcitly set the default bucket number \(`default_hash_table_bucket_number`\). +In an Ambari-managed HAWQ cluster, the default bucket number \(`default_hash_table_bucket_number`\) is derived from the number of segment nodes. In command-line-managed HAWQ environments, you can use the `--bucket_number` option of `hawq init` to explicitly set `default_hash_table_bucket_number` during cluster initialization. -**Note:** For best performance with large tables, the number of buckets should not exceed the value of the `default_hash_table_bucket_number` parameter. Small tables can use one segment node, `with bucketnum=1`. For larger tables, the bucketnum is set to a multiple of the number of segment nodes, for the best load balancing on different segment nodes. The elastic runtime will attempt to find the optimal number of buckets for the number of nodes being processed. Larger tables need more virtual segments , and hence use larger numbers of buckets. +**Note:** For best performance with large tables, the number of buckets should not exceed the value of the `default_hash_table_bucket_number` parameter. Small tables can use one segment node, `WITH bucketnum=1`. For larger tables, the `bucketnum` is set to a multiple of the number of segment nodes, for the best load balancing on different segment nodes. The elastic runtime will attempt to find the optimal number of buckets for the number of nodes being processed. Larger tables need more virtual segments , and hence use larger numbers of buckets. --- End diff -- Might as well fix (remove) the space before the comma here. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastructure@apache.org or file a JIRA ticket with INFRA. ---