Subject: Re: Question on hawq_rm_nvseg_perquery_limit
From: Vineet Goel
To: dev@hawq.incubator.apache.org
Date: Wed, 13 Jul 2016 00:16:39 -0700

This leads me to another question on Apache Ambari UI integration.
It seems the need to tune hawq_rm_nvseg_perquery_limit is minimal, as we
seem to prescribe a limit of 512 regardless of cluster size. If that's the
case, two options come to mind:

1) Either the "default" hawq_rm_nvseg_perquery_limit should be the lower of
(6 * segment host count) and 512. That way it is less confusing to users,
and there is clear logic behind the value.

2) Or, the parameter should not be exposed on the UI at all, leaving the
default at 512. When/why would a user want to change this value?
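To make option 1 concrete, here is a minimal sketch of the computation I
have in mind (Python pseudocode; the function name is illustrative, and the
factor of 6 is just the hawq_rm_nvseg_perquery_perseg_limit default):

    # Illustrative sketch only, not actual Ambari/HAWQ code.
    PERQUERY_LIMIT_CAP = 512  # current hawq_rm_nvseg_perquery_limit default
    PERSEG_LIMIT_DEFAULT = 6  # hawq_rm_nvseg_perquery_perseg_limit default

    def proposed_perquery_limit(segment_host_count):
        """Option 1: default to the lower of (6 * host count) and 512."""
        return min(PERSEG_LIMIT_DEFAULT * segment_host_count, PERQUERY_LIMIT_CAP)

    print(proposed_perquery_limit(10))   # 60 on a 10-node cluster
    print(proposed_perquery_limit(100))  # 512 once 6 * hosts exceeds the cap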
Thoughts?

Vineet

On Tue, Jul 12, 2016 at 11:51 PM, Hubert Zhang wrote:

> +1 to Yi's answer.
> Vseg numbers are controlled by the Resource Negotiator (a module that runs
> before the planner). All vseg-related GUCs affect the behaviour of the RN,
> and some of them also affect the Resource Manager.
> To be specific, hawq_rm_nvseg_perquery_limit and
> hawq_rm_nvseg_perquery_perseg_limit are both considered by the Resource
> Negotiator (RN) and the Resource Manager (RM), while
> default_hash_table_bucket_number is only considered by the RN.
> As a result, suppose default_hash_table_bucket_number = 60: a query like
> "select * from hash_table" will request 60 vsegs in the RN, and if
> hawq_rm_nvseg_perquery_limit is less than 60, the RM will not be able to
> allocate them.
>
> So we need to ensure default_hash_table_bucket_number is less than the
> capacity of the RM.
>
> On Wed, Jul 13, 2016 at 1:40 PM, Yi Jin wrote:
>
> > Hi Vineet,
> >
> > A few comments from me.
> >
> > For question 1:
> > Yes. perquery_limit is introduced mainly to restrict resource usage in
> > large-scale clusters; perquery_perseg_limit is there to avoid allocating
> > too many processes in one segment, which may cause serious performance
> > issues. So the two GUCs address different performance aspects. As the
> > cluster scale varies, only one of the two limits actually takes effect;
> > we don't need both to be active for resource allocation.
> >
> > For question 2:
> >
> > In fact, perquery_perseg_limit is a general resource restriction for all
> > queries, not only hash-table and external-table queries; this is why the
> > GUC is not merged with the other one. For example, when we run queries
> > on randomly distributed tables, it does not make sense for the resource
> > manager to consult a hash-table GUC.
> >
> > For the last topic item:
> >
> > In my opinion, it is not necessary to adjust
> > hawq_rm_nvseg_perquery_limit; we can leave it unchanged (and effectively
> > inactive) until we really want to run a large-scale HAWQ cluster of,
> > say, 100+ nodes.
> >
> > Best,
> > Yi
> >
> > On Wed, Jul 13, 2016 at 1:18 PM, Vineet Goel wrote:
> >
> > > Hi all,
> > >
> > > I'm trying to document some GUC usage in detail and have questions on
> > > hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit
> > > tuning.
> > >
> > > *hawq_rm_nvseg_perquery_limit* (default value = 512). Let's call it
> > > *perquery_limit* for short.
> > > *hawq_rm_nvseg_perquery_perseg_limit* (default value = 6). Let's call
> > > it *perquery_perseg_limit* for short.
> > >
> > >
> > > 1) Is there ever any benefit in having perquery_limit *greater than*
> > > (perquery_perseg_limit * segment host count)?
> > > For example, in a 10-node cluster HAWQ will never allocate more than
> > > (GUC default 6 * 10 =) 60 v-segs, so the perquery_limit default of 512
> > > doesn't have any effect. It seems perquery_limit overrides (takes
> > > effect over) perquery_perseg_limit only when its value is less than
> > > (perquery_perseg_limit * segment host count).
> > >
> > > Is that the correct assumption? That would make sense, as users may
> > > want to keep a check on how much processing a single query can take up
> > > (which implies that the limit must be lower than the total possible
> > > v-segs). Or it may make sense in large clusters (100 nodes or more)
> > > where we need to limit the pressure on HDFS.
> > >
> > >
> > > 2) Now, if the purpose of hawq_rm_nvseg_perquery_limit is to keep a
> > > check on single-query resource usage (by limiting the number of
> > > v-segs), doesn't it interact with default_hash_table_bucket_number,
> > > given that queries will fail when *default_hash_table_bucket_number*
> > > is greater than hawq_rm_nvseg_perquery_limit? In that case, the
> > > purpose of hawq_rm_nvseg_perquery_limit conflicts with the ability to
> > > run queries on HASH-distributed tables. That would mean tuning
> > > hawq_rm_nvseg_perquery_limit down is not a good idea, which seems to
> > > conflict with the purpose of the GUC (in relation to the other GUC).
> > >
> > >
> > > Perhaps someone can provide examples of *how and when you would tune
> > > hawq_rm_nvseg_perquery_limit* in this 10-node example:
> > >
> > > *Defaults on a 10-node cluster are:*
> > > a) *hawq_rm_nvseg_perquery_perseg_limit* = 6 (hence the ability to
> > > spin up 6 * 10 = 60 total v-segs for random tables)
> > > b) *hawq_rm_nvseg_perquery_limit* = 512 (but HAWQ will never dispatch
> > > more than 60 v-segs on a random table, so the value of 512 does not
> > > seem practical)
> > > c) *default_hash_table_bucket_number* = 60 (6 * 10)
> > >
> > >
> > > Thanks
> > > Vineet
>
>
> --
> Thanks
>
> Hubert Zhang
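The allocation rules discussed in this thread boil down to the following
small sketch (Python; the function names are illustrative, and the
behaviour follows Hubert's and Yi's descriptions above rather than the
HAWQ source):

    # Summary sketch of the thread's allocation rules; not HAWQ source code.

    def effective_vseg_cap(perquery_limit, perquery_perseg_limit,
                           segment_host_count):
        # Whichever limit is smaller is the one that actually takes effect.
        return min(perquery_limit, perquery_perseg_limit * segment_host_count)

    def hash_query_can_be_allocated(default_hash_table_bucket_number,
                                    perquery_limit):
        # The RN requests exactly default_hash_table_bucket_number vsegs for
        # a hash-distributed table; the RM cannot grant more than
        # perquery_limit of them.
        return default_hash_table_bucket_number <= perquery_limit

    # 10-node cluster with defaults: the 512 cap never binds, 6 * 10 = 60 does.
    assert effective_vseg_cap(512, 6, 10) == 60
    # default_hash_table_bucket_number = 60 fits under perquery_limit = 512,
    # but tuning perquery_limit below 60 would make hash-table queries fail.
    assert hash_query_can_be_allocated(60, 512)
    assert not hash_query_can_be_allocated(60, 48)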