hawq-dev mailing list archives

From Jiali Yao <j...@pivotal.io>
Subject Re: Question on hawq_rm_nvseg_perquery_limit
Date Wed, 13 Jul 2016 06:43:40 GMT
+1 for the detailed explanation.
One more point: normally we do not suggest setting
default_hash_table_bucket_number greater than hawq_rm_nvseg_perquery_limit
(512). When initializing a large cluster, default_hash_table_bucket_number is
adjusted accordingly: if default_hash_table_bucket_number >
hawq_rm_nvseg_perquery_limit, it is adjusted down to
(hawq_rm_nvseg_perquery_limit / hostnumber) * hostnumber.
If the cluster is later expanded, the value should also be reset appropriately.
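
A minimal sketch of that adjustment, just to illustrate the arithmetic
(illustrative Python with made-up names, not the actual init code):

def adjusted_bucket_number(bucket_number, perquery_limit, host_number):
    # If default_hash_table_bucket_number exceeds hawq_rm_nvseg_perquery_limit,
    # round it down to the largest multiple of the host count within the limit.
    if bucket_number > perquery_limit:
        return (perquery_limit // host_number) * host_number
    return bucket_number

# Example: 100 hosts, bucket number 600, limit 512 -> (512 // 100) * 100 = 500
print(adjusted_bucket_number(600, 512, 100))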

Jiali


On Wed, Jul 13, 2016 at 1:40 PM, Yi Jin <yjin@pivotal.io> wrote:

> Hi Vineet,
>
> Here are my comments.
>
> For question 1:
> Yes. perquery_limit is introduced mainly to restrict resource usage in a
> large-scale cluster; perquery_perseg_limit is there to avoid allocating too
> many processes in one segment, which can cause serious performance problems.
> So the two GUCs address different performance aspects. Depending on the
> cluster scale, only one of the two limits actually takes effect; we do not
> have to keep both active for resource allocation.
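>
> A minimal sketch of how I read that interaction (illustrative Python with
> made-up names, not the resource manager's actual code):
>
> def effective_vseg_cap(perquery_limit, perquery_perseg_limit, num_hosts):
>     # Whichever limit is smaller for the given cluster size is the one
>     # that actually binds a query's virtual segment count.
>     return min(perquery_limit, perquery_perseg_limit * num_hosts)
>
> # 10-node cluster with defaults:  min(512, 6 * 10)  -> 60
> # 100-node cluster with defaults: min(512, 6 * 100) -> 512
> print(effective_vseg_cap(512, 6, 10), effective_vseg_cap(512, 6, 100))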
>
> For question 2:
>
> In fact, perquery_perseg_limit is a general resource restriction for all
> queries, not only hash table queries and external table queries; this is
> why the GUC is not merged with the other one. For example, when we run
> queries on randomly distributed tables, it does not make sense for the
> resource manager to consult a GUC meant for hash tables.
>
> For the last topic item:
>
> In my opinion, it is not necessary to adjust hawq_rm_nvseg_perquery_limit.
> We can simply leave it unchanged, where it is effectively inactive, until we
> really want to run a large-scale HAWQ cluster, for example, 100+ nodes.
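>
> As a rough illustration with the defaults: on a 10-node cluster the
> per-segment limit already caps a query at 6 * 10 = 60 v-segs, far below
> 512, so perquery_limit never binds; around 100 nodes, 6 * 100 = 600
> exceeds 512, and perquery_limit becomes the limit that actually takes
> effect.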
>
> Best,
> Yi
>
> On Wed, Jul 13, 2016 at 1:18 PM, Vineet Goel <vvineet@apache.org> wrote:
>
> > Hi all,
> >
> > I’m trying to document some GUC usage in detail and have questions on
> > hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit
> > tuning.
> >
> > *hawq_rm_nvseg_perquery_limit* (default value = 512). Let’s call it
> > *perquery_limit* for short.
> > *hawq_rm_nvseg_perquery_perseg_limit* (default value = 6). Let’s call it
> > *perquery_perseg_limit* for short.
> >
> >
> > 1) Is there ever any benefit in having perquery_limit *greater than*
> > (perquery_perseg_limit * segment host count)?
> > For example, in a 10-node cluster HAWQ will never allocate more than (GUC
> > default 6 * 10 =) 60 v-segs, so the perquery_limit default of 512 doesn’t
> > have any effect. It seems perquery_limit overrides (takes effect over)
> > perquery_perseg_limit only when its value is less than
> > (perquery_perseg_limit * segment host count).
> >
> > Is that the correct assumption? That would make sense, as users may want
> > to keep a check on how much processing a single query can take up (which
> > implies that the limit must be lower than the total possible v-segs). Or,
> > it may make sense in large clusters (100 nodes or more) where we need to
> > limit the pressure on HDFS.
> >
> >
> > 2) Now, if the purpose of hawq_rm_nvseg_perquery_limit is to keep a check
> > on single-query resource usage (by limiting the # of v-segs), doesn’t it
> > affect default_hash_table_bucket_number, given that queries will fail when
> > *default_hash_table_bucket_number* is greater than
> > hawq_rm_nvseg_perquery_limit? In that case, the purpose of
> > hawq_rm_nvseg_perquery_limit conflicts with the ability to run queries on
> > HASH-distributed tables. This then means that tuning
> > hawq_rm_nvseg_perquery_limit down is not a good idea, which seems to
> > conflict with the purpose of the GUC (in relation to the other GUC).
> >
> >
> > Perhaps someone can provide some examples of *how and when you would
> > tune hawq_rm_nvseg_perquery_limit* in this 10-node example:
> >
> > *Defaults on a 10-node cluster are:*
> > a) *hawq_rm_nvseg_perquery_perseg_limit* = 6 (hence the ability to spin up
> > 6 * 10 = 60 total v-segs for random tables)
> > b) *hawq_rm_nvseg_perquery_limit* = 512 (but HAWQ will never dispatch more
> > than 60 v-segs on random tables, so a value of 512 does not seem practical)
> > c) *default_hash_table_bucket_number* = 60 (6 * 10)
> >
> >
> >
> > Thanks
> > Vineet
> >
>
