From Vineet Goel <vvin...@apache.org>
Subject Question on hawq_rm_nvseg_perquery_limit
Date Wed, 13 Jul 2016 03:18:14 GMT
Hi all,

I’m trying to document some GUC usage in detail and have questions on
hawq_rm_nvseg_perquery_limit and hawq_rm_nvseg_perquery_perseg_limit tuning.

*hawq_rm_nvseg_perquery_limit* (default value = 512). Let’s call it
*perquery_limit* for short.
*hawq_rm_nvseg_perquery_perseg_limit* (default value = 6). Let’s call it
*perquery_perseg_limit* for short.

1) Is there ever any benefit in having perquery_limit *greater than*
(perquery_perseg_limit * segment host count)?
For example, in a 10-node cluster, HAWQ will never allocate more than (GUC
default 6 * 10 =) 60 v-segs, so the perquery_limit default of 512 doesn’t
have any effect. It seems perquery_limit takes effect (overriding
perquery_perseg_limit) only when its value is less than
(perquery_perseg_limit * segment host count).

Is that the correct assumption? That would make sense, as users may want to
keep a check on how much processing a single query can take up (that
implies that the limit must be lower than the total possible v-segs). Or,
it may make sense in large clusters (100 nodes or more) where we need to
limit the pressure on HDFS.
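
To make the assumption in question 1 concrete, here is a minimal sketch of the cap as I currently understand it. The function name and the min() rule are only my illustration of that assumption; they are not taken from the HAWQ resource manager code, and the real allocation logic may differ.

```python
def effective_vseg_cap(perquery_limit: int,
                       perquery_perseg_limit: int,
                       segment_hosts: int) -> int:
    """Assumed upper bound on v-segs for one query on a random table.

    Assumption (not verified against HAWQ source): the tighter of the
    cluster-wide limit and the per-segment limit times host count wins.
    """
    return min(perquery_limit, perquery_perseg_limit * segment_hosts)


# Defaults on a 10-node cluster: the per-segment limit is the binding one.
print(effective_vseg_cap(512, 6, 10))

# Lowering perquery_limit below 6 * 10 makes it the binding limit instead.
print(effective_vseg_cap(40, 6, 10))
```

If this model is right, perquery_limit only matters once it drops below perquery_perseg_limit * segment host count.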

2) Now, if the purpose of hawq_rm_nvseg_perquery_limit is to keep a check
on single-query resource usage (by limiting the # of v-segs), doesn’t it
affect default_hash_table_bucket_number, because queries will fail when
*default_hash_table_bucket_number* is greater than
hawq_rm_nvseg_perquery_limit? In that case, the purpose of
hawq_rm_nvseg_perquery_limit conflicts with the ability to run queries on
hash-distributed tables. This then means that tuning
hawq_rm_nvseg_perquery_limit down is not a good idea, which seems to
conflict with the purpose of the GUC (in relation to other GUCs).
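
The conflict in question 2 can be sketched the same way. This assumes, as stated above, that a query on a hash-distributed table fails when the bucket number exceeds perquery_limit; the helper name is hypothetical and this is not HAWQ code.

```python
def hash_query_can_run(bucket_number: int, perquery_limit: int) -> bool:
    """Assumed rule: a hash-table query needs one v-seg per bucket,
    so it can only run if the bucket number fits under perquery_limit."""
    return bucket_number <= perquery_limit


# default_hash_table_bucket_number = 60 on the 10-node example cluster.
for limit in (512, 60, 40):
    print(limit, hash_query_can_run(60, limit))
```

Under this assumption, any perquery_limit below 60 would break queries on existing hash tables in the 10-node example, which is the conflict I am asking about.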

Perhaps someone can provide some examples of *how and when you would
tune hawq_rm_nvseg_perquery_limit* in this 10-node example:

*Defaults on a 10-node cluster are:*
a) *hawq_rm_nvseg_perquery_perseg_limit* = 6 (hence ability to spin up 6 *
10 = 60 total v-segs for random tables)
b) *hawq_rm_nvseg_perquery_limit* = 512 (but HAWQ will never dispatch more
than 60 v-segs on a random table, so a value of 512 does not seem practical)
c) *default_hash_table_bucket_number* = 60 (6 * 10)


