hawq-dev mailing list archives

From Jon Roberts <jrobe...@pivotal.io>
Subject Re: Random tables loaded from files
Date Fri, 02 Dec 2016 14:30:33 GMT
I'm happy to report that I made an error!  I thought I had set my scripts
to use random distribution for the tpcds schema but I used hash.  That is
why it wasn't behaving as I had expected.
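
For anyone hitting the same thing, the difference is just the distribution
clause in the DDL; roughly like this (columns trimmed and table names
illustrative, so treat it as a sketch):

create table tpcds.date_dim_hash (d_date_sk int, d_date date)
  distributed by (d_date_sk);   -- hash: bucket count fixed when the table is created
create table tpcds.date_dim_rand (d_date_sk int, d_date date)
  distributed randomly;         -- random: vsegs chosen per query based on table size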

Sorry for the noise.

Jon Roberts
Principal Engineer | jroberts@pivotal.io | 615-426-8661

On Thu, Dec 1, 2016 at 8:53 PM, Hubert Zhang <hzhang@pivotal.io> wrote:

> Hi Jon,
> 1 Your problem: the vseg number for your two queries ("explain analyze
> select count(*) from tpcds.date_dim;" and "explain analyze select count(*)
> from tpcds_opt.date_dim;") depends on the distribution type and the size of
> your table.
> Could you give us more details about your tables by running "select * from
> gp_distribution_policy where localoid=idofYourTable;"?
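>
> For example, using the two date_dim tables from your mail as the table
> names, something like this should work (a regclass cast is just one way to
> supply the oid; untested sketch):
>
> select localoid::regclass as tablename, *
>   from gp_distribution_policy
>  where localoid in ('tpcds.date_dim'::regclass, 'tpcds_opt.date_dim'::regclass);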
>
> 2 Questions about cluster expansion.
>     For newly created hash tables, the bucket number depends on the current
> value of the default_hash_table_bucket_number GUC, which we recommend
> changing after cluster expansion, so there is no side effect on newly
> created hash tables.
>     For old hash tables, the bucket number is the old value of
> default_hash_table_bucket_number. After cluster expansion, for example from
> 16 nodes to 64 nodes, the old value of default_hash_table_bucket_number is
> 16*6=96 while the new value is 64*6=384. A query such as "select count(*)
> from old_hash_table" will only use 96 vsegs instead of 384 vsegs. One fix is
> to reload the old hash table; the other is to use random tables from the
> beginning. (This is why random is the default in HAWQ 2.x, whereas hash was
> the default in HAWQ 1.x.)
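>
> For example, a reload after expansion can be as simple as a CTAS (the table
> and column names are placeholders, and the bucketnum value just reuses the
> 64-node example above, so treat this as an untested sketch):
>
> create table reloaded_hash_table with (bucketnum=384) as
>   select * from old_hash_table distributed by (some_key);
> -- or switch to random distribution so later expansions need no reload:
> create table reloaded_random_table as
>   select * from old_hash_table distributed randomly;
> -- then drop the old table and rename the new one into place.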
>
>     For random table, the vseg number is determined by the table size. For
> small tables(such as data size is less than one hdfs block size), only one
> vseg will be used, and for big tables, the upper bound of vsegs number is
> hawq_rm_nvseg_perquery_perseg_limit * #segment.
> Cluster expanding will increase the number of segment(for example 16 nodes
> to 64 nodes, and then queries on a random table will use 4X times number of
> vsegs).
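>
> To see the two values that bound this on a given cluster, both settings can
> be checked from psql with plain SHOW commands:
>
> show default_hash_table_bucket_number;
> show hawq_rm_nvseg_perquery_perseg_limit;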
>
> Thanks
> Hubert
>
>
> On Thu, Dec 1, 2016 at 5:16 AM, Jon Roberts <jroberts@pivotal.io> wrote:
>
> > I have a cluster with TPC-DS data and all data is loaded from external
> > tables.  hawq_rm_nvseg_perquery_perseg_limit was set to the default of 6
> > when the data was loaded and I have 10 nodes so the bucketnum = 60 for all
> > tables.  All tables are also randomly distributed.
> >
> > In my efforts to optimize performance, I've tried
> > increasing hawq_rm_nvseg_perquery_perseg_limit to utilize more resources
> > to execute the queries.  Unfortunately, this hasn't helped any.
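> >
> > (For concreteness, one way to do that is at the session level, assuming
> > the GUC is settable there:
> >
> > set hawq_rm_nvseg_perquery_perseg_limit = 12;  -- example value; the default is 6
> >
> > and then re-running the queries.)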
> >
> > I then noticed this:
> >
> > *"Random tables are copied from files: #vseg is a fixed value. #vseg is
> 6,
> > when there are sufficient resources."*
> > http://hdb.docs.pivotal.io/201/hawq/query/query-performance.html#topic38
> >
> > It appears that tables loaded from external tables have a fixed number of
> > vsegs but tables loaded from other internal tables have a dynamic number.
> > For example:
> >
> > --table loaded from an external table
> > gpadmin=# explain analyze select count(*) from tpcds.date_dim;
> >                                              QUERY PLAN
> > ----------------------------------------------------------------------------------------------------
> >  Aggregate  (cost=0.00..431.09 rows=1 width=8)
> >    Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 16277/16277 ms to end, start offset by 3255/3255 ms.
> >    ->  Gather Motion 60:1  (slice1; segments: 60)  (cost=0.00..431.09 rows=1 width=8)
> >          Rows out:  Avg 60.0 rows x 1 workers at destination.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 60/60 rows with 39/39 ms to first row, 16277/16277 ms to end, start offset by 3255/3255 ms.
> >          ->  Aggregate  (cost=0.00..431.09 rows=1 width=8)
> >                Rows out:  Avg 1.0 rows x 60 workers.  Max/Last(seg59:ip-172-21-4-235.ec2.internal/seg11:ip-172-21-4-226.ec2.internal) 1/1 rows with 107/16274 ms to end, start offset by 3257/3257 ms.
> >                ->  Table Scan on date_dim  (cost=0.00..431.09 rows=1218 width=1)
> >                      Rows out:  Avg 1217.5 rows x 60 workers.  Max/Last(seg51:ip-172-21-4-234.ec2.internal/seg11:ip-172-21-4-226.ec2.internal) 1242/1232 rows with 5035/16273 ms to end, start offset by 3258/3257 ms.
> >  Slice statistics:
> >    (slice0)    Executor memory: 412K bytes.
> >    (slice1)    Executor memory: 215K bytes avg x 60 workers, 215K bytes max (seg59:ip-172-21-4-235.ec2.internal).
> >  Statement statistics:
> >    Memory used: 262144K bytes
> >  Settings:  default_hash_table_bucket_number=60; optimizer=on
> >  Optimizer status: PQO version 1.638
> >  Dispatcher statistics:
> >    executors used(total/cached/new connection): (60/1/59); dispatcher time(total/connection/dispatch data): (3254.181 ms/1480539781639.241 ms/0.483 ms).
> >    dispatch data time(max/min/avg): (0.021 ms/0.005 ms/0.007 ms); consume executor data time(max/min/avg): (0.044 ms/0.003 ms/0.009 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
> >  Data locality statistics:
> >    data locality ratio: 1.000; virtual segment number: 60; different host number: 10; virtual segment number per host(avg/min/max): (6/6/6); segment size(avg/min/max): (186145.950 B/182662 B/189757 B); segment size with penalty(avg/min/max): (0.000 B/0 B/0 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.263 ms; resource allocation: 0.521 ms; datalocality calculation: 0.114 ms.
> >  Total runtime: 19549.798 ms
> > (21 rows)
> >
> > Time: 19651.159 ms
> >
> > This is just a small dimension table too.
> >
> > I then loaded that data from the local table to a new random table.
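> >
> > (For reference, a load like that can be done with a CTAS into a randomly
> > distributed table, roughly along these lines; storage options are omitted,
> > so treat it as a sketch rather than the exact script:
> >
> > create table tpcds_opt.date_dim as
> >   select * from tpcds.date_dim distributed randomly;
> > )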
> >
> > gpadmin=# explain analyze select count(*) from tpcds_opt.date_dim;
> >                                              QUERY PLAN
> > ----------------------------------------------------------------------------------------------------
> >  Aggregate  (cost=0.00..436.36 rows=1 width=8)
> >    Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 2624/2624 ms to end, start offset by 1.081/1.081 ms.
> >    ->  Gather Motion 1:1  (slice1; segments: 1)  (cost=0.00..436.36 rows=1 width=8)
> >          Rows out:  Avg 1.0 rows x 1 workers at destination.  Max/Last(seg-1:ip-172-21-4-229.ec2.internal/seg-1:ip-172-21-4-229.ec2.internal) 1/1 rows with 2624/2624 ms to end, start offset by 1.082/1.082 ms.
> >          ->  Aggregate  (cost=0.00..436.36 rows=1 width=8)
> >                Rows out:  Avg 1.0 rows x 1 workers.  Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal) 1/1 rows with 2621/2621 ms to end, start offset by 0.846/0.846 ms.
> >                ->  Table Scan on date_dim  (cost=0.00..436.22 rows=73049 width=1)
> >                      Rows out:  Avg 73049.0 rows x 1 workers.  Max/Last(seg0:ip-172-21-4-225.ec2.internal/seg0:ip-172-21-4-225.ec2.internal) 73049/73049 rows with 2.966/2.966 ms to first row, 2595/2595 ms to end, start offset by 0.847/0.847 ms.
> >  Slice statistics:
> >    (slice0)    Executor memory: 170K bytes.
> >    (slice1)    Executor memory: 343K bytes (seg0:ip-172-21-4-225.ec2.internal).
> >  Statement statistics:
> >    Memory used: 262144K bytes
> >  Settings:  default_hash_table_bucket_number=60; optimizer=on
> >  Optimizer status: PQO version 1.638
> >  Dispatcher statistics:
> >    executors used(total/cached/new connection): (1/1/0); dispatcher time(total/connection/dispatch data): (0.122 ms/0.000 ms/0.015 ms).
> >    dispatch data time(max/min/avg): (0.015 ms/0.015 ms/0.015 ms); consume executor data time(max/min/avg): (0.011 ms/0.011 ms/0.011 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
> >  Data locality statistics:
> >    data locality ratio: 0.296; virtual segment number: 1; different host number: 1; virtual segment number per host(avg/min/max): (1/1/1); segment size(avg/min/max): (11168757.000 B/11168757 B/11168757 B); segment size with penalty(avg/min/max): (11247341.000 B/11247341 B/11247341 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.254 ms; resource allocation: 0.387 ms; datalocality calculation: 0.130 ms.
> >  Total runtime: 2627.711 ms
> > (21 rows)
> >
> > Time: 2728.409 ms
> >
> > I'm able to decrease the query execution time of many queries by
> > increasing hawq_rm_nvseg_perquery_perseg_limit but only for the tables
> > loaded from other local tables and not tables loaded from external tables.
> > Based on the documentation, this appears to be the expected behavior.
> >
> > - Are there plans to correct this?
> > - What happens if the cluster expands?  Will these random tables need to
> > be redistributed?
> > - Are there workarounds to this issue?
> >
> > Jon Roberts
> >
>
>
>
> --
> Thanks
>
> Hubert Zhang
>
