hawq-dev mailing list archives

From "Lei Chang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HAWQ-1255) Looks "segment size with penalty" number in "explain analyze" not correct
Date Thu, 05 Jan 2017 01:35:58 GMT
Lei Chang created HAWQ-1255:
-------------------------------

             Summary: Looks "segment size with penalty" number in "explain analyze" not correct
                 Key: HAWQ-1255
                 URL: https://issues.apache.org/jira/browse/HAWQ-1255
             Project: Apache HAWQ
          Issue Type: Bug
          Components: Query Execution
            Reporter: Lei Chang
            Assignee: Lei Chang



"segment size" is about 590 MB (593580028 B on average), while "segment size with penalty" is
about 100 MB (103526336 B on average). This does not look reasonable.

How to reproduce:
On a laptop with 1 GB of TPC-H data, the lineitem table is created as hash distributed with
2 buckets, and the orders table is randomly distributed.


```
postgres=# explain analyze SELECT l_orderkey, count(l_quantity)
           FROM lineitem_b2, orders
           WHERE l_orderkey = o_orderkey
           GROUP BY l_orderkey;
                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Gather Motion 2:1  (slice2; segments: 2)  (cost=291580.96..318527.67 rows=1230576 width=16)
   Rows out:  Avg 1500000.0 rows x 1 workers at destination.  Max/Last(seg-1:changlei.local/seg-1:changlei.local) 1500000/1500000 rows with 2209/2209 ms to first row, 2577/2577 ms to end, start offset by 1.429/1.429 ms.
   ->  HashAggregate  (cost=291580.96..318527.67 rows=615288 width=16)
         Group By: lineitem_b2.l_orderkey
         Rows out:  Avg 750000.0 rows x 2 workers.  Max/Last(seg1:changlei.local/seg1:changlei.local) 750000/750000 rows with 2243/2243 ms to first row, 2498/2498 ms to end, start offset by 2.615/2.615 ms.
         Executor memory:  56282K bytes avg, 56282K bytes max (seg1:changlei.local).
         ->  Hash Join  (cost=70069.00..250010.38 rows=3000608 width=15)
               Hash Cond: lineitem_b2.l_orderkey = orders.o_orderkey
               Rows out:  Avg 3000607.5 rows x 2 workers.  Max/Last(seg0:changlei.local/seg1:changlei.local) 3001300/2999915 rows with 350/350 ms to first row, 1611/1645 ms to end, start offset by 3.819/3.816 ms.
               Executor memory:  49153K bytes avg, 49153K bytes max (seg1:changlei.local).
               Work_mem used:  23438K bytes avg, 23438K bytes max (seg1:changlei.local). Workfile: (0 spilling, 0 reused)
               (seg0)   Hash chain length 1.7 avg, 3 max, using 434205 of 524341 buckets.
               ->  Append-only Scan on lineitem_b2  (cost=0.00..89923.15 rows=3000608 width=15)
                     Rows out:  Avg 3000607.5 rows x 2 workers.  Max/Last(seg0:changlei.local/seg1:changlei.local) 3001300/2999915 rows with 4.460/4.757 ms to first row, 546/581 ms to end, start offset by 350/349 ms.
               ->  Hash  (cost=51319.00..51319.00 rows=750000 width=8)
                     Rows in:  Avg 750000.0 rows x 2 workers.  Max/Last(seg1:changlei.local/seg0:changlei.local) 750000/750000 rows with 341/344 ms to end, start offset by 8.114/5.610 ms.
                     ->  Redistribute Motion 2:2  (slice1; segments: 2)  (cost=0.00..51319.00 rows=750000 width=8)
                           Hash Key: orders.o_orderkey
                           Rows out:  Avg 750000.0 rows x 2 workers at destination.  Max/Last(seg1:changlei.local/seg0:changlei.local) 750000/750000 rows with 0.052/2.461 ms to first row, 207/207 ms to end, start offset by 8.114/5.611 ms.
                           ->  Append-only Scan on orders  (cost=0.00..21319.00 rows=750000 width=8)
                                 Rows out:  Avg 750000.0 rows x 2 workers.  Max/Last(seg1:changlei.local/seg0:changlei.local) 750000/750000 rows with 4.773/4.987 ms to first row, 166/171 ms to end, start offset by 2.911/2.697 ms.
 Slice statistics:
   (slice0)    Executor memory: 281K bytes.
   (slice1)    Executor memory: 319K bytes avg x 2 workers, 319K bytes max (seg1:changlei.local).
   (slice2)    Executor memory: 105773K bytes avg x 2 workers, 105773K bytes max (seg1:changlei.local).
 Work_mem: 23438K bytes max.
 Statement statistics:
   Memory used: 262144K bytes
 Settings:  default_hash_table_bucket_number=2
 Dispatcher statistics:
   executors used(total/cached/new connection): (4/4/0); dispatcher time(total/connection/dispatch data): (0.171 ms/0.000 ms/0.048 ms).
   dispatch data time(max/min/avg): (0.030 ms/0.004 ms/0.011 ms); consume executor data time(max/min/avg): (0.014 ms/0.009 ms/0.011 ms); free executor time(max/min/avg): (0.000 ms/0.000 ms/0.000 ms).
 Data locality statistics:
   data locality ratio: 1.000; virtual segment number: 2; different host number: 1; virtual segment number per host(avg/min/max): (2/2/2); segment size(avg/min/max): (593580028.000 B/593495232 B/593664824 B); segment size with penalty(avg/min/max): (103526336.000 B/103517072 B/103535600 B); continuity(avg/min/max): (1.000/1.000/1.000); DFS metadatacache: 0.094 ms; resource allocation: 4.503 ms; datalocality calculation: 1.493 ms.
 Total runtime: 2660.146 ms
(34 rows)
```
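
To make the discrepancy easy to check on other plans, here is a minimal Python sketch that pulls the two metrics out of the "Data locality statistics" line and compares their averages. The helper names (`parse_triple`, `penalty_ratio`) are hypothetical and not part of HAWQ, and the regular expression only assumes the text layout shown in the plan above. The check that the size with penalty should not be smaller than the plain size is an assumption based on this report, not on HAWQ's source.

```python
import re

# Excerpt of the "Data locality statistics" line from the plan above.
STATS = (
    "segment size(avg/min/max): (593580028.000 B/593495232 B/593664824 B); "
    "segment size with penalty(avg/min/max): "
    "(103526336.000 B/103517072 B/103535600 B);"
)


def parse_triple(label, text):
    """Return (avg, min, max) in bytes for the metric named `label`."""
    pattern = (
        re.escape(label)
        + r"\(avg/min/max\): \(([\d.]+) B/([\d.]+) B/([\d.]+) B\)"
    )
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"metric not found: {label}")
    return tuple(float(group) for group in match.groups())


def penalty_ratio(text):
    """Ratio of average 'segment size with penalty' to average 'segment size'."""
    plain_avg = parse_triple("segment size", text)[0]
    penalized_avg = parse_triple("segment size with penalty", text)[0]
    return penalized_avg / plain_avg


if __name__ == "__main__":
    ratio = penalty_ratio(STATS)
    print(f"penalty/plain ratio: {ratio:.3f}")
    # Assumption from the report: a ratio below 1.0 is the suspicious case.
    if ratio < 1.0:
        print("segment size with penalty is smaller than segment size")
```

For the plan in this report the ratio comes out at roughly 0.17, which is the discrepancy described above.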



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
