impala-reviews mailing list archives

From "Taras Bobrovytsky (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4883: Union Codegen
Date Fri, 31 Mar 2017 02:42:27 GMT
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4883: Union Codegen
......................................................................


Patch Set 5:

I reran the benchmark on patch 5 on a larger table where we select only 1 column:
    SELECT
      COUNT(c)
    FROM (
      SELECT fnv_hash(ss_sold_time_sk) c FROM tpcds_10_parquet.store_sales_unpartitioned_big
      UNION ALL
      SELECT fnv_hash(ss_sold_time_sk) c FROM tpcds_10_parquet.store_sales_unpartitioned_big
      UNION ALL
      SELECT fnv_hash(ss_sold_time_sk) c FROM tpcds_10_parquet.store_sales_unpartitioned_big
      UNION ALL
      SELECT fnv_hash(ss_sold_time_sk) c FROM tpcds_10_parquet.store_sales_unpartitioned_big
    ) t

Before: 17.6s
After: 9.98s

Not a huge difference. I think the bottleneck is scanning, not the union, which is why the
improvement is not bigger. Maybe the difference would be more significant on a large cluster?
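
For reference, the two timings above can be turned into a speedup factor and a
percentage reduction. This is only a small arithmetic sketch in Python; the variable
names are made up for illustration and the only inputs are the two numbers quoted in
this message. It says nothing about where the time is actually spent.

```python
# Wall-clock timings quoted in this message, in seconds.
before_s = 17.6   # without union codegen
after_s = 9.98    # with union codegen (patch set 5)

# Speedup factor: how many times faster the "after" run is.
speedup = before_s / after_s

# Relative reduction in runtime, as a percentage of the "before" time.
reduction_pct = (1 - after_s / before_s) * 100

print(f"speedup: {speedup:.2f}x")                   # speedup: 1.76x
print(f"runtime reduction: {reduction_pct:.1f}%")   # runtime reduction: 43.3%
```

This is a single pair of measurements, so the figures are indicative only.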

-- 
To view, visit http://gerrit.cloudera.org:8080/6459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib4107d27582ff5416172810364a6e76d3d93c439
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: No
