impala-dev mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner
Date Thu, 28 Jul 2016 21:49:51 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner
......................................................................


Patch Set 3:

(3 comments)

If you have results showing no improvement on TPC-H, that seems good to me.  Scans aren't
the bottleneck for a lot of those queries because they are multithreaded, unlike other operators.
Scans will become a real bottleneck once we multithread other operators (or even if we have
concurrent queries).

I suspect you'll see an improvement if you look at MaterializeTupleTime in the profile, or
if you set num_scanner_threads=1.

http://gerrit.cloudera.org:8080/#/c/3774/3/be/src/exec/hdfs-parquet-scanner-ir.cc
File be/src/exec/hdfs-parquet-scanner-ir.cc:

Line 1: // Copyright 2016 Cloudera Inc.
We'll have to update the license header to the Apache one.


http://gerrit.cloudera.org:8080/#/c/3774/3/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

Line 446:   int TransferScratchTuples(int tuple_size, bool has_filters);
Maybe document that these are passed as arguments so that codegen can replace them with constants.


http://gerrit.cloudera.org:8080/#/c/3774/3/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

Line 689:     if (!s.ok()) {
It would be good to rework this so that:
1. We always show "enabled" or "disabled" by calling AddCodegenExecOption for all file types.
2. We include the file type in the exec option (you can do this by passing a string as the
third argument to AddCodegenExecOption()). We may codegen multiple file types in a scan, so
it's important to know which one failed.

It's kind of annoying since we don't have a status on all code paths above, but I think it
will pay off.
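
The two points above could be sketched roughly as follows. This is a self-contained illustration only: `CodegenResult`, `FormatCodegenOption`, and `BuildExecOptions` are hypothetical names, and the output strings are an assumed format, not the real signature or output of AddCodegenExecOption(). The idea is that every file type gets an entry, enabled or disabled, and each entry names its file type.

```cpp
#include <string>
#include <vector>

// Hypothetical record of one codegen attempt (not an Impala type).
struct CodegenResult {
  std::string file_type;  // e.g. "PARQUET", "TEXT"
  bool ok;                // whether codegen succeeded for this file type
  std::string error;      // failure reason; empty on success
};

// Formats one exec option that always states enabled/disabled and always
// carries the file type, so a failure can be attributed to a format.
std::string FormatCodegenOption(const CodegenResult& r) {
  std::string option = r.file_type + " Codegen ";
  option += r.ok ? "Enabled" : "Disabled";
  if (!r.ok && !r.error.empty()) option += ": " + r.error;
  return option;
}

// Emits an option for every file type in the scan, not only the failures.
std::vector<std::string> BuildExecOptions(
    const std::vector<CodegenResult>& results) {
  std::vector<std::string> options;
  options.reserve(results.size());
  for (const CodegenResult& r : results) {
    options.push_back(FormatCodegenOption(r));
  }
  return options;
}
```

With one successful Parquet entry and one failed text entry, this yields two options, one per file type, so the profile shows which format fell back to the interpreted path.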


-- 
To view, visit http://gerrit.cloudera.org:8080/3774
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
