impala-dev mailing list archives

From "Tim Armstrong (Code Review)"
Subject [Impala-CR](cdh5-trunk) IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner
Date Thu, 28 Jul 2016 21:49:51 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3629: Codegen TransferScratchTuples() in hdfs-parquet-scanner

Patch Set 3:


If you have results showing no improvement on TPC-H, that seems good to me.  Scans aren't
the bottleneck for a lot of those queries because they are multithreaded, unlike other operators.
Scans will become a real bottleneck once we multithread other operators (or even if we have
concurrent queries).

I suspect you'll see an improvement if you look at MaterializeTupleTime in the profile, or
if you set num_scanner_threads=1.
File be/src/exec/

Line 1: // Copyright 2016 Cloudera Inc.
We'll have to update the license header to the Apache one.
File be/src/exec/hdfs-parquet-scanner.h:

Line 446:   int TransferScratchTuples(int tuple_size, bool has_filters);
Maybe document that these are passed as arguments so that codegen can replace them.
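
A minimal sketch of what such a comment could look like. Only the TransferScratchTuples(int tuple_size, bool has_filters) signature comes from the patch; the struct, the function body, the toy filter, and the comment wording are hypothetical stand-ins, and the idea that codegen would substitute per-scan constants for the two arguments is an assumption about the intent.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a scanner's scratch batch.
struct ScratchBatchSketch {
  std::vector<uint8_t> scratch;  // packed source tuples
  std::vector<uint8_t> output;   // destination buffer
  int num_tuples;
};

/// Copies tuples from scratch memory to the output buffer and returns the
/// number of tuples transferred.
/// 'tuple_size' and 'has_filters' do not change during a scan. They are passed
/// as arguments, rather than read from members, so that a codegen'd version of
/// this function can replace them with constants (assumed intent).
int TransferScratchTuplesSketch(ScratchBatchSketch* batch, int tuple_size,
                                bool has_filters) {
  assert(tuple_size > 0);
  int transferred = 0;
  for (int i = 0; i < batch->num_tuples; ++i) {
    const uint8_t* src =
        batch->scratch.data() + static_cast<std::size_t>(i) * tuple_size;
    // Toy filter: when filters are enabled, drop tuples whose first byte is 0.
    if (has_filters && src[0] == 0) continue;
    batch->output.insert(batch->output.end(), src, src + tuple_size);
    ++transferred;
  }
  return transferred;
}
```

With both values known to be constant, a compiler (or codegen) could drop the has_filters branch entirely and unroll the fixed-size copy, which is presumably why they are arguments here.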
File be/src/exec/

Line 689:     if (!s.ok()) {
It would be good to rework this so that:
1. We always show "enabled" or "disabled" by calling AddCodegenExecOption for all file types.
2. We include the file type in the exec option (you can do this by passing a string as the
third argument to AddCodegenExecOption()). We may codegen multiple file types in a scan, so
it's important to know which one failed.

It's kind of annoying since we don't have a status on all code paths above, but I think it
will pay off.
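
The two points above could be sketched roughly as follows. This is not Impala's AddCodegenExecOption() or its Status class: StatusSketch, CodegenExecOptionSketch, BuildExecOptionsSketch, and the output string format are all hypothetical, chosen only to illustrate emitting one labelled entry per file type regardless of whether codegen succeeded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a status object.
struct StatusSketch {
  bool ok;
  std::string msg;
};

// Formats one exec option entry, prefixed with the file type so that a
// failure can be attributed to a specific format. The format is an assumption.
std::string CodegenExecOptionSketch(const std::string& file_type,
                                    const StatusSketch& status) {
  std::string option = file_type + " Codegen ";
  option += status.ok ? "Enabled" : "Disabled";
  if (!status.ok && !status.msg.empty()) option += ": " + status.msg;
  return option;
}

// Emits an entry for every file type in the scan, enabled or disabled,
// instead of only reporting the failures.
std::vector<std::string> BuildExecOptionsSketch(
    const std::vector<std::pair<std::string, StatusSketch> >& results) {
  std::vector<std::string> options;
  for (std::size_t i = 0; i < results.size(); ++i) {
    options.push_back(
        CodegenExecOptionSketch(results[i].first, results[i].second));
  }
  return options;
}
```

Code paths that currently have no status would need to produce one (even a trivially OK one) for this to cover every file type.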


Gerrit-MessageType: comment
Gerrit-Change-Id: Ic327e437c7cd2b3f92cdb11c1e907bfee2d44ee8
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall
Gerrit-Reviewer: Thomas Tauber-Marshall
Gerrit-Reviewer: Tim Armstrong
Gerrit-HasComments: Yes
