impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4397,IMPALA-3259: reduce codegen time and memory
Date Thu, 17 Nov 2016 03:02:36 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-4397,IMPALA-3259: reduce codegen time and memory
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4956/4/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

PS4, Line 190: CODEGEN_WRITE_TUPLES_MAX_COLUMNS = 250;
> I'd just be picking an arbitrary threshold in that case too, since I don't 
I agree that it's more actionable if it's based on number of tuples.

However, isn't the longer compilation + optimization time with too many tuples a direct result
of having large functions ? If so, instruction count seems to be a more robust estimate. For
example, the generated functions can be different if we change the codegen function. In that
case this threshold may not hold anymore.

This threshold really shouldn't be arbitrary. The ideal case is that the planner can estimate
a threshold of instruction count in which the cost of codegen is less than interpretation.

Also, it seems that the planner should have all the information needed to make the call. Is
there some unexpected difficulty for doing it in the planner ?


http://gerrit.cloudera.org:8080/#/c/4956/4/fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
File fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java:

PS4, Line 823: if (!hasGrouping) {
> I disable it for this particular exchange. This is a proactive thing only r
This isn't exactly what I had in mind. If you look at ExchangeNode::Codegen(), we only codegen
when is_mergeing_ is true. This is pretty much the same situation as the Agg node here. So,
it would be nice to follow the same pattern for exchange node and convert if (is_merging_)
to DCHECK(is_merging_);


-- 
To view, visit http://gerrit.cloudera.org:8080/4956
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id10015b49da182cb181a653ac8464b4a18b71091
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message