impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4674: Part 2: port backend exec to BufferPool
Date Mon, 10 Jul 2017 23:36:40 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4674: Part 2: port backend exec to BufferPool
......................................................................


Patch Set 23:

(27 comments)

First batch of comments. Will continue but figure we can pipeline somewhat.

http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

PS23, Line 194: !buffer_pool_client_.is_registered()
I guess that's needed for subplans?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/analytic-eval-node.h
File be/src/exec/analytic-eval-node.h:

PS23, Line 336: reservation
the stream will try to grow the reservation first, right?
Maybe reword to be more accurate:
When the tuple stream cannot acquire additional reservation, input_stream_ is unpinned...

or something.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/exec-node.cc
File be/src/exec/exec-node.cc:

Line 209:         &buffer_pool_client_, resource_profile_.min_reservation);
do we somewhere inside there dcheck that the exec-node has at least min_reservation reservation
still?


Line 226:       resource_profile_.spillable_buffer_size < buffer_pool->min_buffer_len())
{
when can that happen?


Line 233:   // to get some reservation.
I'm not sure what that comment is saying.


PS23, Line 240: if (resource_profile_.min_reservation > 0) {
maybe delete that to exercise this path unconditionally


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

PS23, Line 948: 
is that solved or was it not a problem?


PS23, Line 532: Either a pointer to the single tuple of this row 
where does the tuple live in that case?


PS23, Line 605: success
that's kind of confusing given the OK on status means success also. Maybe make that 'got_memory'
or 'got_reservation' (whichever the case is)?


PS23, Line 617:   
blank


PS23, Line 625: Status* status
do we need to pass that indirectly for codegen/perf still? i thought we made status faster
on the success path.


PS23, Line 688: success
same comment


PS23, Line 854: Status* status
same question about status perf. otherwise, seems better to pass HtData**


PS23, Line 862: success
same comment


PS23, Line 879: status
same comment vs DuplicateNode**


PS23, Line 895: tatus* status)
this is opposite all the "bool* success" interfaces. reason for that?


PS23, Line 923: stores_tuples_
this could probably use a comment (the comment above seems to apply to the set of variables
since this one is not dependent on join vs agg, but instead the number of tuples, right?).
Also a short motivation for this optimization would be helpful.  I suppose it's still worthwhile
even with the new BTS format.


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/exec/partitioned-aggregation-node-ir.cc
File be/src/exec/partitioned-aggregation-node-ir.cc:

Line 224:   if (hash_tbl == nullptr) return false; // Hash table was not created - pass through.
why does that happen now? or is it a bug fix?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/exec-env.cc
File be/src/runtime/exec-env.cc:

PS23, Line 88: page
should we express that as min_buffer_size instead?  maybe easy to understand by users that
way?


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.cc
File be/src/runtime/initial-reservations.cc:

Line 69:   DCHECK(success);
Explain why success is expected (planner computation).


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/initial-reservations.h
File be/src/runtime/initial-reservations.h:

PS23, Line 40:  This creates trackers for initial reservations under those.
This seems similar to a SubReservation, no?


PS23, Line 47: Claim
claim it from where? How is this different than Claim() claiming? also what do we mean by
"peak" here?


PS23, Line 48: that be claimed
garbled


PS23, Line 48: but not
             :   /// returned
why not returned?


PS23, Line 50: Status
explain failure modes


http://gerrit.cloudera.org:8080/#/c/5801/23/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS23, Line 403: 
in order to not regress w.r.t. IMPALA-1575, don't we need to release the initial reservations
when the last fragment instance completes?  i think this shared_ptr was effectively doing
that in the BufferedBlockMgr world, right?


http://gerrit.cloudera.org:8080/#/c/5801/23/fe/src/main/java/org/apache/impala/util/MathUtil.java
File fe/src/main/java/org/apache/impala/util/MathUtil.java:

Line 45: }
look familar


-- 
To view, visit http://gerrit.cloudera.org:8080/5801
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7fc7fe1c04e9dfb1a0c749fb56a5e0f2bf9c6c3e
Gerrit-PatchSet: 23
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message