impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5844: use a MemPool for expr result allocations
Date Wed, 04 Oct 2017 00:41:04 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8025 )

Change subject: IMPALA-5844: use a MemPool for expr result allocations
......................................................................


Patch Set 11:

(17 comments)

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/exec-node.h@379
PS11, Line 379:   Status QueryMaintenance(RuntimeState* state) WARN_UNUSED_RESULT;
> I still think we should get rid of this but okay to leave in this change.
Yeah I think this is another step towards IMPALA-2399 since there's less encapsulated in this
function.

I think it's a separate logical change that probably needs some thought.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h@396
PS11, Line 396: Cloes
> Close
Done


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/hash-table.h@525
PS11, Line 525:   /// results of expr evaluation.
> Not owned? Which also means the exec node will do the maintenance on them, 
Done


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h
File be/src/exec/partitioned-aggregation-node.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@214
PS11, Line 214: if this is a non-grouping aggregation used
              :   /// for the non-grouping cases
> did that get garbled?
Done


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@219
PS11, Line 219:   /// are allocated from 'expr_perm_pool_' and 'expr_results_pool_' respectively.
> and what about for grouping case?
Reworked this whole comment to be clearer.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@419
PS11, Line 419: expr_managed
> should this be talking about permanent?
Done


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@420
PS11, Line 420: per partition
> I don't get that comment, given this is a single pool.
It belongs to the partition though. This comment really needs some fixing though - rewrote
it.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@421
PS11, Line 421: Local allocations
> results
Done


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.h@423
PS11, Line 423: agg_fn_pool
Renamed this since it's only permanent allocatoins.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/partitioned-aggregation-node.cc@600
PS11, Line 600: if (++i % 1024 == 0)
> is that even worth it?
Probably not


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/exec/scanner-context.h@337
PS11, Line 337:   /// evaluation to prevent memory from accumulating.
> who owns this?
Mentioned the two cases, since the only reason this is this complicated is the need to support
the multi-threads scan node.

The filter_ctxs_ are created in HdfsScanNode::ScannerThread()  in the multi-threaded case
and they allocate memory from this pool. So this pool needs to exists at that point in time,
but the scanner implementation also needs access to it so that it can clear it.

I don't think this *needs* to be that complicated but it seems like it would require a non-trivial
restructure of the logic around filter_ctxs_ to get it simpler.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h
File be/src/runtime/sorter.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@97
PS11, Line 97: (returns true if lhs < rhs)
> not sure that still makes sense. what returns true?
That's true, it doesn't make sense now.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@202
PS11, Line 202: intermediate
> not just the intermediate results, but also the final results of expression
You're right, I was thinking it was only used for the comparator but it's of course also used
for &sort_tuple_expr_evals_


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/runtime/sorter.h@199
PS11, Line 199:   /// MemPool for allocating data structures used by expression evaluators
in the sorter.
              :   MemPool expr_perm_pool_;
              : 
              :   /// MemPool for allocations that hold intermediate results of expression
evaluation in
              :   /// the sorter. Cleared periodically during sorting to prevent memory accumulating.
              :   MemPool expr_results_pool_;
              : 
              :   /// In memory sorter and less-than comparator.
              :   TupleRowComparator compare_less_than_;
> in other places, we have indirection (i.e. scoped_ptr) for these fields. is
In TopNNode and ExchangeNode the comparator can't be instantiated until Prepare() because
its constructor arguments aren't available.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h@350
PS11, Line 350: For allocations that back the intermediate results of aggregation
              : /// returned by Init(), Update() and Merge() should be allocated by
> a bit garbled
Tried to improve it.


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/udf/udf.h@616
PS11, Line 616: temporary
> this is kind of new terminology. above i think you said "temporary results"
Reworded this to avoid introducing vague terminology. I think this comment was a bit unclear
on other points so clarified it and fixed the behaviour when new_len < len (which was actually
incorrect).


http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/util/tuple-row-compare.h
File be/src/util/tuple-row-compare.h:

http://gerrit.cloudera.org:8080/#/c/8025/11/be/src/util/tuple-row-compare.h@77
PS11, Line 77: Any
             :   /// the evaluators with use
> garbled
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ba5a7542ed90a49a4b5586c040b5985a7d45b61
Gerrit-Change-Number: 8025
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Wed, 04 Oct 2017 00:41:04 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message