impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5844: use a MemPool for expr result allocations
Date Fri, 29 Sep 2017 00:13:59 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8025 )

Change subject: IMPALA-5844: use a MemPool for expr result allocations
......................................................................


Patch Set 9:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@65
PS9, Line 65: Evaluator-managed 
> let's use same terminology as in scalar-expr-evaluator.h (permanent)
Done


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@93
PS9, Line 93: all input values
> what do we mean by that?
I think it means input evaluators.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@96
PS9, Line 96:   /// So, it's not safe to use cloned evaluators concurrently.
> does expr_*_pool need to be unique, or can they be shared with another eval
They can be shared. Augmented the Create comment.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@117
PS9, Line 117: expr-managed
> permanent?
I meant to replace all the instances of this phrase but didn't do as good a job as I thought.
I think it's sufficient to say that the allocation comes from FunctionContext::Allocate().


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@142
PS9, Line 142: 'results_pool'
> that's not actually a variable, is it? maybe say "results mem pool"?
Done


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/agg-fn-evaluator.h@175
PS9, Line 175:   /// This should generally be used via ScopedResultsPool instead of directly.
> I'm not sure I follow why we need this as opposed to always just requiring 
I don't think we can change the UDAF interfaces right now to thread the argument all the way
through. I'm also not quite sure what it would look like - forcing users to thread MemPools
through their code seems a bit error-prone.

I did consider having the interfaces take a MemPool as an argument and temporarily swapping
in the pool for the single function call, but that seemed potentially costly and expensive
(would have to change the codegen path too).

To answer the final question, there are cases in the aggregation node (e.g. CleanUpHashPartitions())
where the results of Serialize()/Finalize() are discarded so it makes sense to just send them
to expr_results_pool_.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/scalar-expr-evaluator.h
File be/src/exprs/scalar-expr-evaluator.h:

http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/scalar-expr-evaluator.h@217
PS9, Line 217: fn_ctxs_'
> missing opening '
I wonder if it was intended to be a possessive apostrophe. Reworded it anyway.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/scalar-expr-evaluator.h@219
PS9, Line 219: Allocations made from this pool must live until after the evaluator is closed.
> this is great information (and similar in expr_results_pool_), but since ag
Added to the Create() methods.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/scalar-expr-evaluator.h@225
PS9, Line 225: except when the evaluator is in the
             :   /// middle of evaluating the expression.
> that's kind of already implied since these things being used by a single th
Yeah I think that's implied since it's single threaded and we've stated that the expr evaluator
wont free it itself.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/exprs/scalar-expr-evaluator.h@227
PS9, Line 227:   MemPool* const expr_results_pool_;
> what's the distinction between these pools and the pair that live in the Fu
There's the same pools, but only AggFnEvaluator has a 1:1 relationship with a FunctionContext.
ScalarExprEvaluator does not necessarily have any function contexts, e.g. if the tree is just
a literal constant. 

I did just notice that we don't need expr_results_pool_, it's only passed through to the FunctionContext,
so I removed it.


http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/udf/udf-internal.h
File be/src/udf/udf-internal.h:

http://gerrit.cloudera.org:8080/#/c/8025/9/be/src/udf/udf-internal.h@101
PS9, Line 101: AllocateResults
> AllocateForResults
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4ba5a7542ed90a49a4b5586c040b5985a7d45b61
Gerrit-Change-Number: 8025
Gerrit-PatchSet: 9
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Fri, 29 Sep 2017 00:13:59 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message