impala-reviews mailing list archives

From "Michael Ho (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext
Date Wed, 03 May 2017 20:52:25 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-4192: Disentangle Expr and ExprContext

Patch Set 8:

File be/src/exec/aggregation-node.h:

Line 84:   std::vector<ScalarExpr*> grouping_exprs_;
> where are the evaluators for these?
File be/src/exec/analytic-eval-node.h:

Line 181:   /// ProcessChildBatch().
> rename variables ending in _ctx if they're referring to now-obsolete class 
File be/src/exec/exec-node.h:

Line 49: /// flag gets set.
> leave todo to outline very briefly what's going to happen with PlanNode.

Line 285:   boost::scoped_ptr<MemTracker> expr_mem_tracker_;
> is there a need to have one per exec node? how about one per fragment insta
There will be one per PlanNode in the future. Currently, it still counts towards the ExecNode's
MemTracker. TODO added.
File be/src/exprs/agg-fn-evaluator.h:

Line 54: /// AggFnEvaluator contains runtime state and implements wrapper functions which
> "runtime state"

Line 55: /// the input TupleRow into AnyVal format expected by UDAF functions defined in AggFn.
> stick to singular or plural

Line 70:       const AggFn& agg_fn, AggFnEvaluator** evaluator) WARN_UNUSED_RESULT;
> why not make agg_fn a const&?

Line 80:   /// and caches all constant input arguments.
> what does agg_fn_ctx mean?
The 'agg_fn_ctx' part of the comment is removed.

Line 88:   /// Avoid the overhead of re-initializing an evaluator (e.g. calling GetConstVal()
> how big exactly is that overhead?
Depends on the number and size of the input expressions.

Line 94:   void Clone(
> do we really need this?
It's used in PAGG::Partition logic for resource isolation.

Line 117:   /// called either to drive the UDA's Update() or Merge() function, depending on
> function, depending

Line 136:   /// merge phases.
> comment on the types of src and dst. also, those aren't great names, they s

Line 159: 
> function comment missing

Line 160:   /// Free local allocations made in UDA functions and input arguments' evaluators.
> why not a const&?

Line 205:   inline const ColumnType& intermediate_type() const;
> long lines
File be/src/exprs/agg-fn.h:

Line 21: 
> ..

Line 28: class MemPool;
> the comment about "static states"/compile-time information applies to all e

Line 97:   /// of the output value. Returns error status on failure.
> can intermediate and output slodesc be null?
No. Changed to const ref instead.

Line 101:       WARN_UNUSED_RESULT;
> who owns agg_fn?
It lives in state->obj_pool(). Comments added.
File be/src/exprs/case-expr.h:

Line 38:  protected:
> this is needed?
Yes, so that we can call the CaseExpr() constructor.

Line 41: 
> it would be nice to keep the fnctx index out of every class. is it possible
File be/src/exprs/expr.h:

Line 19: #ifndef IMPALA_EXPRS_EXPR_H
> move directly above class, not sure why the class comment was up here to st

Line 34: using namespace impala_udf;
> explain function context index concept here
The latest patch avoids exposing function context index outside scalar-expr-* so the comments
will be kept in scalar-expr-* instead.

Line 34: using namespace impala_udf;
> also, it would be good to have an explanation of the general class hierarch

Line 81:   /// status on failure.
> why does this not take an Expr**?
'fn_ctx_idx_ptr' removed. It doesn't take Expr** as 'root' is expected to be created by the
caller (constructors of ScalarExpr or AggFn).

Line 113:   /// some ScalarExpr such as ScalarFnCall. NULL if it's not used.
> initialize all pointers to nullptr here, that way you can't forget it in th

Line 134:   /// Creates an expr tree with root 'parent' via depth-first traversal.
> doesn't look like a useful function (if you see it being called you probabl

Line 137:   ///   pool: Object pool in which Expr created from nodes are stored
> "to create expr trees for subexpressions"
File be/src/exprs/scalar-expr-evaluator.h:

Line 40: /// reference to the root of a ScalarExpr tree, runtime state (e.g. FunctionContexts)
> runtime state

Line 62:   /// FunctionContexts needed during evaluation. Allocations from this evaluator
> runtimestate already has an objectpool
There are places in which the evaluator will have shorter life span than the entire fragment
(e.g. expr evaluators in scanner threads or partitions in PAGG). So, it's easier to separate

Line 63:   /// be from 'mem_pool'. The newly created evaluator will be stored in 'pool' and
> why not a const&?

Line 72:       std::vector<ScalarExprEvaluator*>* evaluators) WARN_UNUSED_RESULT;
> the location?
Yes, ideally, that should be done in Expr::Init(). This is being tracked by IMPALA-4743. Will
add a TODO here.

Line 73: 
> this clone business is tricky, would be nice to remove it.
I believe it'd be easier to do it once IMPALA-4743 is fixed. Clone() is a performance optimization
for exec nodes with large expressions. Plan to tackle it in a separate change to avoid potential

Line 111:   /// should only be called after Open() has been called on this expr. Returns an error if
> can't this be subsumed by GetValue(nullptr)?
There is a subtlety here. "expr" is potentially a sub-expression of root_ (e.g. input expressions
to a ScalarFnCall). That's why it's not the same as GetValue(nullptr) as the latter will evaluate
the root expr.
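
To make the distinction concrete, here is a minimal sketch, not the code in the patch. All class names and signatures below (Row, Expr, Evaluator, Literal, Add) are simplified, hypothetical stand-ins. It only illustrates that the one-argument GetValue() always evaluates the root, while the overload taking an expr can evaluate any node of the tree, such as an input argument of a function call.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; not Impala's actual classes.
struct Row {};

class Evaluator;

class Expr {
 public:
  virtual ~Expr() = default;
  virtual int GetIntVal(Evaluator* eval, const Row* row) const = 0;
};

class Evaluator {
 public:
  explicit Evaluator(const Expr& root) : root_(root) {}

  // Always evaluates the root of the expression tree.
  int GetValue(const Row* row) { return root_.GetIntVal(this, row); }

  // Evaluates 'expr', which may be any node in the tree rooted at root_.
  int GetValue(const Expr& expr, const Row* row) {
    return expr.GetIntVal(this, row);
  }

 private:
  const Expr& root_;
};

class Literal : public Expr {
 public:
  explicit Literal(int v) : v_(v) {}
  int GetIntVal(Evaluator*, const Row*) const override { return v_; }

 private:
  int v_;
};

// Plays the role of a function call whose inputs are sub-expressions.
class Add : public Expr {
 public:
  Add(const Expr& lhs, const Expr& rhs) : lhs_(lhs), rhs_(rhs) {}
  int GetIntVal(Evaluator* eval, const Row* row) const override {
    // The parent asks the evaluator for the value of each child, not the root.
    return eval->GetValue(lhs_, row) + eval->GetValue(rhs_, row);
  }

 private:
  const Expr& lhs_;
  const Expr& rhs_;
};
```

With an Add over two literals as the root, GetValue(nullptr) yields the sum, whereas GetValue(child, nullptr) yields only that child's value.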

Line 112:   /// there was an error evaluating the expression or if memory could not be allocated
> why pass in expr? how is that different from the expr of Create()?
Please see reply to previous comment. expr is a sub-expression.

Line 179:   friend class ScalarFnCall;
> shouldn't they go into an objectpool?
They do. Comments updated.

Line 208: 
> comment why this is here and not in expr

Line 210:   /// -1 if no scale has been specified (currently the scale is only set for doubles
> reconsider whether this comment is useful
File be/src/exprs/scalar-expr.h:

Line 28: #include "common/status.h"
> stick to singular or plural in this and the following sentence.

Line 40:   class Type;
> ..evaluator,

Line 44: namespace impala {
> move comments about evaluator to evaluator class comment.

Line 50: class RowDescriptor;
> ..builder, which inlines ...functions, or...

Line 107: 
> why do we need this and Expr::CreateTree?
Expr::CreateTree() is the shared code between AggFn and ScalarExpr to unpack TExpr. This function
also calls the ScalarExpr's Init() function in addition to creating the Expr tree.
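
A rough sketch of that split follows, under heavy simplification. TNode, the flattened pre-order layout, and every signature here are assumptions made for illustration and are not the patch's actual API. The shared helper only unpacks nodes under a root that the caller has already constructed; the ScalarExpr factory creates the root, calls the helper, and then runs Init().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical, simplified stand-in for one serialized expression node.
struct TNode {
  int num_children;
};

class Expr {
 public:
  virtual ~Expr() = default;

  std::size_t num_children() const { return children_.size(); }
  const Expr& child(std::size_t i) const { return *children_[i]; }

  // Shared helper: unpacks the pre-order 'nodes' list under 'root', which the
  // caller has already created. Returns false if 'nodes' is malformed.
  static bool CreateTree(const std::vector<TNode>& nodes, Expr* root) {
    if (nodes.empty()) return false;
    std::size_t idx = 0;
    if (!AddChildren(nodes, root, &idx)) return false;
    return idx == nodes.size() - 1;  // every node must have been consumed
  }

 private:
  // Depth-first traversal; on entry '*idx' is the index of 'parent'.
  static bool AddChildren(const std::vector<TNode>& nodes, Expr* parent,
                          std::size_t* idx) {
    const int n = nodes[*idx].num_children;
    for (int i = 0; i < n; ++i) {
      ++*idx;
      if (*idx >= nodes.size()) return false;
      parent->children_.push_back(std::unique_ptr<Expr>(new Expr()));
      if (!AddChildren(nodes, parent->children_.back().get(), idx)) return false;
    }
    return true;
  }

  std::vector<std::unique_ptr<Expr>> children_;
};

class ScalarExpr : public Expr {
 public:
  // Creates the root, unpacks the tree via the shared helper, then runs Init().
  static std::unique_ptr<ScalarExpr> Create(const std::vector<TNode>& nodes) {
    std::unique_ptr<ScalarExpr> root(new ScalarExpr());
    if (!Expr::CreateTree(nodes, root.get())) return nullptr;
    root->Init();
    return root;
  }

  bool initialized() const { return initialized_; }

 private:
  void Init() { initialized_ = true; }
  bool initialized_ = false;
};
```

In this sketch the children are plain Expr objects for brevity; a real implementation would build nodes of the appropriate subclass.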

Line 113: 
> could you also add warn_unused_result to all other .h files that you're tou
Will try my best to cover the functions (not entire .h files) I touched. This change is already
too large to also incorporate all these cleanups.

Line 223:   virtual DecimalVal GetDecimalVal(ScalarExprEvaluator*, const TupleRow*) const;
> what is the non-static state in the expr tree?

Line 280:   /// 'fn_ctx_idx_' is the index into the FunctionContext vector in ScalarExprEvaluator
> would be good to have that in the Expr class comment, maybe with some more 
The new patch no longer exposes 'fn_ctx_idx' to expr.*. Will keep all these comments in 'scalar-expr*'
File be/src/exprs/

PS7, Line 197: ) evaluator->output_scale_ = scale_arg->va
There is a subtle bug here. This ScalarFnCall isn't necessarily evaluator->root_. It's incorrect
to set the output_scale_ in that case.
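
The hazard can be illustrated with a small sketch. FnCall, Evaluator, and both Open variants below are hypothetical stand-ins, and the guard is only one possible way to avoid the problem, not necessarily what the next patch set does.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only.
struct FnCall;

struct Evaluator {
  explicit Evaluator(const FnCall* r) : root(r) {}
  const FnCall* root;
  int output_scale = -1;  // -1 means no scale has been specified
};

struct FnCall {
  int scale_arg;

  // Problematic variant: overwrites the evaluator's scale even when this
  // node is only a sub-expression of the evaluator's root.
  void OpenUnguarded(Evaluator* evaluator) const {
    evaluator->output_scale = scale_arg;
  }

  // Guarded variant: only the root expression may set the output scale.
  void OpenGuarded(Evaluator* evaluator) const {
    if (evaluator->root == this) evaluator->output_scale = scale_arg;
  }
};
```

With the guard, opening a child function call leaves the evaluator's scale untouched, while the unguarded variant lets a child clobber the value that belongs to the root.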
File be/src/runtime/

Line 379:     const vector<TExpr>& thrift_output_exprs) {
> unused parameter
It's used in other subclasses of DataSink.
File be/src/runtime/data-stream-sender.h:

Line 133:   /// per-row partition values for shuffling exchange;
> also leave todo that the exprs go into a separate class.
File be/src/runtime/

Line 198:   scoped_ptr<MemPool> mem_pool_;
> more mem pools?
Used by CreateTupleComparator() below.

Line 207:   vector<ScalarExpr*> ordering_exprs_;
> ?
Comments added.
File be/src/runtime/descriptors.h:

Line 288:   const std::vector<TExpr>& thrift_partition_key_exprs_;
> does this todo still apply, given that the descriptor tbl itself moves into
The person making the change to move the descriptor tbl into query state should remove it.

Line 291:   /// are Literals.
> why is there an evaluator in here, instead of the expr? this is shared by a
The evaluator is also shared by all fragment instance executors as it's just a literal.

Please see comments at HdfsPartitionDescriptor::partition_key_value_evaluators()
File be/src/runtime/sorter.h:

Line 107:   /// Initializes the evaluators for materializing the tuples.
> ?

Line 120:   /// Free any local allocations made when materializing and sorting the tuples.
> ?
File be/src/service/

Line 76: // Serializes expression value 'value' to thrift structure TColumnValue 'col_val'.
> function comment missing


Gerrit-MessageType: comment
Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Michael Ho <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes
