impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4192: Disentangle Expr and ExprContext
Date Tue, 23 May 2017 16:21:42 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4192: Disentangle Expr and ExprContext
......................................................................


Patch Set 14:

(32 comments)

Made it through, finally... Overall looks way saner and easier to understand than the previous
state of things.

http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

Line 694:   AggFnEvaluator::ShallowClone(parent->pool_, agg_fn_pool.get(),
This should go into 'partition_pool_'. Otherwise if we have an Agg in a subplan these will
accumulate every Reset(). E.g. there would be a bug like https://gerrit.cloudera.org/#/c/1141/


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

Line 84: AggFnEvaluator::~AggFnEvaluator() {
Can we add a DCHECK() that it was closed?


PS14, Line 87: inline
probably not needed


PS14, Line 91: inline
probably not needed


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

Line 250
Consider leaving these as inline functions in the header - AnalyticEvalNode is still interpreted
so the function call overhead may actually affect perf there.


Line 72:   /// Convenience functions for creating evaluators for multiple aggregate functions.
Mention that the caller is responsible for closing the created evaluators regardless of success
or failure. That is subtle. Or we could just have Create() handle that internally.

This comment may also apply to ScalarExprEvaluator.


Line 174:   /// True if this evaluator is created from a Clone() call.
ShallowClone()


PS14, Line 175: is_clone_
Maybe is_shallow_clone_ to be more precise - although there is only one type of cloning supported
so arguably not necessary to disambiguate.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn.cc
File be/src/exprs/agg-fn.cc:

Line 87:   RETURN_IF_ERROR(LibCache::instance()->GetSoFunctionPtr(fn_.hdfs_location,
Nit: I don't think the remainder of this function needs all the vertical whitespace - it's
essentially double-spaced code now


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/agg-fn.h
File be/src/exprs/agg-fn.h:

PS14, Line 52: state
value?. "state" could be taken to mean that it initialises the evaluator or some such thing.


Line 81:   /// TODO: The aggregation node has custom codegen paths for a few of the builtins.
Consider removing the TODO - it's a bit vague and unclear what if anything we want to do.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/expr.h
File be/src/exprs/expr.h:

Line 53: /// --- Expr overview
Nice overview


Line 81:   static Status CreateTree(const TExpr& texpr, ObjectPool* pool, Expr* root);
I think this should be protected since it's only called from ScalarExpr and AggFn.


Line 94:   bool is_constant() const { return is_constant_; }
I think is_constant() should maybe only be defined in the ScalarExpr sub-hierarchy. It doesn't
really mean anything for AggFn.


Line 100:   /// Simple debug string that provides no expr subclass-specific information.
Comment seems stale since there's no implementation.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/hive-udf-call.h
File be/src/exprs/hive-udf-call.h:

Line 28: using namespace impala_udf;
I guess some of these 'using namespace' instances were pre-existing - still seems like it
contradicts our coding standards.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/literal.h
File be/src/exprs/literal.h:

Line 28: using namespace impala_udf;
See other comment about using namespace in headers.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

Line 154:   // Owner of mem_pool_ needs to call FreeAll() on it.
We don't really know what the owner wants to do. Maybe something like "Memory allocated by
'fn_ctx_' is still living in 'mem_pool_'. The owner of 'mem_pool_' is responsible for freeing
it when appropriate."


Line 436:   // TODO: is there a better way to do this?
Consider removing TODO - I don't think we're likely to fix it.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr-evaluator.h
File be/src/exprs/scalar-expr-evaluator.h:

Line 29: using namespace impala_udf;
I'm not sure if this works, but maybe if we just move this "using" into the impala namespace
it won't pollute any other namespaces. That would still avoid having impala_udf:: everywhere.


PS14, Line 43: types
type


Line 122:   void* GetValue(const ScalarExpr& e, const TupleRow* row);
Maybe GetSubExprValue()? I don't feel strongly but might make the intent clearer.


Line 144:   Status GetError(int start_idx = 0, int end_idx = -1) const WARN_UNUSED_RESULT;
Arguments maybe need explanation.


Line 147:   void PrintValue(const TupleRow* row, std::string* str);
Does this evaluate the expr?


Line 153:   /// The last two are helper functions.
Not sure which "last two" refers to.


Line 157:   /// Frees all local allocations made by fn_ctxs_. This can be called when result
Thinking out loud about future work: I think to fix memory transfer I'll need a way to transfer
local allocations out of the function contexts and attach them to a RowBatch. 

I think to do this we'd need to pass in two different MemPools - one to back the FreePool
and one to back the local allocations. Then instead of FreeLocalAllocations() we'd just free
or transfer things out of the MemPool.

We don't need to do this in this patch since it's a big behavioural change but it seems like
that will work out, right?


PS14, Line 158:  last two
Same here - maybe out of sync with the code?


Line 183:   /// Users of private GetValue() or 'pool_'.
It might make sense to move the methods that the "friend class" is mean to apply to under
"protected:", just so there's a clearer delineation between those methods and the truly private
methods.


PS14, Line 204: the evaluation
maybe something like "Stores the result of the last evaluation of the the root expr"


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-expr.cc
File be/src/exprs/scalar-expr.cc:

PS14, Line 398: subclassed
I think this should be "overridden"


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/scalar-fn-call.cc
File be/src/exprs/scalar-fn-call.cc:

Line 164:   bool is_interpreted = scalar_fn_wrapper_ == nullptr;
Thank you! This is much clearer.


http://gerrit.cloudera.org:8080/#/c/5483/14/be/src/exprs/slot-ref.h
File be/src/exprs/slot-ref.h:

Line 24: using namespace impala_udf;
I think generally we try to avoid using entire namespaces in the headers. Maybe you can scope
it to the SlotRef class?


-- 
To view, visit http://gerrit.cloudera.org:8080/5483
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iefdc9aeeba033355cb9497e3a5d2363627dcf2f3
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message