impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4120: Incorrect results with LEAD() analytic function
Date Tue, 18 Oct 2016 00:29:57 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4120: Incorrect results with LEAD() analytic function
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4740/1//COMMIT_MSG
Commit Message:

PS1, Line 18: AggFnEvaluatorthese
?


http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exec/analytic-eval-node.cc
File be/src/exec/analytic-eval-node.cc:

PS1, Line 389: AggFnEvaluator::GetValue(evaluators_, fn_ctxs_, curr_tuple_, result_tuple,
             :       cur_tuple_pool);
wrt my comment in AggFnEvaluator, what if we didn't change GetValue() and instead we called
a helper fn here that moved any backing memory from result_tuple into cur_tuple_pool, e.g.

// Copies memory backing Strings (in analytic result slots) from result_tuple into memory
allocated from cur_tuple_pool.
RelocateStringMemory(result_tuple, cur_tuple_pool);

(or something like that)


http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS1, Line 499: v = StringVal::null();
this means the allocation failed, right? should we be propagating errors? seems like that'd
lead to incorrect results under low-memory conditions.


http://gerrit.cloudera.org:8080/#/c/4740/1/be/src/exprs/agg-fn-evaluator.h
File be/src/exprs/agg-fn-evaluator.h:

PS1, Line 149:   /// Puts the finalized value from Tuple* src in Tuple* dst just as Finalize()
does.
             :   /// 'pool' is the MemPool used for allocating memory for the finalized value
(e.g.
             :   /// StringVal). However, unlike Finalize(), GetValue() does not clean up
state in src.
             :   /// GetValue() can be called repeatedly with the same src. Only used internally
for
             :   /// analytic fn builtins.
             :   void GetValue(FunctionContext* agg_fn_ctx, Tuple* src, Tuple* dst, MemPool*
pool);
I see how this and the below changes to the GetValue() path (all the way through SerializeOrFinalize)
fix the issue, but I'm worried it's going to be more confusing when the interface isn't symmetrical,
e.g. other Serialize and Finalize can't allocate mem out of pool even though it wouldn't be
crazy for that to be allowed.

What if the caller handled copying GetValue memory instead of having GetValue() handle the
string copy? We could have a helper fn somewhere (maybe just in AggFnEvaluator, or maybe there's
a better place) which moved backing memory into a specified pool. Does that make sense?


-- 
To view, visit http://gerrit.cloudera.org:8080/4740
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85bb1745232d8dd383a6047c86019c6378ab571f
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message