impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3931: arbitrary fixed-size uda intermediate types
Date Tue, 15 Aug 2017 21:08:19 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3931: arbitrary fixed-size uda intermediate types
......................................................................


Patch Set 12:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

PS12, Line 555: the StringVal is already set up
I'm not sure I follow this comment given that this function is setting the *Value not the
*Val, right?  Is it saying that the slot was already modified since the StringVal was referencing
the destination slot itself and updates were done in place?


http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

PS12, Line 52: TYPE_FIXED_UDA_INTERMEDIATE
why isn't this a byte array if that's how it is represented? I guess we want the pointer indirection
to pass the thing around and so StringVal is convenient (even though runtime len shouldn't
be needed)?

I guess it's clearer once you read the comment in udf.h, but it's kinda hidden.


PS12, Line 193: matching
              :   /// native type
isn't the native type for intermediate just char array?  By "native" do we mean the slot representation?
Should we say that instead, or do we use "native" to mean the slot type (as opposed to the
C type or lowered type, etc)?


PS12, Line 195: raw_val_ptr
is this a pointer to the slot itself? if so, why not just specify it like that? (here and
elsewhere)

Isn't that actually required, given the assumption that StoreToNativePtr() makes about the
fixed intermediate case being a no-op?


http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

PS12, Line 240: the
delete


http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 888
it might be a bit confusing that now StringVal's can have different management rules depending
on the actual intermediate type. But maybe any alternative (e.g. a new FixedArrayVal that's
just a 'char *' (and len is determined statically from the type) adds complexity that's not
worth it?


http://gerrit.cloudera.org:8080/#/c/7526/12/be/src/udf/udf.h
File be/src/udf/udf.h:

PS12, Line 81: TYPE_FIXED_BUFFER
should we rename that to TYPE_CHAR, or would that break compat?

Alternatively, is there a reason TYPE_FIXED_UDA_INTERMEDIATE has to be distinct from this?


PS12, Line 346: For UDAs that need a complex data structure as the intermediate state, the
              : /// intermediate type should be string and the UDA can cast the ptr to the
structure
              : /// it is using.
              : ///
              : /// Memory Management: For allocations that are not returned to Impala, the
UDA should use
              : /// the FunctionContext::Allocate()/Free() methods. In general, Allocate()
is called in
              : /// Init(), and then Free() must be called in both Serialize() and Finalize(),
since
              : /// either of these functions may be called to clean up the state. For StringVal
              : /// allocations returned to Impala (e.g. returned by UdaSerialize()), the
UDA should
              : /// allocate the result via StringVal(FunctionContext*, int) ctor or the function
              : /// StringVal::CopyFrom(FunctionContext*, const uint8_t*, size_t) and Impala
will
              : /// automatically handle freeing it.
does any of that need updating to account for the fact that these new StringVal's point at
the slot itself?


http://gerrit.cloudera.org:8080/#/c/7526/12/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

Line 57:   private static final int RANK_INTERMEDIATE_SIZE = 16;
are these constants verified for consistency in any way?


-- 
To view, visit http://gerrit.cloudera.org:8080/7526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message