impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3931: arbitrary fixed-size uda intermediate types
Date Mon, 14 Aug 2017 18:13:29 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3931: arbitrary fixed-size uda intermediate types
......................................................................


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

PS8, Line 467: // Represent this as an array of bytes.
             :       return ArrayType::get(GetType(TYPE_TINYINT), type.len);
> IMHO, this change in behavior makes GetType() an error prone interface. Giv
I think the relationship is analogous to the existing cases:

  int64          <==> BIGINT
  StringValue    <==> STRING
  TimestampValue <==> TIMESTAMP
  int128         <==> DECIMAL(38,7)
  int8[n]        <==> FIXED_UDA_INTERMEDIATE[n]


It's already a bug if you call this with TYPE_STRING when constructing an IR function argument
since it returns StringValue (we should rename string_val_type and timestamp_val_type for
sure though).


http://gerrit.cloudera.org:8080/#/c/7526/11/be/src/exec/partitioned-aggregation-node.cc
File be/src/exec/partitioned-aggregation-node.cc:

PS11, Line 1540:     bitcast { i64, i8* }* %dst_lowered_ptr to %"struct.impala_udf:
> This seems unnecessary now but I guess DCE will eliminate it.
Yeah, DCE will get it. I think it's simpler than modifying the codegen interfaces to avoid
emitting the instruction in this special case.


http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/exprs/agg-fn-evaluator.cc
File be/src/exprs/agg-fn-evaluator.cc:

PS8, Line 478: TYPE_FIXED_UDA_INTERMEDIATE:
> Is TYPE_FIXED_UDA_INTERMEIDATE limited to built-in UDA only ?
stddev already uses it internally and there doesn't seem to have been anything preventing
declaration of a UDA using CHAR(n). Codegen is disabled though. On master I can declare a
UDA with a CHAR intermediate if I load the UDAs from this patch:

  $ MAKE_CMD=ninja ./testdata/bin/copy-udfs-udas.sh -build
  $ git checkout origin/master
  $ git rev-parse HEAD
  df9ecdc45a34b176f2ee6eda8e12d0195fba9ae7


  [localhost:21000] > create aggregate function udasum(int) RETURNS int
        INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate'
        INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge'
        FINALIZE_FN='AggCharIntermediateFinalize';
Query: create aggregate function udasum(int) RETURNS int
        INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate'
        INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge'
        FINALIZE_FN='AggCharIntermediateFinalize'
  Fetched 0 row(s) in 0.17s
  [localhost:21000] > select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg
group by year, month, day;
  Query: select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg
group by year, month, day
  Query submitted at: 2017-08-14 10:54:10 (Coordinator: http://tarmstrong-box:25000)
  Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=3546da729d693928:a81fcd1b00000000
  Socket error 104: Connection reset by peer
[Not connected] > Goodbye tarmstrong


So I guess it's allowed but broken.

Arguably it might be better to disallow it instead of fixing it.


-- 
To view, visit http://gerrit.cloudera.org:8080/7526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Michael Ho
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message