Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 5B2FF20049B for ; Mon, 14 Aug 2017 20:13:36 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 59CA8165A46; Mon, 14 Aug 2017 18:13:36 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 9DC2B165A43 for ; Mon, 14 Aug 2017 20:13:35 +0200 (CEST) Received: (qmail 64031 invoked by uid 500); 14 Aug 2017 18:13:34 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 64020 invoked by uid 99); 14 Aug 2017 18:13:33 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 14 Aug 2017 18:13:33 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 5066EC00E2 for ; Mon, 14 Aug 2017 18:13:33 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id dY2GSNWbWGzr for ; Mon, 14 Aug 2017 18:13:30 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTPS id AA11A5F5F7 for ; Mon, 14 Aug 2017 18:13:30 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v7EIDTfg021031; Mon, 14 Aug 2017 18:13:29 GMT Message-Id: <201708141813.v7EIDTfg021031@ip-10-146-233-104.ec2.internal> Date: Mon, 14 Aug 2017 18:13:29 +0000 From: "Tim Armstrong (Code Review)" To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Michael Ho , Matthew Jacobs Reply-To: tarmstrong@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-3931=3A_arbitrary_fixed-size_uda_intermediate_types=0A?= X-Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c X-Gerrit-ChangeURL: X-Gerrit-Commit: 1ca8fb0ed903f1f38f0849dbd3c5dfc6f1073b01 In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Mon, 14 Aug 2017 18:13:36 -0000 Tim Armstrong has posted comments on this change. Change subject: IMPALA-3931: arbitrary fixed-size uda intermediate types ...................................................................... Patch Set 8: (3 comments) http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/codegen/llvm-codegen.cc File be/src/codegen/llvm-codegen.cc: PS8, Line 467: // Represent this as an array of bytes. : return ArrayType::get(GetType(TYPE_TINYINT), type.len); > IMHO, this change in behavior makes GetType() an error prone interface. Giv I think the relationship is analogous to the existing cases: int64 <==> BIGINT StringValue <==> STRING TimestampValue <==> TIMESTAMP int128 <==> DECIMAL(38,7) int8[n] <==> FIXED_UDA_INTERMEDIATE[n] It's already a bug if you call this with TYPE_STRING when constructing an IR function argument since it returns StringValue (we should rename string_val_type and timestamp_val_type for sure though). http://gerrit.cloudera.org:8080/#/c/7526/11/be/src/exec/partitioned-aggregation-node.cc File be/src/exec/partitioned-aggregation-node.cc: PS11, Line 1540: bitcast { i64, i8* }* %dst_lowered_ptr to %"struct.impala_udf: > This seems unnecessary now but I guess DCE will eliminate it. Yeah, DCE will get it. I think it's simpler than modifying the codegen interfaces to avoid emitting the instruction in this special case. http://gerrit.cloudera.org:8080/#/c/7526/8/be/src/exprs/agg-fn-evaluator.cc File be/src/exprs/agg-fn-evaluator.cc: PS8, Line 478: TYPE_FIXED_UDA_INTERMEDIATE: > Is TYPE_FIXED_UDA_INTERMEIDATE limited to built-in UDA only ? stddev already uses it internally and there doesn't seem to have been anything preventing declaration of a UDA using CHAR(n). Codegen is disabled though. On master I can declare a UDA with a CHAR intermediate if I load the UDAs from this patch: $ MAKE_CMD=ninja ./testdata/bin/copy-udfs-udas.sh -build $ git checkout origin/master $ git rev-parse HEAD df9ecdc45a34b176f2ee6eda8e12d0195fba9ae7 [localhost:21000] > create aggregate function udasum(int) RETURNS int INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate' INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge' FINALIZE_FN='AggCharIntermediateFinalize'; Query: create aggregate function udasum(int) RETURNS int INTERMEDIATE CHAR(10) LOCATION '/test-warehouse/libTestUdas.so' UPDATE_FN='AggCharIntermediateUpdate' INIT_FN='AggCharIntermediateInit' MERGE_FN='AggCharIntermediateMerge' FINALIZE_FN='AggCharIntermediateFinalize' Fetched 0 row(s) in 0.17s [localhost:21000] > select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg group by year, month, day; Query: select year, month, day, udasum(int_col), sum(int_col) from functional.alltypesagg group by year, month, day Query submitted at: 2017-08-14 10:54:10 (Coordinator: http://tarmstrong-box:25000) Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=3546da729d693928:a81fcd1b00000000 Socket error 104: Connection reset by peer [Not connected] > Goodbye tarmstrong So I guess it's allowed but broken. Arguably it might be better to disallow it instead of fixing it. -- To view, visit http://gerrit.cloudera.org:8080/7526 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ife90cf27989f98ffb5ef5c39f1e09ce92e8cb87c Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes