Date: Fri, 20 Mar 2015 20:43:39 +0000 (UTC)
From: "Jacques Nadeau (JIRA)"
To: issues@drill.apache.org
Reply-To: dev@drill.apache.org
Subject: [jira] [Updated] (DRILL-2504) Aggregate query with grouping results in Error

     [ https://issues.apache.org/jira/browse/DRILL-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-2504:
----------------------------------
    Component/s:     (was: Execution - Relational Operators)
                     Functions - Drill

> Aggregate query with grouping results in Error
> ----------------------------------------------
>
>                 Key: DRILL-2504
>                 URL: https://issues.apache.org/jira/browse/DRILL-2504
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 0.8.0
>         Environment: 4 node cluster
> {code}
> 0: jdbc:drill:> select * from sys.version;
> +------------+----------------+-------------+-------------+------------+
> | commit_id  | commit_message | commit_time | build_email | build_time |
> +------------+----------------+-------------+-------------+------------+
> | f658a3c513ddf7f2d1b0ad7aa1f3f65049a594fe | DRILL-2209 Insert ProjectOperator with MuxExchange | 09.03.2015 @ 01:49:18 EDT | Unknown | 09.03.2015 @ 04:52:49 EDT |
> +------------+----------------+-------------+-------------+------------+
> 1 row selected (0.062 seconds)
> {code}
>            Reporter: Khurram Faraaz
>            Assignee: Chris Westin
>
> The aggregate query below, which groups by a column and mixes distinct and non-distinct aggregates, fails with an exception. Note that `planner.enable_hashagg` was set to false, the data source is a CSV file, and the query was run on a four-node cluster.
> alter system set `planner.enable_hashagg`=true;
> alter session set `planner.enable_hashagg`=true;
> {code}
> 0: jdbc:drill:> alter system set `planner.enable_hashagg`=false;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | planner.enable_hashagg updated. |
> +------------+------------+
> 1 row selected (0.075 seconds)
> 0: jdbc:drill:> select columns[4], sum(columns[0]), count(distinct columns[1]), max(columns[2]), count(distinct columns[3]), max(columns[5]), min(columns[6]), avg(columns[7])
> . . . . . . . > from `conftest.csv`
> . . . . . . . > group by columns[4];
> Query failed: Query stopped., Failure while trying to materialize incoming schema. Errors:
>
> Error in expression at index -1. Error: Missing function implementation: [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.. [ 6cd09ba7-3e4b-4b3b-b111-39f74f53e1b0 on centos-01.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query. (state=,code=0)
> {code}
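> A quick way to confirm the effective setting while reproducing this (a sketch only, not from the original report; the sys.options column layout may vary by Drill version):
> {code}
> 0: jdbc:drill:> select * from sys.options where name = 'planner.enable_hashagg';
> {code}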
> {code}
> Stack trace from drillbit.log
> 2015-03-19 17:47:43,123 [2af4f441-8c04-99f9-1a12-a55a7c72ece7:frag:0:0] ERROR o.a.d.e.w.f.AbstractStatusReporter - Error bab1babd-48fe-4719-8a77-dc5826027ba7: Failure while running fragment.
> org.apache.drill.exec.exception.SchemaChangeException: Failure while trying to materialize incoming schema. Errors:
> Error in expression at index -1. Error: Missing function implementation: [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--..
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema(ProjectRecordBatch.java:390) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:78) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:97) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:57) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:121) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-03-19 17:47:43,124 [2af4f441-8c04-99f9-1a12-a55a7c72ece7:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman - State change requested. RUNNING --> FAILED
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., Failure while trying to materialize incoming schema. Errors:
> Error in expression at index -1. Error: Missing function implementation: [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.. [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
> [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:95) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:154) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:176) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:123) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> 2015-03-19 17:47:43,141 [2af4f441-8c04-99f9-1a12-a55a7c72ece7:frag:0:0] INFO  o.a.drill.exec.work.foreman.Foreman - State change requested. FAILED --> COMPLETED
> 2015-03-19 17:47:43,141 [2af4f441-8c04-99f9-1a12-a55a7c72ece7:frag:0:0] WARN  o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED state as query is already at FAILED state (which is terminal).
> 2015-03-19 17:47:43,142 [2af4f441-8c04-99f9-1a12-a55a7c72ece7:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman - Error 82bd01dc-d27d-47e4-836d-426f3625511f: RemoteRpcException: Failure while running fragment., Failure while trying to materialize incoming schema. Errors:
> Error in expression at index -1. Error: Missing function implementation: [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.. [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
> [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., Failure while trying to materialize incoming schema. Errors:
> Error in expression at index -1. Error: Missing function implementation: [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.. [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
> [ bab1babd-48fe-4719-8a77-dc5826027ba7 on centos-01.qa.lab:31010 ]
>         at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:95) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:154) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:176) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:123) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:303) [drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}
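> For reference, a plan like the one below can be regenerated with EXPLAIN against the same CSV (a sketch only, not from the original report; the exact operator ids and cost figures will differ from run to run):
> {code}
> 0: jdbc:drill:> explain plan for
> . . . . . . . > select columns[4], sum(columns[0]), count(distinct columns[1]), max(columns[2]), count(distinct columns[3]), max(columns[5]), min(columns[6]), avg(columns[7])
> . . . . . . . > from `conftest.csv`
> . . . . . . . > group by columns[4];
> {code}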
> Physical plan for the failing query
> {code}
> 00-00    Screen : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, BIGINT EXPR$2, ANY EXPR$3, BIGINT EXPR$4, ANY EXPR$5, ANY EXPR$6, ANY EXPR$7): rowcount = 29.5, cumulative cost = {832.95 rows, 13844.73454409817 cpu, 0.0 io, 0.0 network, 5852.799999999999 memory}, id = 2049
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2], EXPR$3=[$3], EXPR$4=[$4], EXPR$5=[$5], EXPR$6=[$6], EXPR$7=[$7]) : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, BIGINT EXPR$2, ANY EXPR$3, BIGINT EXPR$4, ANY EXPR$5, ANY EXPR$6, ANY EXPR$7): rowcount = 29.5, cumulative cost = {830.0 rows, 13841.78454409817 cpu, 0.0 io, 0.0 network, 5852.799999999999 memory}, id = 2048
> 00-02        Project(EXPR$0=[$0], EXPR$1=[CASE(=($2, 0), null, $1)], EXPR$2=[$11], EXPR$3=[$3], EXPR$4=[$9], EXPR$5=[$4], EXPR$6=[$5], EXPR$7=[CAST(/(CastHigh(CASE(=($7, 0), null, $6)), $7)):ANY]) : rowType = RecordType(ANY EXPR$0, ANY EXPR$1, BIGINT EXPR$2, ANY EXPR$3, BIGINT EXPR$4, ANY EXPR$5, ANY EXPR$6, ANY EXPR$7): rowcount = 29.5, cumulative cost = {800.5 rows, 13809.78454409817 cpu, 0.0 io, 0.0 network, 5852.799999999999 memory}, id = 2047
> 00-03          MergeJoin(condition=[IS NOT DISTINCT FROM($0, $10)], joinType=[inner]) : rowType = RecordType(ANY EXPR$0, ANY $f1, BIGINT $f2, ANY EXPR$3, ANY EXPR$5, ANY EXPR$6, ANY $f6, BIGINT $f7, ANY EXPR$00, BIGINT EXPR$4, ANY EXPR$01, BIGINT EXPR$2): rowcount = 29.5, cumulative cost = {771.0 rows, 13777.78454409817 cpu, 0.0 io, 0.0 network, 5852.799999999999 memory}, id = 2046
> 00-05            MergeJoin(condition=[IS NOT DISTINCT FROM($0, $8)], joinType=[inner]) : rowType = RecordType(ANY EXPR$0, ANY $f1, BIGINT $f2, ANY EXPR$3, ANY EXPR$5, ANY EXPR$6, ANY $f6, BIGINT $f7, ANY EXPR$00, BIGINT EXPR$4): rowcount = 29.5, cumulative cost = {491.70000000000005 rows, 10177.344151873782 cpu, 0.0 io, 0.0 network, 4814.4 memory}, id = 2038
> 00-08              StreamAgg(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)], EXPR$3=[MAX($3)], EXPR$5=[MAX($5)], EXPR$6=[MIN($6)], agg#5=[$SUM0($7)], agg#6=[COUNT($7)]) : rowType = RecordType(ANY EXPR$0, ANY $f1, BIGINT $f2, ANY EXPR$3, ANY EXPR$5, ANY EXPR$6, ANY $f6, BIGINT $f7): rowcount = 5.9, cumulative cost = {236.0 rows, 6671.303759649394 cpu, 0.0 io, 0.0 network, 3776.0 memory}, id = 2030
> 00-11                Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY EXPR$0, ANY $f1, ANY $f2, ANY $f3, ANY $f4, ANY $f5, ANY $f6, ANY $f7): rowcount = 59.0, cumulative cost = {177.0 rows, 1479.3037596493946 cpu, 0.0 io, 0.0 network, 3776.0 memory}, id = 2029
> 00-14                  Project(EXPR$0=[ITEM($0, 4)], $f1=[ITEM($0, 0)], $f2=[ITEM($0, 1)], $f3=[ITEM($0, 2)], $f4=[ITEM($0, 3)], $f5=[ITEM($0, 5)], $f6=[ITEM($0, 6)], $f7=[ITEM($0, 7)]) : rowType = RecordType(ANY EXPR$0, ANY $f1, ANY $f2, ANY $f3, ANY $f4, ANY $f5, ANY $f6, ANY $f7): rowcount = 59.0, cumulative cost = {118.0 rows, 91.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2028
> 00-17                    Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/conftest.csv, numFiles=1, columns=[`columns`[4], `columns`[0], `columns`[1], `columns`[2], `columns`[3], `columns`[5], `columns`[6], `columns`[7]], files=[maprfs:/tmp/conftest.csv]]]) : rowType = RecordType(ANY columns): rowcount = 59.0, cumulative cost = {59.0 rows, 59.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2027
> 00-07              Project(EXPR$00=[$0], EXPR$4=[$1]) : rowType = RecordType(ANY EXPR$00, BIGINT EXPR$4): rowcount = 1.0, cumulative cost = {248.8 rows, 3478.440392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2037
> 00-10                StreamAgg(group=[{0}], EXPR$4=[COUNT($1)]) : rowType = RecordType(ANY EXPR$0, BIGINT EXPR$4): rowcount = 1.0, cumulative cost = {247.8 rows, 3470.440392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2036
> 00-13                  Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY EXPR$0, ANY $f4): rowcount = 5.9, cumulative cost = {241.9 rows, 3376.040392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2035
> 00-16                    StreamAgg(group=[{0, 1}]) : rowType = RecordType(ANY EXPR$0, ANY $f4): rowcount = 5.9, cumulative cost = {236.0 rows, 3315.607519298789 cpu, 0.0 io, 0.0 network, 944.0 memory}, id = 2034
> 00-19                      Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) : rowType = RecordType(ANY EXPR$0, ANY $f4): rowcount = 59.0, cumulative cost = {177.0 rows, 2843.607519298789 cpu, 0.0 io, 0.0 network, 944.0 memory}, id = 2033
> 00-21                        Project(EXPR$0=[ITEM($0, 4)], $f4=[ITEM($0, 3)]) : rowType = RecordType(ANY EXPR$0, ANY $f4): rowcount = 59.0, cumulative cost = {118.0 rows, 67.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2032
> 00-22                          Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/conftest.csv, numFiles=1, columns=[`columns`[4], `columns`[3]], files=[maprfs:/tmp/conftest.csv]]]) : rowType = RecordType(ANY columns): rowcount = 59.0, cumulative cost = {59.0 rows, 59.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2031
> 00-04            Project(EXPR$01=[$0], EXPR$2=[$1]) : rowType = RecordType(ANY EXPR$01, BIGINT EXPR$2): rowcount = 1.0, cumulative cost = {248.8 rows, 3478.440392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2045
> 00-06              StreamAgg(group=[{0}], EXPR$2=[COUNT($1)]) : rowType = RecordType(ANY EXPR$0, BIGINT EXPR$2): rowcount = 1.0, cumulative cost = {247.8 rows, 3470.440392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2044
> 00-09                Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY EXPR$0, ANY $f2): rowcount = 5.9, cumulative cost = {241.9 rows, 3376.040392224387 cpu, 0.0 io, 0.0 network, 1038.4 memory}, id = 2043
> 00-12                  StreamAgg(group=[{0, 1}]) : rowType = RecordType(ANY EXPR$0, ANY $f2): rowcount = 5.9, cumulative cost = {236.0 rows, 3315.607519298789 cpu, 0.0 io, 0.0 network, 944.0 memory}, id = 2042
> 00-15                    Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) : rowType = RecordType(ANY EXPR$0, ANY $f2): rowcount = 59.0, cumulative cost = {177.0 rows, 2843.607519298789 cpu, 0.0 io, 0.0 network, 944.0 memory}, id = 2041
> 00-18                      Project(EXPR$0=[ITEM($0, 4)], $f2=[ITEM($0, 1)]) : rowType = RecordType(ANY EXPR$0, ANY $f2): rowcount = 59.0, cumulative cost = {118.0 rows, 67.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2040
> 00-20                        Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/conftest.csv, numFiles=1, columns=[`columns`[4], `columns`[1]], files=[maprfs:/tmp/conftest.csv]]]) : rowType = RecordType(ANY columns): rowcount = 59.0, cumulative cost = {59.0 rows, 59.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 2039
> {code}
> Query over view returns correct results
> {code}
> 0: jdbc:drill:> create view v8(col_decimal, col_int, col_bigint, col_char, col_varchar, col_timestmp, col_float, col_double)
> . . . . . . . > as
> . . . . . . . > select cast(columns[0] as decimal(11,3)),
> . . . . . . . > cast(columns[1] as int),
> . . . . . . . > cast(columns[2] as bigint),
> . . . . . . . > cast(columns[3] as char(9)),
> . . . . . . . > cast(columns[4] as varchar(256)),
> . . . . . . . > cast(columns[5] as timestamp),
> . . . . . . . > cast(columns[6] as float),
> . . . . . . . > cast(columns[7] as double)
> . . . . . . . > from `conftest.csv`;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | View 'v8' created successfully in 'dfs.tmp' schema |
> +------------+------------+
> 1 row selected (0.102 seconds)
> 0: jdbc:drill:> select col_varchar, sum(col_decimal), count(distinct col_int)
> . . . . . . . > from v8
> . . . . . . . > group by col_varchar;
> 209 rows selected (0.713 seconds)
> {code}
> 1. The original query over the VIEW returns correct results.
> 2. The original query over the CSV file, with explicit casts in the select list, also returns correct results.
> So the problem only appears when the aggregate + GROUP BY query is run directly over the CSV file without explicitly casting the columns (a workaround sketch follows the last code block below).
> {code}
> 0: jdbc:drill:> select col_varchar, sum(col_decimal), count(distinct col_int), max(col_bigint), count(distinct col_char), max(col_timestmp), min(col_float), avg(col_double)
> . . . . . . . > from v8
> . . . . . . . > group by col_varchar;
> +-------------+------------+------------+------------+------------+------------+------------+------------+
> | col_varchar |   EXPR$1   |   EXPR$2   |   EXPR$3   |   EXPR$4   |   EXPR$5   |   EXPR$6   |   EXPR$7   |
> +-------------+------------+------------+------------+------------+------------+------------+------------+
> 209 rows selected (1.286 seconds)
> The below query returns correct results
> 0: jdbc:drill:> select columns[4], sum(cast(columns[0] as decimal(11,3))), count(distinct cast(columns[1] as int)), max(cast(columns[2] as bigint)), count(distinct cast(columns[3] as char(9))), max(cast(columns[5] as timestamp)), min(cast(columns[6] as float)), avg(cast(columns[7] as double))
> . . . . . . . > from `conftest.csv`
> . . . . . . . > group by columns[4];
> 209 rows selected (1.129 seconds)
> {code}
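> As a further illustration of that workaround (a sketch only, not from the original report; it assumes the same conftest.csv column layout), the casts can also be pushed into a derived table so that the aggregates and the GROUP BY only ever see typed columns:
> {code}
> 0: jdbc:drill:> select t.col_varchar, sum(t.col_decimal), count(distinct t.col_int), max(t.col_bigint), count(distinct t.col_char), max(t.col_timestmp), min(t.col_float), avg(t.col_double)
> . . . . . . . > from (select cast(columns[0] as decimal(11,3)) as col_decimal,
> . . . . . . . >              cast(columns[1] as int)           as col_int,
> . . . . . . . >              cast(columns[2] as bigint)        as col_bigint,
> . . . . . . . >              cast(columns[3] as char(9))       as col_char,
> . . . . . . . >              cast(columns[4] as varchar(256))  as col_varchar,
> . . . . . . . >              cast(columns[5] as timestamp)     as col_timestmp,
> . . . . . . . >              cast(columns[6] as float)         as col_float,
> . . . . . . . >              cast(columns[7] as double)        as col_double
> . . . . . . . >       from `conftest.csv`) t
> . . . . . . . > group by t.col_varchar;
> {code}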