drill-issues mailing list archives

From "Vitalii Diravka (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (DRILL-3577) Counting nested fields on CTAS-created-parquet file/s reports inaccurate results
Date Mon, 18 Apr 2016 17:11:25 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15246034#comment-15246034 ]

Vitalii Diravka edited comment on DRILL-3577 at 4/18/16 5:10 PM:
-----------------------------------------------------------------

1. Partially fixed in [-DRILL-3551-|https://issues.apache.org/jira/browse/DRILL-3551]
{code}
0: jdbc:drill:zk=local> select count(t.others.other) from dfs.`tmp`.`tp` t;
+---------+
| EXPR$0  |
+---------+
| 20203   |
+---------+
1 row selected (0.165 seconds)
{code}
2. {code}
0: jdbc:drill:zk=local> select count(t.others.additional) from dfs.`tmp`.`tp` t;
Error: UNSUPPORTED_OPERATION ERROR: Streaming aggregate does not support schema changes
{code}
This error can be resolved once [DRILL-4614|https://issues.apache.org/jira/browse/DRILL-4614] is fixed.


was (Author: vitalii):
1. Partially fixed in [-DRILL-3551-|https://issues.apache.org/jira/browse/DRILL-3551]
{code}
0: jdbc:drill:zk=local> select count(t.others.other) from dfs.`tmp`.`tp` t;
+---------+
| EXPR$0  |
+---------+
| 20203   |
+---------+
1 row selected (0.165 seconds)
{code}
2. {code}
0: jdbc:drill:zk=local> select count(t.others.additional) from dfs.`tmp`.`tp` t;
Error: UNSUPPORTED_OPERATION ERROR: Streaming aggregate does not support schema changes
{code}
This error can be resolved after fixing the [DRILL-3551|https://issues.apache.org/jira/browse/DRILL-3551]

> Counting nested fields on CTAS-created-parquet file/s reports inaccurate results
> --------------------------------------------------------------------------------
>
>                 Key: DRILL-3577
>                 URL: https://issues.apache.org/jira/browse/DRILL-3577
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill
>    Affects Versions: 1.1.0
>            Reporter: Hanifi Gunes
>            Assignee: Vitalii Diravka
>            Priority: Critical
>             Fix For: 1.7.0
>
>
> I have not tried this at a smaller scale, nor on the JSON file directly, but the following seems to reproduce the issue:
> 1. Create an input file as follows
> 20K rows with the following - 
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes"}}
> 200 rows with the following - 
> {"some":"yes","others":{"other":"true","all":"false","sometimes":"yes","additional":"last entries only"}}
> 2. CTAS as follows
> {code:sql}
> CREATE TABLE dfs.`tmp`.`tp` as select * from dfs.`data.json` t
> {code}
> This should report
> {code}
> Fragment Number of records written
> 0_0	20200
> {code}
> 3. Count on nested fields via
> {code:sql}
> select count(t.others.additional) from dfs.`tmp`.`tp` t
> OR
> select count(t.others.other) from dfs.`tmp`.`tp` t
> {code}
> reports a count of 0, as follows
> {code}
> EXPR$0
> 0
> {code}
> While
> {code:sql}
> select count(t.`some`) from dfs.`tmp`.`tp` t where t.others.additional is not null
> {code}
> reports the expected 200 rows
> {code}
> EXPR$0
> 200
> {code}
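
For anyone re-running the reproduction steps quoted above, the input file can be generated with a short script. This is a minimal sketch, not part of the original report: it assumes "20K rows" means exactly 20,000 (consistent with the 20200 records written), and the helper names (build_rows, write_json_lines, count_non_null) and the output location are hypothetical. It also computes the counts one would expect from count(expr) if it counts only rows where the nested field is present and non-null.

```python
import json
import os
import tempfile

# Row shape taken from the issue description.
BASE = {"some": "yes",
        "others": {"other": "true", "all": "false", "sometimes": "yes"}}


def build_rows(n_base=20000, n_extra=200):
    """Return n_base plain rows followed by n_extra rows with others.additional.

    Assumption: "20K rows" in the report means exactly 20,000.
    """
    extra = {"some": "yes",
             "others": dict(BASE["others"], additional="last entries only")}
    return [BASE] * n_base + [extra] * n_extra


def write_json_lines(rows, path):
    """Write one compact JSON object per line."""
    with open(path, "w") as f:
        for row in rows:
            f.write(json.dumps(row, separators=(",", ":")) + "\n")


def count_non_null(rows, *keys):
    """Count rows whose nested field (keys = path) is present and not None."""
    total = 0
    for row in rows:
        value = row
        for key in keys:
            value = value.get(key) if isinstance(value, dict) else None
            if value is None:
                break
        if value is not None:
            total += 1
    return total


if __name__ == "__main__":
    rows = build_rows()
    # Hypothetical location; point the CTAS source at wherever the file lands.
    path = os.path.join(tempfile.gettempdir(), "data.json")
    write_json_lines(rows, path)
    print("others.other:", count_non_null(rows, "others", "other"))
    print("others.additional:", count_non_null(rows, "others", "additional"))
```

Under these assumptions the script reports 20200 for others.other and 200 for others.additional, which gives a baseline to compare against the Drill query results.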



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
