drill-issues mailing list archives

From "Jacques Nadeau (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-2869) Incorrect data when we have fields missing in some of the files - another case
Date Tue, 05 May 2015 13:38:02 GMT

     [ https://issues.apache.org/jira/browse/DRILL-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau updated DRILL-2869:
----------------------------------
    Fix Version/s: 1.2.0

> Incorrect data when we have fields missing in some of the files - another case
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-2869
>                 URL: https://issues.apache.org/jira/browse/DRILL-2869
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators, Storage - JSON, Storage - Parquet
>            Reporter: Rahul Challapalli
>            Assignee: Hanifi Gunes
>            Priority: Critical
>              Labels: erro
>             Fix For: 1.2.0
>
>
> git.commit.id.abbrev=5cd36c5
> Data File1 : a.json
> {code}
> { "c1" : 1, "m1" : {"m2" : {"m3" : {"c2" : 5} } } }
> { "c1" : 2, "m1" : {"m2" : {"m3" : {"c2" : 6} } } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File2 : b.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> Data File3 : c.json
> {code}
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> { "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }
> {code}
> The query below returns incorrect results: it yields empty maps where it should not. The issue occurs with the JSON files above and also with equivalent Parquet files.
> {code}
> select t.m1.m2 from `delme_repro` as `t`;
> +------------+
> |   EXPR$0   |
> +------------+
> | {"c2":5}   |
> | {"c2":5}   |
> | {"c2":5}   |
> | {"c2":5}   |
> | {"c2":5}   |
> | {"c2":5}   |
> | {}         |
> | {}         |
> | {"c2":5}   |
> +------------+
> {code}
> However, if I run the same query against a.json alone, I get the correct output:
> {code}
> select t.m1.m2 from `delme_repro/a.json` as `t`;
> +------------+
> |   EXPR$0   |
> +------------+
> | {"m3":{"c2":5}} |
> | {"m3":{"c2":6}} |
> | {"m3":{},"c2":5} |
> +------------+
> 3 rows selected (0.113 seconds)
> {code}
> Let me know if you have any questions.
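
For anyone wanting to recreate the input, here is a minimal Python sketch that writes the three files from the report into a `delme_repro` directory and evaluates `m1.m2` for each record using plain JSON semantics. The helper names `write_repro` and `select_m1_m2` are hypothetical and not part of Drill. The sketch reads files in name order, which is an assumption, since the report does not say in what order Drill scans them. It also does not model Drill's schema merging, so the third record of a.json appears as `{"c2":5}` rather than the `{"m3":{},"c2":5}` shown in the single-file output above. The point is only that no record has an empty `m2` map.

```python
import json
import tempfile
from pathlib import Path

# Records copied from the report; file names match a.json, b.json, c.json.
NESTED = '{ "c1" : %d, "m1" : {"m2" : {"m3" : {"c2" : %d} } } }'
FLAT = '{ "c1" : 3, "m1" : {"m2" : {"c2" : 5} } }'

FILES = {
    "a.json": [NESTED % (1, 5), NESTED % (2, 6), FLAT],
    "b.json": [FLAT] * 3,
    "c.json": [FLAT] * 3,
}


def write_repro(directory: Path) -> None:
    """Write the three newline-delimited JSON files into `directory`."""
    directory.mkdir(parents=True, exist_ok=True)
    for name, lines in FILES.items():
        (directory / name).write_text("\n".join(lines) + "\n")


def select_m1_m2(directory: Path) -> list:
    """Return m1.m2 for every record, reading files in name order (assumed)."""
    rows = []
    for path in sorted(directory.glob("*.json")):
        for line in path.read_text().splitlines():
            if line.strip():
                rows.append(json.loads(line)["m1"]["m2"])
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        repro = Path(tmp) / "delme_repro"
        write_repro(repro)
        # Print one compact JSON value per record.
        for row in select_m1_m2(repro):
            print(json.dumps(row, separators=(",", ":")))
```

To query the files with Drill, `write_repro` can be pointed at a directory inside a configured workspace instead of a temporary one.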



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
