drill-dev mailing list archives

From Stefán Baxter (JIRA) <j...@apache.org>
Subject [jira] [Created] (DRILL-3533) null values in a sub-structure in Parquet returns unexpected/misleading results
Date Tue, 21 Jul 2015 18:20:04 GMT
Stefán Baxter created DRILL-3533:
------------------------------------

             Summary: null values in a sub-structure in Parquet returns unexpected/misleading results
                 Key: DRILL-3533
                 URL: https://issues.apache.org/jira/browse/DRILL-3533
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
            Reporter: Stefán Baxter
            Assignee: Jinfeng Ni
            Priority: Critical


With this minimal dataset as /tmp/test.json:
{"dimensions":{"adults":"A"}}

select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2`
from dfs.tmp.`/test.json` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| null    | a       |
+---------+---------+

With the same data written as a Parquet file:
CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;

The same query:
select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2`
from dfs.tmp.`/test/0_0_0.parquet` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+

After some more testing, it appears that this has nothing to do with the function applied
(trim, lower): any non-existent nested value is pushed aside.

select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as `field2`
from dfs.tmp.`/test/0_0_0.parquet` as p;

also returns:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+



