drill-issues mailing list archives

From "Engin Sozer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DRILL-6171) Querying avro files returns null for not null values
Date Tue, 20 Feb 2018 14:35:00 GMT

     [ https://issues.apache.org/jira/browse/DRILL-6171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Engin Sozer updated DRILL-6171:
-------------------------------
    Description: 
Querying an Avro file returns incorrect results. (We observed this with the JDBC/ODBC drivers
from MapR, but I believe the standard Drill drivers behave the same way.) For example:

We have a file (test.avro) with columns col1, col2, col3 and col4, 5000 rows in total.
In the first 4000 rows, col3 is null. In the last 1000 rows, col3 is not null. The issue is
that when we run:
{code:java}
select * from dfs.tmp.`test.avro`;
select col1, col2, col3 from dfs.tmp.`test.avro`;
{code}
col3 is returned as null for all 5000 rows. If we instead write:
{code:java}
select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;{code}
This returns the correct results (col3 is null for the first 4000 rows and not null for
the last 1000).
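
To make the report easier to reproduce, here is a minimal sketch of how a file with this
layout could be generated. It is not the file from the original report: the column types,
the record name TestRow, the sample values, and the helper names build_records and
write_avro are all assumptions made for illustration. Writing the file relies on the
third-party fastavro package.

```python
# Hypothetical schema: the report names the columns but not their types,
# so every type below is an assumption. col3 is a nullable union.
SCHEMA = {
    "type": "record",
    "name": "TestRow",
    "fields": [
        {"name": "col1", "type": "long"},
        {"name": "col2", "type": "string"},
        {"name": "col3", "type": ["null", "string"], "default": None},
        {"name": "col4", "type": "long"},
    ],
}


def build_records(total=5000, null_rows=4000):
    """Return rows where col3 is None for the first null_rows rows only."""
    records = []
    for i in range(total):
        records.append({
            "col1": i,
            "col2": f"row-{i}",
            # Null for the leading rows, populated for the trailing rows.
            "col3": None if i < null_rows else f"value-{i}",
            "col4": i * 2,
        })
    return records


def write_avro(path, records):
    """Write the rows as an Avro file using fastavro (installed separately)."""
    from fastavro import parse_schema, writer

    with open(path, "wb") as out:
        writer(out, parse_schema(SCHEMA), records)
```

Calling write_avro("test.avro", build_records()) and placing the file in the directory
backing the dfs.tmp workspace should allow the queries above to be run against a file
with the same null/not-null layout.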

  was:
Querying an avro file with a select * statement results in incorrect results. For example:

We have a file (test.avro') with columns col1, col2, col3 and col4 for a total of 5000 rows.
In the first 4000 rows, col3 is null. In the last 1000 rows, col3 is not null. The issue is
that when we write;

 
{code:java}
select * from dfs.tmp.`test.avro`;
select col1, col2, col3 from dfs.tmp.`test.avro`;
{code}
col3 is returned null for all 5000 rows. If we write:
{code:java}
select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;{code}
This returns the correct results. (col3 = null for first 4000 rows and col3= not null for
the next 1000)

    Component/s: Client - JDBC

> Querying avro files returns null for not null values
> ----------------------------------------------------
>
>                 Key: DRILL-6171
>                 URL: https://issues.apache.org/jira/browse/DRILL-6171
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Client - JDBC, Storage - Avro
>    Affects Versions: 1.10.0
>            Reporter: Engin Sozer
>            Priority: Major
>
> Querying an Avro file returns incorrect results. (We observed this with the JDBC/ODBC drivers
> from MapR, but I believe the standard Drill drivers behave the same way.) For example:
> We have a file (test.avro) with columns col1, col2, col3 and col4, 5000 rows in total.
> In the first 4000 rows, col3 is null. In the last 1000 rows, col3 is not null. The issue is
> that when we run:
> {code:java}
> select * from dfs.tmp.`test.avro`;
> select col1, col2, col3 from dfs.tmp.`test.avro`;
> {code}
> col3 is returned as null for all 5000 rows. If we instead write:
> {code:java}
> select col1, col2, COALESCE(col3, null) from dfs.tmp.`test.avro`;{code}
> This returns the correct results (col3 is null for the first 4000 rows and not null for
> the last 1000).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
