drill-dev mailing list archives

From Stefán Baxter (JIRA) <j...@apache.org>
Subject [jira] [Created] (DRILL-4056) Avro deserialization
Date Tue, 10 Nov 2015 14:40:11 GMT
Stefán Baxter created DRILL-4056:
------------------------------------

             Summary: Avro deserialization 
                 Key: DRILL-4056
                 URL: https://issues.apache.org/jira/browse/DRILL-4056
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Other
    Affects Versions: 1.3.0
         Environment: Ubuntu 15.04 - Oracle Java
            Reporter: Stefán Baxter
             Fix For: 1.3.0


I have an Avro file whose records follow this data/schema:
{"field":"some", "classification":{"variant":"Gæst"}}

When I select 10 rows from this file I get:
+---------------------+
|       EXPR$0        |
+---------------------+
| Gæst                |
| Voksen              |
| Voksen              |
| Invitation KIF KBH  |
| Invitation KIF KBH  |
| Ordinarie pris KBH  |
| Ordinarie pris KBH  |
| Biljetter 200 krBH  |
| Biljetter 200 krBH  |
| Biljetter 200 krBH  |
+---------------------+

The bug is that field values are deserialized incorrectly: when a row's value is shorter than
the previous row's value, the trailing characters of the previous value are retained.

The SQL query:
"select s.classification.variant variant from dfs.<some> as s limit 10;"

For example, "Ordinarie pris" becomes "Ordinarie pris KBH" because the previous row had the
longer value "Invitation KIF KBH", whose last four characters (" KBH") are left over.
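
The symptom described above looks like what happens when a byte buffer is reused across rows
and the reader ignores the length of the current value. The report does not identify the cause,
so the following is only a minimal sketch of that general failure mode, not Drill or Avro code.
The names ReusedBuffer and read_rows are hypothetical and exist only for this illustration.

```python
class ReusedBuffer:
    """Hypothetical stand-in for a reader that reuses one byte buffer across rows."""

    def __init__(self):
        self.data = bytearray()
        self.length = 0  # number of valid bytes for the current row

    def load(self, value):
        raw = value.encode("utf-8")
        # Grow the buffer only when needed; bytes past len(raw) keep their old content.
        if len(raw) > len(self.data):
            self.data.extend(bytes(len(raw) - len(self.data)))
        self.data[:len(raw)] = raw
        self.length = len(raw)

    def read(self, respect_length):
        # The faulty path decodes the whole backing buffer, stale tail included.
        end = self.length if respect_length else len(self.data)
        return bytes(self.data[:end]).decode("utf-8", errors="replace")


def read_rows(rows, respect_length):
    """Push each row through a single shared buffer and collect what is read back."""
    buf = ReusedBuffer()
    out = []
    for row in rows:
        buf.load(row)
        out.append(buf.read(respect_length))
    return out


rows = ["Invitation KIF KBH", "Ordinarie pris", "Biljetter 200 kr"]
print(read_rows(rows, respect_length=False))
# ['Invitation KIF KBH', 'Ordinarie pris KBH', 'Biljetter 200 krBH']
print(read_rows(rows, respect_length=True))
# ['Invitation KIF KBH', 'Ordinarie pris', 'Biljetter 200 kr']
```

With respect_length=False the sketch reproduces the corrupted values shown in the result table;
bounding the read by the current value's length returns the rows unchanged.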



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)