drill-dev mailing list archives

From Stefán Baxter <ste...@activitystream.com>
Subject Re: [jira] [Created] (DRILL-4056) Avro deserialization
Date Wed, 11 Nov 2015 10:47:14 GMT
Hi,

I have a) confirmed this behavior with more data and the latest 1.3 and b)
submitted a test file to the Jira ticket.

This affects all string-based data fetched from Avro files (at least for me).

I think this should be considered a blocker for 1.3.

Regards,
 -Stefán


On Tue, Nov 10, 2015 at 2:40 PM, Stefán Baxter (JIRA) <jira@apache.org>
wrote:

> Stefán Baxter created DRILL-4056:
> ------------------------------------
>
>              Summary: Avro deserialization
>                  Key: DRILL-4056
>                  URL: https://issues.apache.org/jira/browse/DRILL-4056
>              Project: Apache Drill
>           Issue Type: Bug
>           Components: Storage - Other
>     Affects Versions: 1.3.0
>          Environment: Ubuntu 15.04 - Oracle Java
>             Reporter: Stefán Baxter
>              Fix For: 1.3.0
>
>
> I have an Avro file that supports the following data/schema:
> {"field":"some", "classification":{"variant":"Gæst"}}
>
> When I select 10 rows from this file I get:
> +---------------------+
> |       EXPR$0        |
> +---------------------+
> | Gæst                |
> | Voksen              |
> | Voksen              |
> | Invitation KIF KBH  |
> | Invitation KIF KBH  |
> | Ordinarie pris KBH  |
> | Ordinarie pris KBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> | Biljetter 200 krBH  |
> +---------------------+
>
> The bug is that the field values are incorrectly de-serialized and the
> value from the previous row is retained if the subsequent row is shorter.
>
> The sql query:
> "select s.classification.variant variant from dfs.<some> as s limit 10;"
>
> That is how "Ordinarie pris" becomes "Ordinarie pris KBH": the
> previous row had the value "Invitation KIF KBH".
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>
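
The symptom quoted above (the tail of a longer previous value showing up after a shorter one) is what you would see if a reader recycled one byte buffer across rows and the consumer decoded the whole buffer rather than only the valid bytes. The ticket does not confirm that this is the cause, so the following is only a minimal sketch of that assumed mechanism. It uses no Drill or Avro code; ReusedBufferDemo, ReusableText and readAll are hypothetical names made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReusedBufferDemo {

    // Hypothetical stand-in for a reader that recycles one byte buffer across rows.
    static final class ReusableText {
        private byte[] bytes = new byte[0];
        private int length; // number of valid bytes for the current row

        void set(String value) {
            byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
            if (encoded.length > bytes.length) {
                // Grow only when the new value does not fit.
                bytes = Arrays.copyOf(encoded, encoded.length);
            } else {
                // Shorter value: overwrite the front, stale bytes stay behind it.
                System.arraycopy(encoded, 0, bytes, 0, encoded.length);
            }
            length = encoded.length;
        }

        // Assumed bug: ignores the valid length and decodes the whole buffer.
        String decodeWholeBuffer() {
            return new String(bytes, StandardCharsets.UTF_8);
        }

        // Assumed fix: decodes only the bytes that belong to the current row.
        String decodeValidBytes() {
            return new String(bytes, 0, length, StandardCharsets.UTF_8);
        }
    }

    // Pushes every row through one shared buffer and decodes it either way.
    static String[] readAll(String[] rows, boolean honourLength) {
        ReusableText text = new ReusableText();
        String[] out = new String[rows.length];
        for (int i = 0; i < rows.length; i++) {
            text.set(rows[i]);
            out[i] = honourLength ? text.decodeValidBytes() : text.decodeWholeBuffer();
        }
        return out;
    }

    public static void main(String[] args) {
        String[] rows = {"Invitation KIF KBH", "Ordinarie pris", "Biljetter 200 kr"};
        // prints [Invitation KIF KBH, Ordinarie pris KBH, Biljetter 200 krBH]
        System.out.println(Arrays.toString(readAll(rows, false)));
        // prints [Invitation KIF KBH, Ordinarie pris, Biljetter 200 kr]
        System.out.println(Arrays.toString(readAll(rows, true)));
    }
}
```

With the three values taken from the report, decoding the whole buffer reproduces both "Ordinarie pris KBH" and "Biljetter 200 krBH" from the table, while honouring the valid length returns the values as written.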
