avro-user mailing list archives

From Sam Groth <sgr...@yahoo-inc.com>
Subject Re: is this an appropirate Avro use case?
Date Wed, 11 May 2016 16:11:48 GMT
I see two possible cases:

1) You are able to get the data producer to switch to Avro, using type int/double for the
number fields. Then they would be forced to follow the types in the schema.
2) You write a data cleansing layer to fix inconsistencies and handle schema changes. In this
case, I don't see any advantage to using Avro.
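
As an illustration of the cleansing layer in case 2, here is a minimal Python sketch that normalises the accounting-style negatives mentioned in this thread, such as (200), before the value goes to the database or into an Avro record. The function name parse_amount is hypothetical, and treating commas as thousands separators is an assumption about the incoming CSV, not something stated in the thread.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse a CSV numeric field, reading accounting-style "(200)" as -200.

    Hypothetical helper: the accepted input variations are assumptions.
    """
    # Assumption: commas are thousands separators, not decimal marks.
    s = text.strip().replace(",", "")

    # Parentheses around the whole value mark a negative number.
    negative = s.startswith("(") and s.endswith(")")
    if negative:
        s = s[1:-1].strip()

    # Normalise a Unicode minus sign or en dash to an ASCII hyphen-minus.
    s = s.replace("\u2212", "-").replace("\u2013", "-")

    try:
        value = Decimal(s)
    except InvalidOperation:
        raise ValueError(f"not a number: {text!r}")

    return -value if negative else value
```

Calling float(parse_amount("(200)")) would give a value suitable for a field declared as double. Avro itself does none of this parsing, so the step sits in front of any CSV-to-Avro conversion.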


    On Wednesday, May 11, 2016 10:49 AM, Bob Wakefield <adaryl.wakefield@hotmail.com> wrote:

 If I’ve been following properly, it sounds like while the schema change would be handled,
data cleansing would still have to be coded. I was thinking of converting from CSV to Avro
but then I’d have to convert back to CSV to shove it into the database. I’m not opposed
to doing that, I just don’t think it solves my problem with the negative numbers data type
issue unless Avro understands (200) = –200.

Adaryl "Bob" Wakefield, MBA
Mass Street Analytics, LLC
Twitter: @BobLovesData

From: kppublicmail
Sent: Wednesday, May 11, 2016 10:35 AM
To: user@avro.apache.org
Subject: Re: is this an appropirate Avro use case?

Another option is to convert the CSV file to Avro before it is consumed. Thanks.

On May 9, 2016 8:58 PM, "Sean Busbey" <busbey@cloudera.com> wrote:

 On Mon, May 9, 2016 at 12:21 PM, Koert Kuipers <koert@tresata.com> wrote:
> you cannot use avro to ensure the data comes in the format you expect (the
> negative numbers issue). you will have to parse these variations before
> converting to avro.

Unless, of course, you can get the folks sending you data to agree to
send it in Avro. If you specifically get them to send the numbers
coded as one of the number types in Avro (rather than, e.g., a string),
you'd be able to parse it the same way all of the time.
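
To make the point about number types concrete, here is a rough sketch of an Avro record schema that declares the numeric fields as int and double rather than string. The record name Transaction and the field names account_id and amount are made up for illustration. The conforms function is a hand-rolled stand-in written with the Python standard library only; it is not the Avro library's validator and covers just the two primitive types used here.

```python
import json

# Hypothetical schema: record and field names are illustrative only.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Transaction",
  "fields": [
    {"name": "account_id", "type": "int"},
    {"name": "amount", "type": "double"}
  ]
}
"""
schema = json.loads(SCHEMA_JSON)

# Mapping from the two Avro primitive types used above to Python types.
PYTHON_TYPES = {"int": int, "double": float}


def conforms(record, schema):
    """Return True if every field holds a value of its declared type.

    Simplified check for illustration; not the Avro library's validation.
    """
    for field in schema["fields"]:
        value = record.get(field["name"])
        expected = PYTHON_TYPES[field["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True
```

Under this check, {"account_id": 1, "amount": -200.0} conforms, while a record carrying the string "(200)" in amount does not, which is the sense in which a producer writing Avro is held to the schema's types.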

