avro-user mailing list archives

From "Pritchard, Charles X. -ND" <Charles.X.Pritchard....@disney.com>
Subject Re: Handling field names when serializing and deserializing JSON
Date Tue, 14 Jan 2014 22:45:56 GMT

On Jan 14, 2014, at 2:32 PM, Doug Cutting <cutting@apache.org> wrote:

> On Tue, Jan 14, 2014 at 2:06 PM, Pritchard, Charles X. -ND
> <Charles.X.Pritchard.-ND@disney.com> wrote:
>> Do I just pop the “original” field name in as an alias and use the “safe”
>> (alphanumeric+underscore) one as the primary name?
> 
> If you have Avro data with names that are illegal in Hive then you could
> provide Hive with a safely-named schema to use when reading these that
> has the original name as an alias.  Conversely, if Hive writes data
> with the "safe" name that you want to read as data using the original
> name, then you'd read with the safe name in an alias.  Does that make
> sense?

Yes; so we just swap the primary name and the alias between Hive and other sources.
Really appreciate the clarification there.
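
To make the swap concrete, here is a minimal sketch that only builds the two reader schemas as plain JSON; it does not call any Avro library. The record name "Event", the string field type, and the helper names are assumptions for illustration, and whether a given Avro implementation accepts "X-Something" as a primary field name depends on its name validation.

```python
import json
import re


def safe_name(original):
    """Replace every character outside [A-Za-z0-9_] with an underscore."""
    name = re.sub(r"[^A-Za-z0-9_]", "_", original)
    # Avro names may not start with a digit (or be empty), so prefix "_".
    if not name or name[0].isdigit():
        name = "_" + name
    return name


def field(name, alias, avro_type="string"):
    """Build one field entry with a primary name and a single alias."""
    return {"name": name, "type": avro_type, "aliases": [alias]}


def record_schema(fields):
    # "Event" is a hypothetical record name used only for this sketch.
    return {"type": "record", "name": "Event", "fields": fields}


original = "X-Something"
safe = safe_name(original)  # "X_Something"

# Reader schema handed to Hive: safe primary name, original name as alias.
hive_reader = record_schema([field(safe, original)])

# Reader schema for data Hive wrote: original primary name, safe name as alias.
other_reader = record_schema([field(original, safe)])

print(json.dumps(hive_reader, indent=2))
print(json.dumps(other_reader, indent=2))
```

The two schemas differ only in which of the two names is primary and which is the alias, which is the swap described above.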

One common place this comes up is hyphenated names, such as “X-Something”, in some
JSON schemas.

Thanks for letting me know how to handle this.

On that same topic, though: if/when someone does something really awful, like using a dot
in the key name, is that still going to work out fine with record.get() syntax?

e.g.: { "key": "val", "dotted.key": "val" }

I know that in the context of Avro aliases, the dot has special semantics.

(I hope I’m not being too obtuse).
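
For reference, a small sketch of why the dot is awkward. It checks keys against the name rule in the Avro specification (start with [A-Za-z_], then only [A-Za-z0-9_]) and shows how a dotted fullname splits into namespace and name. It says nothing about what record.get() does with such a key in any particular Avro implementation; the helper names are made up for illustration.

```python
import re

# Name rule from the Avro specification: [A-Za-z_][A-Za-z0-9_]*
AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_field_name(name):
    """True if the whole string matches the Avro name rule."""
    return AVRO_NAME.fullmatch(name) is not None


def split_fullname(fullname):
    """Split a dotted fullname into (namespace, name) at the last dot."""
    namespace, _, name = fullname.rpartition(".")
    return namespace, name


keys = ["key", "dotted.key", "X-Something"]
report = {k: is_valid_field_name(k) for k in keys}
print(report)

# Read as a fullname, "dotted.key" looks like name "key" in namespace "dotted".
print(split_fullname("dotted.key"))
```

Under this rule only "key" passes, so both the hyphenated and the dotted key would need a safe primary name.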

-Charles