avro-user mailing list archives

From "Pritchard, Charles X. -ND" <Charles.X.Pritchard....@disney.com>
Subject Re: Schema Coersion
Date Fri, 27 Jun 2014 18:55:23 GMT
On Jun 27, 2014, at 11:24 AM, Doug Cutting <cutting@apache.org> wrote:

> On Fri, Jun 27, 2014 at 9:40 AM, Pritchard, Charles X. -ND
> <Charles.X.Pritchard.-ND@disney.com> wrote:
>> is there a means to coerce data written as byte[] into String (and the other way
>> around)?
> (Int-to-float and long-to-float are both lossy promotions that we
> already permit.)
> In the meantime, the standard Java APIs won't permit this conversion.
> I posted a patch that adds support for promotion from string to bytes.
> Would this be useful?  If so, please add comments to the issue.
> https://issues.apache.org/jira/browse/AVRO-1533

Thanks, I will post comments. One of the use cases has to do with Hive: there are times
when we want to accept binary data but would like to view it as strings for ease of use
with the rest of the stack, as most or all of the data would be UTF-8.

In my experience, in this particular (admittedly strange) situation, converting from binary
to UTF-8 would not throw an error; instead, a substitute character would be used in place of
any bytes that are not valid UTF-8. I know there is particular vocabulary for it, but I can’t
remember off the top of my head what that vocab is.

Anyone converting binary to UTF-8 should already be aware that the operation can be lossy.
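
To make the lenient behaviour concrete, here is a minimal sketch in plain Java, not anything from Avro or from the AVRO-1533 patch. It relies only on standard JDK calls: `new String(bytes, StandardCharsets.UTF_8)` replaces malformed input with U+FFFD (the Unicode replacement character), while a `CharsetDecoder` configured with `CodingErrorAction.REPORT` throws instead. The class and method names (`LenientUtf8`, `decodeLenient`, `decodeStrict`) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the Avro API.
class LenientUtf8 {
    // Lenient: bytes that are not valid UTF-8 become U+FFFD rather than an error.
    static String decodeLenient(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Strict: throws CharacterCodingException if the bytes are not well-formed UTF-8.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // "ok" followed by 0xFF, a byte that can never appear in valid UTF-8.
        byte[] mixed = {'o', 'k', (byte) 0xFF};

        String lenient = decodeLenient(mixed);
        System.out.println("ends with U+FFFD: " + lenient.endsWith("\uFFFD"));

        // Re-encoding does not give back the original bytes, so the conversion is lossy.
        byte[] roundTrip = lenient.getBytes(StandardCharsets.UTF_8);
        System.out.println("round trip equal: " + Arrays.equals(mixed, roundTrip));

        try {
            decodeStrict(mixed);
        } catch (CharacterCodingException e) {
            System.out.println("strict decode rejected the input");
        }
    }
}
```

The lenient path is the one that matches the Hive use case: mostly-UTF-8 data stays readable, and the occasional invalid byte is replaced rather than failing the whole record.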
