crunch-user mailing list archives

From Micah Whitacre
Subject Re: Mixing Writables and Avros PTypes
Date Tue, 26 Mar 2013 02:15:12 GMT
Is this what you were thinking of when you mentioned the method?
(Note: I realize it's not really code complete; it's just an example.)

It's similar to what you suggested for the UUID support enhancement.

Or were you really thinking the PType interface would get an "as(...)"
method that took in the target PTypeFamily (e.g. destinationFamily)?
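
To make the two ideas in this thread concrete, here is a minimal sketch in
plain Java. It deliberately does not use Crunch: Family, SketchType,
SketchTableType, tableOf and tableOfKeyFamily are hypothetical stand-ins
invented for illustration, and the as(...) method is only the proposal
discussed here, not an existing API. The sketch assumes a table factory that
rejects mixed families at runtime, and a conversion that just relabels the
family; a real conversion would also have to wrap the serialization (for
example, carrying Avro data inside a BytesWritable).

```java
import java.util.UUID;

final class MixedFamilySketch {
    // Hypothetical stand-in for a type family; not Crunch's PTypeFamily.
    enum Family { WRITABLE, AVRO }

    // Hypothetical stand-in for PType<T>: it only remembers which family it belongs to.
    static final class SketchType<T> {
        final String name;
        final Family family;

        SketchType(String name, Family family) {
            this.name = name;
            this.family = family;
        }

        // The proposed "as(...)" idea: return an equivalent type in the destination family.
        // Assumption: here we only relabel; real code would also convert the serialized form.
        SketchType<T> as(Family destinationFamily) {
            if (family == destinationFamily) {
                return this;
            }
            return new SketchType<>(name, destinationFamily);
        }
    }

    // Hypothetical stand-in for a table type holding a key type and a value type.
    static final class SketchTableType<K, V> {
        final SketchType<K> key;
        final SketchType<V> value;

        SketchTableType(SketchType<K> key, SketchType<V> value) {
            this.key = key;
            this.value = value;
        }
    }

    // Models the behaviour described in the thread: the signature accepts any type,
    // and a mismatched family is only detected at runtime.
    static <K, V> SketchTableType<K, V> tableOf(Family family, SketchType<K> key, SketchType<V> value) {
        if (key.family != family || value.family != family) {
            throw new IllegalArgumentException(
                "Expected " + family + " types but got " + key.family + " and " + value.family);
        }
        return new SketchTableType<>(key, value);
    }

    // Models the suggestion that the key type determines the value type:
    // the value is converted into the key's family before the table is built.
    static <K, V> SketchTableType<K, V> tableOfKeyFamily(SketchType<K> key, SketchType<V> value) {
        return tableOf(key.family, key, value.as(key.family));
    }

    public static void main(String[] args) {
        SketchType<UUID> uuid = new SketchType<>("uuid", Family.WRITABLE);
        // String stands in for the SomeAvro record class.
        SketchType<String> someAvro = new SketchType<>("someAvro", Family.AVRO);

        try {
            tableOf(Family.AVRO, uuid, someAvro);
        } catch (IllegalArgumentException e) {
            System.out.println("Mixed families rejected: " + e.getMessage());
        }

        SketchTableType<UUID, String> table = tableOfKeyFamily(uuid, someAvro);
        System.out.println("Value family after conversion: " + table.value.family);
    }
}
```

The point of the second factory is that a data provider could keep exposing a
single PType, and the consumer would not need to know which family it came
from when building a table.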

I could see how making the code on pastebin more generic, to at least
support Avros.toWritables(Class<?>), would be useful.

All of these seem like good enhancements to me.

On Tue, Mar 19, 2013 at 7:43 PM, Josh Wills wrote:
> Hey Micah,
> Agreed that this isn't an ideal state of affairs. I think the trick for
> PTables should be to have the key type determine the value type, and have
> derived types that would either convert the PType automatically (via the
> or some such thing), or convert the Avro type to a
> BytesWritable (for Writable keys) or the Writable type to use the
> Avros.writables functionality. I don't think it would be super-complicated
> to try it out and see if there are any edge cases here that I'm missing.
> J
> On Tue, Mar 19, 2013 at 4:31 PM, Micah Whitacre wrote:
>> Out of curiosity, why can we not mix PTypes between Writables and
>> Avros?  The situation I'm encountering is one where someone wants
>> to create a PTable<UUID, SomeAvro>.  UUID is hidden behind a
>> UUIDWritable (+ MapFn implementations) and exposed as a PType using
>> Writables.derived(...).  SomeAvro's PType is exposed using
>> Avros.records(SomeAvro.class).
>> If consumers want to create the PTable<UUID, SomeAvro> they can either
>> do Avros.tableOf(UUID.PTYPE, SomeAvro.PTYPE) or
>> Writables.tableOf(UUID.PTYPE, SomeAvro.PTYPE) but that will throw an
>> IllegalArgumentException because in both cases something doesn't
>> extend the correct type.
>> Part of the confusion is that the method signatures for both are just
>> PType and don't document or restrict the allowed types until runtime.
>> Is there a way that we might be missing to make both flavors of PTypes
>> play nicely and be able to create tables, pairs, etc easily?
>> I was kind of hoping most data providers could hide whether they are
>> Avro, Writable, or whatever by just exposing PTypes for consumers to
>> use, but now it seems consumers might need to implement twice as many
>> PTypes and wrappers for a mixed pipeline to work well.
>> Thanks,
>> Micah
> --
> Director of Data Science
> Cloudera
> Twitter: @josh_wills
