crunch-user mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: Mixing Writables and Avros PTypes
Date Wed, 20 Mar 2013 00:43:06 GMT
Hey Micah,

Agreed that this isn't an ideal state of affairs. I think the trick for
PTables should be to have the key type determine the value type, with
derived types that either convert the PType automatically (via a
PType.as(...) method or some such thing) or wrap the value: convert the
Avro type to a BytesWritable when the key is a Writable, or pass the
Writable type through the Avros.writables functionality when the key is
an Avro type. I don't think it would be super-complicated to try it out
and see if there are any edge cases here that I'm missing.
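
As a rough illustration of the "key type determines the value type" idea, here is a minimal Java sketch. It deliberately does not use the Crunch API: `SketchType`, `SketchTableType`, `Family`, `strictTableOf`, `tableOf`, `uuids`, and `records` are all hypothetical names invented for this example, and `as(...)` only models the PType.as(...) suggestion, which is an idea from this thread rather than an existing method. A String stands in for an Avro record.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.function.Function;

// Stand-alone model of the idea; none of these types are Crunch classes.
class MixedPTypeSketch {

    // Stand-in for the two type families discussed in the thread.
    enum Family { WRITABLE, AVRO }

    // Hypothetical stand-in for a PType: a family tag plus an encoder to bytes.
    static final class SketchType<T> {
        final Family family;
        final Function<T, byte[]> encoder;

        SketchType(Family family, Function<T, byte[]> encoder) {
            this.family = family;
            this.encoder = encoder;
        }

        // Hypothetical analogue of the PType.as(...) idea: re-tag this type for
        // another family, carrying its encoded bytes as an opaque payload
        // (roughly the role a BytesWritable wrapper would play).
        SketchType<T> as(Family target) {
            if (family == target) {
                return this;
            }
            return new SketchType<>(target, encoder);
        }
    }

    static final class SketchTableType<K, V> {
        final SketchType<K> key;
        final SketchType<V> value;

        SketchTableType(SketchType<K> key, SketchType<V> value) {
            this.key = key;
            this.value = value;
        }
    }

    // Models the behaviour reported in the thread: mixed families fail at runtime.
    static <K, V> SketchTableType<K, V> strictTableOf(SketchType<K> key, SketchType<V> value) {
        if (key.family != value.family) {
            throw new IllegalArgumentException(
                "Key is " + key.family + " but value is " + value.family);
        }
        return new SketchTableType<>(key, value);
    }

    // Models the proposal: the key's family decides, and the value is converted to match.
    static <K, V> SketchTableType<K, V> tableOf(SketchType<K> key, SketchType<V> value) {
        return new SketchTableType<>(key, value.as(key.family));
    }

    // A Writable-family key type for UUIDs (16 bytes: high bits, then low bits).
    static SketchType<UUID> uuids() {
        return new SketchType<>(Family.WRITABLE, id -> ByteBuffer.allocate(16)
            .putLong(id.getMostSignificantBits())
            .putLong(id.getLeastSignificantBits())
            .array());
    }

    // An Avro-family value type; a String stands in for an Avro record here.
    static SketchType<String> records() {
        return new SketchType<>(Family.AVRO, s -> s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        SketchTableType<UUID, String> table = tableOf(uuids(), records());
        System.out.println("value family after conversion: " + table.value.family);
    }
}
```

The point of the sketch is only that the conversion can live inside the table factory, so callers hand over whatever types a data provider exposes. Whether a real implementation can convert every derived type this cheaply is exactly the kind of edge case that would need checking.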

J


On Tue, Mar 19, 2013 at 4:31 PM, Micah Whitacre <mkwhitacre@gmail.com> wrote:

> Out of curiosity, why can we not mix PTypes between Writables and
> Avros?  The situation I'm encountering is one where someone wants
> to create a PTable<UUID, SomeAvro>.  UUID is hidden behind a
> UUIDWritable (+ MapFn implementations) and exposed as a PType using
> Writables.derived(...).  SomeAvro's PType is exposed using
> Avros.records(SomeAvro.class).
>
> If consumers want to create the PTable<UUID, SomeAvro> they can either
> do Avros.tableOf(UUID.PTYPE, SomeAvro.PTYPE) or
> Writables.tableOf(UUID.PTYPE, SomeAvro.PTYPE), but either call will throw
> an IllegalArgumentException because in both cases one of the arguments
> doesn't extend the correct type.
>
> Part of the confusion is that the method signatures for both just take
> PType, so nothing documents or restricts the allowed types until runtime.
> Is there a way that we might be missing to make both flavors of PTypes
> play nicely and be able to create tables, pairs, etc. easily?
>
> I was kind of hoping most data providers could hide whether they are
> Avro, Writable, or whatever by just exposing PTypes for consumers to
> use, but now it seems consumers might need to implement twice as many
> PTypes and wrappers for a mixed pipeline to work well.
>
> Thanks,
> Micah
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
