crunch-dev mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Thoughts on supporting HBase 0.96
Date Wed, 16 Oct 2013 15:02:50 GMT
On Wed, Oct 16, 2013 at 4:34 PM, Josh Wills <jwills@cloudera.com> wrote:

> On Wed, Oct 16, 2013 at 12:15 AM, Gabriel Reid <gabriel.reid@gmail.com> wrote:
>
> > Wouldn't a derived PType (like in o.a.c.types.PTypes) be a better fit here?
> >
>
> That was my initial attempt, and in an ideal world, my preferred solution--
> but I haven't figured out how to make it work. The question here is: what
> do I derive a KeyValue object to? What I really want, for purposes of
> reading it/writing it to one of our HBase IO formats, is to map it to
> itself, and not some subclass of Writable. Another option might be an
> extension of WritableType to handle these special case formats-- I'll take
> a crack at getting that to work.
>

I'm sure I'm just missing something obvious, but I don't totally get it.
What I had in my head is that KeyValue, Put, Delete, Result, etc. could all
be derived to byte arrays, with the KeyValueSerialization,
MutationSerialization, and ResultSerialization classes being used in the
MapFns within the derived PType to go between the type and its byte
representation, i.e.

   public static PType<KeyValue> keyValue(PTypeFamily ptf) {
      return ptf.derived(
         KeyValue.class,
         BYTES_TO_KEYVALUE_VIA_KVSERIALIZATION,
         KEYVALUE_TO_BYTES_VIA_KVSERIALIZATION,
         ptf.bytes());
   }

I'm guessing this is the same thing you're talking about, which I assume
means that I'm missing something simple as to why that wouldn't just work,
but I'm not sure what it is that I'm missing.
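
To make the round trip concrete, here is a rough, self-contained sketch of the two mapping functions such a derived type would need. It deliberately does not use Crunch or HBase: `Cell` is a hypothetical stand-in for KeyValue, `java.util.function.Function` stands in for MapFn, and the length-prefixed layout is an assumption for illustration rather than what KeyValueSerialization actually writes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DerivedTypeSketch {

    // Hypothetical stand-in for an HBase KeyValue: only a row key and a value.
    public static final class Cell {
        public final byte[] row;
        public final byte[] value;

        public Cell(byte[] row, byte[] value) {
            this.row = row;
            this.value = value;
        }
    }

    // Plays the role of the "output" MapFn: Cell -> bytes.
    // Assumed layout: [row length][row bytes][value length][value bytes].
    public static final Function<Cell, ByteBuffer> CELL_TO_BYTES = cell -> {
        ByteBuffer buf = ByteBuffer.allocate(4 + cell.row.length + 4 + cell.value.length);
        buf.putInt(cell.row.length).put(cell.row);
        buf.putInt(cell.value.length).put(cell.value);
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    };

    // Plays the role of the "input" MapFn: bytes -> Cell.
    public static final Function<ByteBuffer, Cell> BYTES_TO_CELL = bytes -> {
        ByteBuffer buf = bytes.duplicate(); // leave the caller's position untouched
        byte[] row = new byte[buf.getInt()];
        buf.get(row);
        byte[] value = new byte[buf.getInt()];
        buf.get(value);
        return new Cell(row, value);
    };

    public static void main(String[] args) {
        Cell original = new Cell(
            "row1".getBytes(StandardCharsets.UTF_8),
            "v".getBytes(StandardCharsets.UTF_8));
        // Round trip: object -> bytes -> object, as the derived type would do.
        Cell copy = BYTES_TO_CELL.apply(CELL_TO_BYTES.apply(original));
        System.out.println(
            new String(copy.row, StandardCharsets.UTF_8) + "="
            + new String(copy.value, StandardCharsets.UTF_8));
    }
}
```

In the real thing, the two functions would be MapFn instances wrapping the serialization classes, and the base type would be ptf.bytes().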



>
>
> > A whole new PTypeFamily sounds like a lot of work (unless maybe if it
> > was a subclass of one of the existing ones), and I think there's still
> > a fair bit of code that assumes that Avro & Writable are the only two
> > possible PTypeFamily implementations.
> >
>
> For any kind of intermediate processing, that is still true. The
> HBaseTypeFamily would only ever really appear at the input or output for a
> job.
>
>
True, although of course it would be nice if we didn't have that
limitation.

- Gabriel
