crunch-dev mailing list archives

From Josh Wills
Subject Thoughts on supporting HBase 0.96
Date Wed, 16 Oct 2013 05:42:50 GMT
Hey all,

To kill some time this afternoon, I took a pass at figuring out what
changes would be needed in Crunch to support HBase 0.96, which is going
through a few release candidates right now. I started out by building
against the 0.95.2 release, which has most of the API changes that I'm told
we can expect in 0.96.

The most consequential change I found is that many of the core HBase
classes we operate on (Put, Delete, KeyValue, and Result) will no longer
implement the Writable interface. Instead, the HBase team has added a
number of SerializationFactory classes for these types, which map the POJO
versions of those objects onto protocol buffers. This means that the
current trick of creating PTypes for HBase like this:

PType<Result> ptype = Writables.writables(Result.class);

won't work anymore in 0.96: the HBase data classes won't fit into
either of the existing type families (Writables or Avro).

The best solution I've come up with so far is to create a new,
HBase-specific PTypeFamily for supporting the way these classes are
serialized now. I'm not sure if there's a better approach here and/or how
complex this particular PTypeFamily implementation would need to be; I'm
very much open to ideas on how to proceed here.
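
To make the requirement concrete, here is a rough sketch of the contract
any such type would have to satisfy: a pair of functions that turn a
plain object into bytes and back, without the object itself implementing
Writable. This is only an illustration using the JDK. Cell, Codec, and
CELL_CODEC are hypothetical stand-ins, not Crunch or HBase classes, and
the hand-rolled byte layout stands in for the protocol buffer encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

final class ByteBackedTypeSketch {

    /** Hypothetical stand-in for an HBase POJO that does not implement Writable. */
    static final class Cell {
        final String row;
        final long timestamp;

        Cell(String row, long timestamp) {
            this.row = row;
            this.timestamp = timestamp;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Cell)) {
                return false;
            }
            Cell other = (Cell) o;
            return row.equals(other.row) && timestamp == other.timestamp;
        }

        @Override
        public int hashCode() {
            return Objects.hash(row, timestamp);
        }
    }

    /** Hypothetical codec: serialization lives outside the data class. */
    interface Codec<T> {
        byte[] encode(T value);
        T decode(byte[] bytes);
    }

    /** Encodes a Cell as a UTF string followed by a long (an assumed toy format). */
    static final Codec<Cell> CELL_CODEC = new Codec<Cell>() {
        @Override
        public byte[] encode(Cell cell) {
            try {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(buffer);
                out.writeUTF(cell.row);
                out.writeLong(cell.timestamp);
                out.flush();
                return buffer.toByteArray();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        @Override
        public Cell decode(byte[] bytes) {
            try {
                DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
                return new Cell(in.readUTF(), in.readLong());
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    };
}
```

Whatever form the HBase-specific types end up taking, they would need to
supply the equivalent of encode and decode for Put, Delete, KeyValue, and
Result, with the byte form being whatever the HBase serializers produce.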


Director of Data Science
Cloudera
Twitter: @josh_wills
