hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: [UPDATE] Finishing up 0.96 --> WAS Re: 0.95 and 0.96 remaining issues
Date Wed, 31 Jul 2013 20:34:51 GMT
On Wed, Jul 31, 2013 at 1:19 PM, James Taylor <jtaylor@salesforce.com> wrote:

> But the value in your patch is fixing the serialization format such that it
> is order preserving. Unfortunately, without this, Phoenix can't adopt it.
> Its existing type system and query processing is predicated on this.

Two patches, two value propositions. Providing a data type API with some
pre-made implementations that users can use and that external projects can
standardize on is valuable in itself. Phoenix can extend this API to provide
its own encodings, but I agree it provides in HBase something Phoenix has
already worked out for itself. The biggest win here is that two consumers
of HBase can agree on precisely what they mean when they say they encode a

The second piece is the order-preserving encoding scheme. Having HBase ship
a single scheme that can be used across the board has much wider utility.
Delivering it through the API described previously is practical. Lacking
this, Phoenix can still plug its existing encoding code into the data type
API, as I described in another email.
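
To make "order-preserving" concrete, here is a minimal sketch, not the
encoding from the 8201 patch. It assumes a 4-byte signed int and contrasts a
plain big-endian encoding with one that flips the sign bit. The class and
method names (OrderPreservingSketch, encodeLegacy, encodeOrdered) are
hypothetical and exist only for illustration.

```java
import java.nio.ByteBuffer;

public class OrderPreservingSketch {
    // Hypothetical "legacy-style" encoding: plain big-endian two's complement.
    static byte[] encodeLegacy(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Hypothetical order-preserving encoding: flip the sign bit so that
    // unsigned byte order matches signed numeric order.
    static byte[] encodeOrdered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of encodeOrdered.
    static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 (FF FF FF FF) sorts after 1 (00 00 00 01).
        System.out.println(compareBytes(encodeLegacy(-1), encodeLegacy(1)) < 0);   // false
        // Sign-flipped encoding: -1 sorts before 1, as expected.
        System.out.println(compareBytes(encodeOrdered(-1), encodeOrdered(1)) < 0); // true
    }
}
```

With the plain encoding, negative values sort after positive ones under a
byte-wise comparison. The sign-flipped form sorts numerically and still
round-trips through decodeOrdered.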

I want to see them both shipped. Breaking it down like this was a way to
allow for prudent concessions considering the timelines.

On Wed, Jul 31, 2013 at 12:04 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > On Wed, Jul 31, 2013 at 10:31 AM, Stack <stack@duboce.net> wrote:
> >
> > > So what would the incentive for using the new API be?
> > >
> >
> > Hopefully the new API is nicer than managing byte[]'s manually. The only
> > incentive for users would be keeping up with progress, giving users the
> > chance to start migrating their applications. For the external tools, I'm
> > looking forward to using this to make defining Hive tables over HBase
> > nicer. The current column mapping stuff is clunky and this API gives a much
> > improved mechanism for declaring column types. I can't do that without an
> > API shipping with HBase. Maybe Elliott can weigh in on the Impala side,
> > James on Phoenix, Bueller from Kiji?
> >
> > > And then when the implementation changes -- it serializes in sort-order --
> > > will it confuse?
> > >
> >
> > Let's continue my Hive example. Assuming DataType (9091 + 8694) ships in
> > 0.96, Hive gets plumbed, and users get to start defining their tables in
> > terms of LegacyInteger, LegacyBytesFixedWidth, and Struct. When the
> > OrderedBytes patch (8201) comes in with its type implementations, users
> > can drop the new types in at their leisure when they're ready to
> > transition. The Ordered* types don't replace the Legacy* types, they
> > augment the catalog of types that HBase provides.
> >
