hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Review request for HBASE-7692: Ordered byte[] serialization
Date Fri, 22 Feb 2013 04:24:30 GMT
I think we have to enable building things on top of HBase by having well-defined building blocks
as part of HBase.
It seems to me that a canonical, supported byte representation for datatypes is such a building
block.

-- Lars

From: Jonathan Hsieh <jon@cloudera.com>
To: dev@hbase.apache.org 
Sent: Thursday, February 21, 2013 3:04 PM
Subject: Re: Review request for HBASE-7692: Ordered byte[] serialization

While I believe having an order-preserving canonical serialization is a
good idea, from a read of the mail and a skim of the jira it is not
clear to me why this is inside hbase as part of hbase-common.

Why isn't this part of a library on top of hbase (a dependency for
Pig/Hive) instead of "inside" hbase?
Can't this functionality be done just from the client level?
What's the end goal here? Is the goal to replace the Bytes.toBytes(*)
methods in order to enforce the ordering?
If HBase has two mutually incompatible encodings "built-in", how does a
dev know to use one or the other later on?
If this is essentially a mega import of a library (300k... yikes), why not
make it a separate module instead of part of common?


On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> Hi everyone,
> I'm of the opinion that HBase should provide a mechanism for serializing
> common java types such that the serialized format sorts according to
> the natural ordering of the type. I think many application efforts end up
> building a custom, partial implementation of this kind of functionality on
> their own. I think HBase should provide a canonical implementation of such
> a serialization format so that third-parties can reliably build on top of
> HBase. Not just user applications, but other tools like Pig and Hive are
> also enabled. Implementations for
> HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> compatible with similar features in Pig.
> After implementing something similar on multiple occasions, I stumbled across
> the Orderly <https://github.com/ndimiduk/orderly> library. It also
> appears to have been adopted by other large projects, including
> Lily <https://github.com/NGDATA/orderly>.
> I've engaged the library's author for some improvements only to find out
> he's now at Google and will no longer be maintaining it. Thus, I propose we
> take it into HBase.
> HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> patch that introduces Orderly into hbase-common under the orderly
> namespace. I have an associated branch on
> GitHub <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>,
> wherein I've broken the patch out into multiple commits to ease review.
> Please take a few minutes to give it a look.
> Thanks,
> Nick

// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com
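
Editor's illustration of the ordering question discussed in this thread: HBase sorts keys as unsigned, lexicographically compared bytes, so a plain big-endian two's-complement encoding of a signed int places negative values after positive ones. The sketch below is not taken from Orderly or from the HBASE-7692 patch; the class and method names are hypothetical, and it uses java.nio.ByteBuffer to stand in for the fixed-width big-endian layout that Bytes.toBytes(int) produces. It shows one common technique, flipping the sign bit, for making the byte order agree with the numeric order.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: none of these names come from Orderly or HBASE-7692.
final class OrderedIntSketch {

    // Plain big-endian two's complement, the layout a naive encoding produces.
    static byte[] plainEncode(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Flip the sign bit so that unsigned byte order matches signed int order.
    static byte[] orderedEncode(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of orderedEncode: flip the sign bit back.
    static int orderedDecode(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, mimicking how byte[] keys are sorted.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // With the plain encoding, -1 compares greater than 1 (wrong order).
        System.out.println("plain:   "
                + compareBytes(plainEncode(-1), plainEncode(1)));
        // With the sign bit flipped, -1 compares less than 1 (natural order).
        System.out.println("ordered: "
                + compareBytes(orderedEncode(-1), orderedEncode(1)));
        // The ordered encoding round-trips.
        System.out.println("decoded: " + orderedDecode(orderedEncode(-12345)));
    }
}
```

The two encodings are mutually incompatible on disk, which is the concern raised above about having more than one "built-in" representation: a reader has to know which encoding a given key was written with.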