hbase-dev mailing list archives

From Ted <yuzhih...@gmail.com>
Subject Re: Review request for HBASE-7692: Ordered byte[] serialization
Date Fri, 22 Feb 2013 14:21:15 GMT
Elliot is working on making the hbase-client module concrete in HBASE-7012.

Cheers

On Feb 22, 2013, at 6:13 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> You're absolutely correct: this library introduces client-side conventions
> and is not needed from within the HMaster or RegionServer. Is
> the consensus that it should reside in its own module or be a sibling to
> the o.a.h.hbase.client source tree? I'm a little confused by the current
> state of the modules; hbase-client looks empty while o.a.h.hbase.client
> sits under hbase-server.
> 
> Thanks,
> Nick
> 
> On Thu, Feb 21, 2013 at 11:56 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> 
>> So I buy the argument about this being included in hbase, but several of
>> the questions still stand --
>> 
>> Why is this part of hbase-common?  Shouldn't this be just a dependency of
>> hbase-client module?  Does the hbase-server side need to depend on this?
>> 
>> Since this is a large import of a currently isolated library, why not make
>> it a separate module instead of part of hbase-common?  This would enforce a
>> boundary that will prevent pollution from circular dependencies.
>> 
>> Jon.
>> 
>> On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar <enis@apache.org> wrote:
>> 
>>> I think this belongs in core HBase, as a replacement for Bytes, which should
>>> be deprecated eventually. We have a Bytes utility which is supposed to
>>> convert basic java types to byte[]'s, but it does not work for signed
>>> numbers.
>>> 
>>> We already know that all of the clients, Hive, Pig, Phoenix, have to have
>>> at least java type -> byte[] conversion utilities, and I think it is
>>> HBase's job to supply one so that different clients can interoperate. Since
>>> internally we are also relying on serializing java types, we need that
>>> library in the core.
>>> 
>>> BTW, I also think that we need to have a SQL-type to java type to byte[]
>>> layer, but that is another discussion.
>>> 
>>> Enis
>>> 
>>> 
>>> On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
>>> 
>>>> Nick,
>>>> 
>>>> While I believe having an order-preserving canonical serialization is a
>>>> good idea, from doing a read of the mail and a skim of the jira it is not
>>>> clear to me why this is inside hbase as part of hbase-common.
>>>> 
>>>> Why isn't this part of a library on top of hbase (a dependency for
>>>> Pig/Hive) instead of "inside" hbase?
>>>> Can't this functionality be done just from the client level?
>>>> What's the end goal here? Is the goal to replace the Bytes.toBytes(*)
>>>> methods to enforce the ordering?
>>>> If HBase has two mutually incompatible encodings "built-in", how does a
>>>> dev know to use one or the other later on?
>>>> If this is essentially a mega import of a library (300k.. yikes), why not
>>>> make it a separate module instead of part of common?
>>>> 
>>>> Jon.
>>>> 
>>>> On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
>>>> 
>>>>> Hi everyone,
>>>>> 
>>>>> I'm of the opinion that HBase should provide a mechanism for serializing
>>>>> common java types such that the serialized format sorts according to
>>>>> the natural ordering of the type. I think many application efforts end up
>>>>> building a custom, partial implementation of this kind of functionality on
>>>>> their own. I think HBase should provide a canonical implementation of such
>>>>> a serialization format so that third parties can reliably build on top of
>>>>> HBase. Not just user applications, but other tools like Pig and Hive are
>>>>> also enabled. Implementations for
>>>>> HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
>>>>> HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
>>>>> HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
>>>>> compatible with similar features in Pig.
>>>>> 
>>>>> After implementing something similar on multiple occasions, I stumbled
>>>>> across the Orderly <https://github.com/ndimiduk/orderly> library. It also
>>>>> appears to have been adopted by other large projects, including
>>>>> Lily <https://github.com/NGDATA/orderly>.
>>>>> I've engaged the library's author for some improvements only to find out
>>>>> he's now at Google and will no longer be maintaining it. Thus, I propose we
>>>>> take it into HBase.
>>>>> 
>>>>> HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
>>>>> patch that introduces Orderly into hbase-common under the orderly
>>>>> namespace. I have an associated branch on GitHub
>>>>> <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
>>>>> wherein I've broken the patch out into multiple commits to ease review.
>>>>> Please take a few minutes to give it a look.
>>>>> 
>>>>> Thanks,
>>>>> Nick
>>>> 
>>>> 
>>>> 
>>>> --
>>>> // Jonathan Hsieh (shay)
>>>> // Software Engineer, Cloudera
>>>> // jon@cloudera.com
>> 
>> 
>> 
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com
>> 
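
For readers following the thread, here is a minimal sketch of the signed-number problem Enis mentions and of what "order-preserving" means in Nick's proposal. It assumes keys are compared as unsigned bytes, lexicographically, and that the naive encoding is plain big-endian two's complement. The class and method names (OrderedIntDemo, naiveEncode, orderedEncode, orderedDecode, compareUnsigned) are made up for illustration; the sign-bit flip is a common general technique and is not claimed to be what Orderly or the HBASE-7692 patch actually does.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not code from Orderly or HBase.
final class OrderedIntDemo {

    // Plain big-endian two's complement (ByteBuffer is big-endian by default).
    static byte[] naiveEncode(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Flip the sign bit so that unsigned byte order matches signed int order.
    static byte[] orderedEncode(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of orderedEncode: flip the sign bit back.
    static int orderedDecode(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Naive: -1 is FF FF FF FF, which sorts after 1 (00 00 00 01).
        boolean naiveOk = compareUnsigned(naiveEncode(-1), naiveEncode(1)) < 0;
        // Ordered: -1 is 7F FF FF FF, which sorts before 1 (80 00 00 01).
        boolean orderedOk = compareUnsigned(orderedEncode(-1), orderedEncode(1)) < 0;
        System.out.println("naive encoding sorts -1 before 1:   " + naiveOk);
        System.out.println("ordered encoding sorts -1 before 1: " + orderedOk);
    }
}
```

With the naive encoding, every negative int sorts after every non-negative one in a byte-ordered scan; flipping the sign bit restores numeric order and is cheap to undo on read. Other types (floating point, variable-length strings, descending order) need their own encodings, which is part of why the thread argues for a shared library.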
