hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Review request for HBASE-7692: Ordered byte[] serialization
Date Fri, 22 Feb 2013 14:13:45 GMT
You're absolutely correct: this library introduces client-side conventions
and is not needed from within the HMaster or RegionServer. Is
the consensus that it should reside in its own module or be a sibling to
the o.a.h.hbase.client source tree? I'm a little confused by the current
state of the modules; hbase-client looks empty while o.a.h.hbase.client
sits under hbase-server.

Thanks,
Nick

On Thu, Feb 21, 2013 at 11:56 PM, Jonathan Hsieh <jon@cloudera.com> wrote:

> So I buy the argument about this being included in hbase, but several of
> the questions still stand --
>
> Why is this part of hbase-common?  shouldn't this be just a dependency of
> hbase-client module?  Does the hbase-server side need to depend on this?
>
> Since this is a large import of a currently isolated library, why not make
> it a separate module instead of part of hbase-common?  This would enforce a
> boundary that will prevent pollution from circular dependencies.
>
> Jon.
>
> On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar <enis@apache.org> wrote:
>
> > I think this belongs in core HBase, as a replacement for Bytes, which
> > should be deprecated eventually. We have a Bytes utility which is
> > supposed to convert basic java types to byte[]'s, but it does not work
> > for signed numbers.
> >
> > We already know that all of the clients, Hive, Pig, Phoenix, have to have
> > at least java type -> byte[] conversion utilities, and I think it is
> > HBase's job to supply one so that different clients can interoperate.
> > Since internally we are also relying on serializing java types, we need
> > that library in the core.
> >
> > BTW, I also think that we need to have a SQL-type to java type to byte[]
> > layer, but that is another discussion.
> >
> > Enis
> >
> >
> > On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> >
> > > Nick,
> > >
> > > While I believe having an order-preserving canonical serialization is a
> > > good idea, from a read of the mail and a skim of the jira it is not
> > > clear to me why this is inside hbase as part of hbase-common.
> > >
> > > Why isn't this part of a library on top of hbase (a dependency for
> > > Pig/Hive) instead of "inside" hbase?
> > > Can't this functionality be done just from the client level?
> > > What's the end goal here? Is the goal to replace the Bytes.toBytes(*)
> > > methods to enforce the ordering?
> > > If HBase has two mutually incompatible encodings "built-in", how does a
> > > dev know to use one or the other later on?
> > > If this is essentially a mega import of a library (300k.. yikes), why
> > > not make it a separate module instead of part of common?
> > >
> > > Jon.
> > >
> > > > On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'm of the opinion that HBase should provide a mechanism for
> > > > serializing common java types such that the serialized format sorts
> > > > according to the natural ordering of the type. I think many
> > > > application efforts end up building a custom, partial implementation
> > > > of this kind of functionality on their own. I think HBase should
> > > > provide a canonical implementation of such a serialization format so
> > > > that third parties can reliably build on top of HBase. Not just user
> > > > applications, but other tools like Pig and Hive are also enabled.
> > > > Implementations for
> > > > HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> > > > HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> > > > HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> > > > compatible with similar features in Pig.
> > > >
> > > > After implementing something similar on multiple occasions, I
> > > > stumbled across the Orderly <https://github.com/ndimiduk/orderly>
> > > > library. It also appears to have been adopted by other large
> > > > projects, including Lily <https://github.com/NGDATA/orderly>.
> > > > I've engaged the library's author for some improvements only to find
> > > > out he's now at Google and will no longer be maintaining it. Thus, I
> > > > propose we take it into HBase.
> > > >
> > > > HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692>
> > > > includes a patch that introduces Orderly into hbase-common under the
> > > > orderly namespace. I have an associated branch on GitHub
> > > > <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> > > > wherein I've broken the patch out into multiple commits to ease
> > > > review. Please take a few minutes to give it a look.
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > >
> > >
> > >
> > > --
> > > // Jonathan Hsieh (shay)
> > > // Software Engineer, Cloudera
> > > // jon@cloudera.com
> > >
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>
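
To make the signed-number point raised in the thread concrete, here is a
minimal sketch in plain Java with no HBase dependency. It assumes byte[]
values are compared lexicographically as unsigned bytes, and it encodes an
int as 4 big-endian bytes. The class OrderedIntSketch and its methods are
hypothetical names for illustration only; the sign-bit flip shown is one
common order-preserving trick and is not claimed to be the format that
Orderly or the HBASE-7692 patch uses.

```java
import java.nio.ByteBuffer;

public class OrderedIntSketch {
    // Plain big-endian two's complement encoding of an int (4 bytes).
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Hypothetical order-preserving variant: flip the sign bit so that
    // negative values get a leading byte below that of positive values.
    static byte[] ordered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of ordered(): flip the sign bit back.
    static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // -1 encodes as FF FF FF FF, so it sorts AFTER 1 (00 00 00 01).
        System.out.println(compareBytes(plain(-1), plain(1)) > 0);     // true
        // With the sign bit flipped, -1 (7F FF FF FF) sorts before 1 (80 00 00 01).
        System.out.println(compareBytes(ordered(-1), ordered(1)) < 0); // true
        // The transformation is reversible.
        System.out.println(decodeOrdered(ordered(-42)));               // -42
    }
}
```

The same idea does not carry over unchanged to floating-point or
variable-length types, which is part of why a shared, canonical library is
being proposed rather than per-client conventions.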
