hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@gmail.com>
Subject Re: Review request for HBASE-7692: Ordered byte[] serialization
Date Fri, 22 Feb 2013 18:04:55 GMT
Inline.

On Fri, Feb 22, 2013 at 10:00 AM, Matt Corgan <mcorgan@hotpads.com> wrote:

> To nitpick a little it wouldn't quite be a sibling of hbase-client because
> hbase-client depends on hbase-common and hbase-protocol while this new one
> will not depend on anything.  Would hbase-server be able to see it?  Would
> it basically be a standalone module being maintained by HBase?
>

Not quite true. It makes use of Bytes and ImmutableBytesWritable from
hbase-common.

> Also, assuming the original Orderly library goes unmaintained and we want
> people to use it, this will be the primary place to get it.  Having no
> dependencies on other hbase modules is important for people who want to use
> the Orderly library for something unrelated to hbase.  For example, a web
> application that logs data in this format but not directly to hbase.
>

Orderly has gone unmaintained. The only fork with any activity that I'm
aware of is my own. I'd much rather see it gain publicity, additional
scrutiny, and wider adoption than continue as a pet project.

> On Fri, Feb 22, 2013 at 9:32 AM, Elliott Clark <eclark@apache.org> wrote:
>
> > Yep the client will be fully separated as soon as rpc changes
> > are stabilized.  Until then keeping up the move patch was just too
> > onerous.
> >
> >
> > On Fri, Feb 22, 2013 at 6:31 AM, Jonathan Hsieh <jon@cloudera.com> wrote:
> >
> > > Nick,
> > >
> > > I'm +1 for it having its own module, and being a sibling of hbase-client.
> > > I'm assuming the client stuff will happen before we release 0.96 since it
> > > has been started.
> > >
> > > Jon.
> > >
> > > On Fri, Feb 22, 2013 at 6:13 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > >
> > > > You're absolutely correct: this library introduces client-side conventions
> > > > and is not needed from within the HMaster or RegionServer. Is
> > > > the consensus that it should reside in its own module or be a sibling to
> > > > the o.a.h.hbase.client source tree? I'm a little confused by the current
> > > > state of the modules; hbase-client looks empty while o.a.h.hbase.client
> > > > sits under hbase-server.
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > > > On Thu, Feb 21, 2013 at 11:56 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> > > >
> > > > > So I buy the argument about this being included in hbase, but several of
> > > > > the questions still stand --
> > > > >
> > > > > Why is this part of hbase-common?  Shouldn't this be just a dependency of
> > > > > the hbase-client module?  Does the hbase-server side need to depend on this?
> > > > >
> > > > > Since this is a large import of a currently isolated library, why not make
> > > > > it a separate module instead of part of hbase-common?  This would enforce a
> > > > > boundary that will prevent pollution from circular dependencies.
> > > > >
> > > > > Jon.
> > > > >
> > > > > On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar <enis@apache.org> wrote:
> > > > >
> > > > > > I think this belongs in core HBase, as a replacement to Bytes, which should
> > > > > > be deprecated eventually. We have a Bytes utility which is supposed to
> > > > > > convert basic java types to byte[]'s, but it does not work for signed
> > > > > > numbers.
> > > > > >
> > > > > > We already know that all of the clients, Hive, Pig, Phoenix, have to have
> > > > > > at least java type -> byte[] conversion utilities, and I think it is
> > > > > > HBase's job to supply one so that different clients can interoperate. Since
> > > > > > internally we are also relying on serializing java types, we need that
> > > > > > library in the core.
> > > > > >
> > > > > > BTW, I also think that we need to have a SQL-type to java type to byte[]
> > > > > > layer, but that is another discussion.
> > > > > >
> > > > > > Enis
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh <jon@cloudera.com> wrote:
> > > > > >
> > > > > > > Nick,
> > > > > > >
> > > > > > > While I believe having an order-preserving canonical serialization is a
> > > > > > > good idea, from doing a read of the mail and a skim of the jira it is not
> > > > > > > clear to me why this is inside hbase as part of hbase-common.
> > > > > > >
> > > > > > > Why isn't this part of a library on top of hbase (a dependency for
> > > > > > > Pig/Hive) instead of "inside" hbase?
> > > > > > > Can't this functionality be done just from the client level?
> > > > > > > What's the end goal here? Is the goal here to replace the Bytes.toBytes(*)
> > > > > > > methods to enforce the ordering?
> > > > > > > If HBase has two mutually incompatible encodings "built-in", how does a
> > > > > > > dev know to use one or the other later on?
> > > > > > > If this is essentially a mega import of a library (300k.. yikes), why not
> > > > > > > make it a separate module instead of part of common?
> > > > > > >
> > > > > > > Jon.
> > > > > > >
> > > > > > > On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'm of the opinion that HBase should provide a mechanism for serializing
> > > > > > > > common java types such that the serialized format sorts according to
> > > > > > > > the natural ordering of the type. I think many application efforts end up
> > > > > > > > building a custom, partial implementation of this kind of functionality on
> > > > > > > > their own. I think HBase should provide a canonical implementation of such
> > > > > > > > a serialization format so that third-parties can reliably build on top of
> > > > > > > > HBase. Not just user applications, but other tools like Pig and Hive are
> > > > > > > > also enabled. Implementations for
> > > > > > > > HIVE-3634 <https://issues.apache.org/jira/browse/HIVE-3634>,
> > > > > > > > HIVE-2599 <https://issues.apache.org/jira/browse/HIVE-2599>, or
> > > > > > > > HIVE-2903 <https://issues.apache.org/jira/browse/HIVE-2903> could be
> > > > > > > > compatible with similar features in Pig.
> > > > > > > >
> > > > > > > > After implementing something similar on multiple occasions, I stumbled
> > > > > > > > across the Orderly <https://github.com/ndimiduk/orderly> library. It also
> > > > > > > > appears to have been adopted by other large projects, including
> > > > > > > > Lily <https://github.com/NGDATA/orderly>.
> > > > > > > > I've engaged the library's author for some improvements only to find out
> > > > > > > > he's now at Google and will no longer be maintaining it. Thus, I propose we
> > > > > > > > take it into HBase.
> > > > > > > >
> > > > > > > > HBASE-7692 <https://issues.apache.org/jira/browse/HBASE-7692> includes a
> > > > > > > > patch that introduces Orderly into hbase-common under the orderly
> > > > > > > > namespace. I have an associated branch on
> > > > > > > > github <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> > > > > > > > wherein I've broken the patch out into multiple commits to ease review.
> > > > > > > > Please take a few minutes to give it a look.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Nick
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > // Jonathan Hsieh (shay)
> > > > > > > // Software Engineer, Cloudera
> > > > > > > // jon@cloudera.com
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > // Jonathan Hsieh (shay)
> > > > > // Software Engineer, Cloudera
> > > > > // jon@cloudera.com
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > // Jonathan Hsieh (shay)
> > > // Software Engineer, Cloudera
> > > // jon@cloudera.com
> > >
> >
>
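
For readers following the thread above, the ordering problem raised about
Bytes and signed numbers can be illustrated with a minimal Java sketch. It
assumes a plain big-endian two's-complement encoding of an int and an
unsigned lexicographic byte[] comparison of the kind HBase applies to row
keys. The sign-bit flip shown here is one common order-preserving technique;
it is not claimed to be Orderly's actual format, and the class and method
names are hypothetical.

```java
public class OrderedIntSketch {

    // Plain big-endian two's-complement encoding (assumed baseline).
    static byte[] plainEncode(int v) {
        return new byte[] {
            (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v
        };
    }

    // Order-preserving variant: flipping the sign bit maps the signed range
    // onto the unsigned range, so byte order matches numeric order.
    static byte[] orderedEncode(int v) {
        return plainEncode(v ^ Integer.MIN_VALUE);
    }

    // Inverse of orderedEncode: rebuild the int, then flip the sign bit back.
    static int orderedDecode(byte[] b) {
        int v = ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16)
              | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
        return v ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 (FF FF FF FF) sorts after 1 (00 00 00 01).
        System.out.println(
            compareUnsigned(plainEncode(-1), plainEncode(1)) > 0);
        // Ordered encoding: -1 sorts before 1, matching numeric order.
        System.out.println(
            compareUnsigned(orderedEncode(-1), orderedEncode(1)) < 0);
    }
}
```

Both lines print true: the plain encoding misorders negative values under an
unsigned byte comparison, while the sign-flipped encoding sorts them in
numeric order and still round-trips through orderedDecode.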
