hbase-dev mailing list archives

From Jonathan Hsieh <...@cloudera.com>
Subject Re: [DISCUSSION] "Convenience binaries" bundling Phoenix
Date Wed, 18 Mar 2015 00:47:03 GMT
I like Nick's approach of including a hbase (and its deps) inside of
phoenix releases or having the dockerfile with the components "installed".
This coupling seems easier to manage since phoenix already has two
branches for 0.94 and 0.98 support -- each could include its own hbase and
choose to upgrade point versions or minor versions without introducing
confusion.  That approach is a clean way to deal with semver-breaking
dependencies in the other hadoop/hbase deps discussion (vs the
hadoop1-hadoop2 compat stuff we had before).

Only having phoenix binaries in the 0.98 branch may cause confusion.  It
would be a special case and break the new-features-in-trunk convention and,
if extended, could potentially block releases of newer versions.

If we kept the policy intact and included phoenix in trunk/master (a notion
that should rightfully be avoided), we would cause problems if phoenix
breaking API changes were introduced.  It brings in other awkward questions
such as: how often would we pull in the latest phoenix?  Are we willing to
tolerate a broken master build (we sort of do already, admittedly, but that
is not ideal)?  Would phoenix be able to block a core hbase release?

Are there examples of this kind of "reverse" inclusion in other projects?
One that seems analogous is curator to zookeeper -- and curator is a
separate project from zookeeper.

What if other projects were considered for this special treatment?
Projects like cask and tephra have a large overlap of hbase community
members as well.  Would we need criteria to determine how/when to
include those projects too?

Keeping the already large hbase project's scope and code base focused and
independent of new circular dependencies seems prudent.

Jon.


On Tue, Mar 17, 2015 at 12:54 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> I've been thinking of something along these lines as well. Rather than an
> official Apache project, I was thinking it could be something as simple as
> a github managed dockerfile that stands up a HBase + Phoenix singlenode
> deal, see if momentum builds.
>
> Another idea is Phoenix could include HBase in its binary release, the same
> way HBase includes Hadoop. That way there's an "out of the box"
> distribution for Phoenix. That would be a discussion for the Phoenix dev
> list.
>
> -n
>
> On Tuesday, March 17, 2015, Andrew Purtell <apurtell@apache.org> wrote:
>
> > Consider if the HBase project starts releasing new "convenience binaries",
> > in addition to the existing ones, in which we bundle a
> > recent/vetted/stable version of Phoenix, with the site file changes for
> > loading their coprocessors already patched in (to hbase-default.xml) For
> > now this would be done for 0.98 only, since that's the only release line
> > supported by an actively developed Phoenix version. We could also do this
> > for 0.94 releases with Phoenix 3 if the 0.94 RM wants, but I doubt there
> > would be any demand for this, Phoenix 3 is inactive because that community
> > has all moved to 4, I'd imagine that carries over here.
> >
> > Advantages:
> >
> > - HBase would ship with a SQL access option. There's the Phoenix JDBC
> > driver of course, and we'd also bundle the psql and sqlline exec wrappers
> > from the Phoenix binary distribution. We'd have both the jruby shell and a
> > SQL shell, this is a powerful combination.
> >
> > - HBase ships with a library that assists users in making efficient
> > queries if their data is typed, but this doesn't include the server side
> > optimizations that the Phoenix coprocessors provide, and in that case no
> > hand rolling is necessary.
> >
> > - HBase would ship with secondary indexes. These would not cover all
> > possible use cases and requirements, let's stipulate that now and hope
> > this doesn't kick off another circular discussion on that front.
> > Unquestionably this is a compelling Phoenix feature so some use cases
> > obviously can benefit, and if users find the combined distribution useful
> > enough we don't have to discuss secondary indexes in HBase core again.
> >
> > - We will have done the necessary integration work for the combined
> > result to be easy to use. Apache software cat herders will appreciate this.
> >
> > - It's totally optional, simply ignore the new binary packages if you
> > don't care. This is not a Grand Unification proposal.
> >
> > Concerns:
> >
> > - More work for the RM. Unquestionably.
> >
> > - Concerns about the quality of the combined convenience artifact: Is
> > there an implied warranty? Could we disclaim? Should we disclaim? If not,
> > how does HBase do QA on this. Related to the above concern about RM
> > bandwidth. Maybe Phoenix could help.
> >
> > - Increased coupling between the projects. Frankly, I think this already
> > there, we just don't see it until we trip over issues that could have been
> > avoided with more communication between projects. Pushing on Phoenix for
> > bits for a monthly HBase release cadence will surface issues faster and
> > improve communication between the projects. This benefits Phoenix with
> > more QA bandwidth. This benefits HBase because we see Phoenix bringing in
> > a significant number of users.
> >
> > - We may want to revisit again normalizing type support in HBase's client
> > library and Phoenix, eventually.
> >
> > I could add more items to the advantage or concern lists but mainly want
> > to float the idea for feedback at this time.
> >
> > Thoughts?
> >
> > --
> > Best regards,
> >
> >    - Andy
> >
> > Problems worthy of attack prove their worth by hitting back. - Piet Hein
> > (via Tom White)
> >
>



-- 
// Jonathan Hsieh (shay)
// HBase Tech Lead, Software Engineer, Cloudera
// jon@cloudera.com // @jmhsieh
