hbase-dev mailing list archives

From Andrew Purtell <apurt...@apache.org>
Subject Re: [DISCUSSION] "Convenience binaries" bundling Phoenix
Date Wed, 18 Mar 2015 18:01:51 GMT
Yes, both HBase and Phoenix are packaged by Bigtop (I did the latter work),
but this is largely orthogonal to the question of whether HBase should ship
'convenience binaries' including a SQL shell, except where I hear "let's
not do that and point people to alternatives like Bigtop". Am I stating
this position correctly? If not, please pardon. If so, thanks for the
feedback.

I don't think we want a 'contrib'. We had one of those a long time ago and
got rid of it. Note that I'm not suggesting we bring Phoenix *into* HBase.
This also goes to Jon's comment about circular dependencies. There are no
dependencies at all introduced here. This would be post-build packaging of
a convenience binary only.

I was hoping to sidestep semver and multi-branching multi-versioning
concerns by making this about convenience binaries only. By definition it's
something done for the convenience of users on a best-effort basis. It
doesn't have to cover all of our build combinatorics. Otherwise neither we
nor the Phoenix project are going to be in a position to always have version
X to cover release Y of the other. We'd throw our hands up and leave the
actual integration of Phoenix with HBase to the users themselves, or to
Bigtop, or to commercial vendors. I don't think we need to hamstring
ourselves like that, but if the community thinks the 'convenience binary'
distinction doesn't matter, or doesn't matter *enough*, then ok, no need to
discuss this further. Is that the case?

> What if other projects were considered for this special treatment?
> Projects like cask and tephra have a large overlap of hbase community
> members as well.  Would we have to have criteria to determine how/when to
> include those projects as well?

Do we need criteria in place to cover every related contingency whenever
something is proposed? Has there been a problem getting consensus as needed
on an ad hoc basis up to now? For all the good that surely will come of it,
I think we have also opened the door to a legislative itch with semver. I
fear we will risk a constitutional crisis whenever Hadoop upgrades a
dependency, but there *I* go conflating things. (smile) You can strike that
as not related to the matter at hand.


On Tue, Mar 17, 2015 at 7:26 PM, Sean Busbey <busbey@cloudera.com> wrote:

> I like the idea of having an out of the box solution for using phoenix on
> top of hbase, but I worry about the conflict when folks want to upgrade one
> or the other. Our instructions for replacing Hadoop jars will get
> substantially more complicated if they have to include phoenix and its
> dependencies.
>
> Two possible compromise positions:
>
> * Apache Bigtop - it already integrates the rest of the stack. My apologies
> if it's in there (or proposed and rejected), phone access limits my ability
> to check.
>
> * Sub-Project - either hbase or phoenix could start a contrib repo that did
> this out of the box combined distro. It could also try to help other
> on-ramping problems, like setting up a cluster without having to manage
> your own deployment of HDFS / ZK.
>
> As the subproject matures we'd have a lower risk way of assessing how
> coupled hbase and phoenix releases are and what kind of deployment
> efficiencies we get.
>
> --
> Sean
> On Mar 17, 2015 7:47 PM, "Jonathan Hsieh" <jon@cloudera.com> wrote:
>
> > I like Nick's approach of including a hbase (and its deps) inside of
> > phoenix releases or having the dockerfile with the components
> "installed".
> > This coupling seems more easy to manage since phoenix already has two
> > branches for 0.94 and 0.98 support -- each could include its own hbase
> and
> > choose to upgrade point versions or minor versions without introducing
> > confusion.  That approach is a clean way to deal with semver breaking
> > dependencies in the other hadoop/hbase deps discussion (vs the
> > hadoop1-hadoop2 compat stuff we had before).
> >
> > Only having phoenix binaries in the 0.98 branch may cause confusion.  It
> > would be a special case and break the new features in trunk convention
> and
> > if extended could potentially block releases of newer versions.
> >
> > If we kept the policy intact and included phoenix in trunk/master (a
> notion
> > that should rightfully be avoided), we would cause problems if phoenix
> > breaking API changes were introduced.  It brings in other awkward
> questions
> > such as how often would we pull in the latest phoenix? are we willing to
> > tolerate a  broken master build (we sort of do already admittedly but
> that
> > is not ideal)?  would phoenix be able to block a core hbase release?
> >
> > Are there examples of this kind of "reverse" inclusion in other projects?
> > One that seems analogous is curator to zookeeper -- and curator is a
> > separate project from zookeeper.
> >
> > What if other projects were considered for this special treatment?
> > Projects like cask and tephra have a large overlap of hbase community
> > members as well.  Would we have to have criteria to determine how/when to
> > include those projects as well?
> >
> > Keeping the already large hbase project's scope and code base focused and
> > independent of new circular dependencies seems prudent.
> >
> > Jon.
> >
> >
> > On Tue, Mar 17, 2015 at 12:54 PM, Nick Dimiduk <ndimiduk@gmail.com>
> wrote:
> >
> > > I've been thinking of something along these lines as well. Rather than an
> > official
> > > official Apache project, I was thinking it could be something as simple
> > as
> > > a github managed dockerfile that stands up a HBase + Phoenix singlenode
> > > deal, see if momentum builds.
> > >
> > > Another idea is Phoenix could include HBase in its binary release, the
> > same
> > > way HBase includes Hadoop. That way there's an "out of the box"
> > > distribution for Phoenix. That would be a discussion for the Phoenix
> dev
> > > list.
> > >
> > > -n
> > >
> > > On Tuesday, March 17, 2015, Andrew Purtell <apurtell@apache.org>
> wrote:
> > >
> > > > Consider if the HBase project starts releasing new "convenience
> > > binaries",
> > > > in addition to the existing ones, in which we bundle a
> > > recent/vetted/stable
> > > > version of Phoenix, with the site file changes for loading their
> > > > coprocessors already patched in (to hbase-default.xml). For now this
> > would
> > > > be done for 0.98 only, since that's the only release line supported
> by
> > an
> > > > actively developed Phoenix version. We could also do this for 0.94
> > > releases
> > > > with Phoenix 3 if the 0.94 RM wants, but I doubt there would be any
> > > demand
> > > > for this, Phoenix 3 is inactive because that community has all moved
> to
> > > 4,
> > > > I'd imagine that carries over here.
> > > >
> > > > Advantages:
> > > >
> > > > - HBase would ship with a SQL access option. There's the Phoenix JDBC
> > > > driver of course, and we'd also bundle the psql and sqlline exec
> > wrappers
> > > > from the Phoenix binary distribution. We'd have both the jruby shell
> > and
> > > a
> > > > SQL shell, this is a powerful combination.
> > > >
> > > > - HBase ships with a library that assists users in making efficient
> > > queries
> > > > if their data is typed, but this doesn't include the server side
> > > > optimizations that the Phoenix coprocessors provide, and in that case
> > no
> > > > hand rolling is necessary.
> > > >
> > > > - HBase would ship with secondary indexes. These would not cover all
> > > > possible use cases and requirements, let's stipulate that now and
> hope
> > > this
> > > > doesn't kick off another circular discussion on that front.
> > > Unquestionably
> > > > this is a compelling Phoenix feature so some use cases obviously can
> > > > benefit, and if users find the combined distribution useful enough we
> > > don't
> > > > have to discuss secondary indexes in HBase core again.
> > > >
> > > > - We will have done the necessary integration work for the combined
> > > result
> > > > to be easy to use. Apache software cat herders will appreciate this.
> > > >
> > > > - It's totally optional, simply ignore the new binary packages if you
> > > don't
> > > > care. This is not a Grand Unification proposal.
> > > >
> > > > Concerns:
> > > >
> > > > - More work for the RM. Unquestionably.
> > > >
> > > > - Concerns about the quality of the combined convenience artifact: Is
> > > there
> > > > an implied warranty? Could we disclaim? Should we disclaim? If not,
> how
> > > > does HBase do QA on this. Related to the above concern about RM
> > > bandwidth.
> > > > Maybe Phoenix could help.
> > > >
> > > > - Increased coupling between the projects. Frankly, I think this is
> > already
> > > > there, we just don't see it until we trip over issues that could have
> > > been
> > > > avoided with more communication between projects. Pushing on Phoenix
> > for
> > > > bits for a monthly HBase release cadence will surface issues faster
> and
> > > > improve communication between the projects. This benefits Phoenix
> with
> > > more
> > > > QA bandwidth. This benefits HBase because we see Phoenix bringing in
> a
> > > > significant number of users.
> > > >
> > > > - We may want to revisit again normalizing type support in HBase's
> > client
> > > > library and Phoenix, eventually.
> > > >
> > > > I could add more items to the advantage or concern lists but mainly
> > want
> > > to
> > > > float the idea for feedback at this time.
> > > >
> > > > Thoughts?
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet
> > Hein
> > > > (via Tom White)
> > > >
> > >
> >
> >
> >
> > --
> > // Jonathan Hsieh (shay)
> > // HBase Tech Lead, Software Engineer, Cloudera
> > // jon@cloudera.com // @jmhsieh
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)
