hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrew Purtell <apurt...@apache.org>
Subject Re: [NOTICE] Merge of shaded protobuf 3.1.0 (WAS => [DISCUSS] Shade protobuf so we can move to a newer version)
Date Mon, 03 Oct 2016 21:55:21 GMT
I would not expect Hadoop to shade PB in HDFS in any near time frame. I
don't think we have a good chance of impressing upon them a sense of
urgency based on their spotty track of listening to downstreamer concerns.
Even if we could and they deliver, what about all of the already deployed
2.x? Or 3.x? Would a change like this get into 3? Or would it be pushed to
4? So we wait until Hadoop 3 or maybe 4 is released *and* GA *and* out
there so long that telling people to get off 2 is is finally ok before
moving up to PB3 or getting the off heaping goodness? No, let's get this
work in.




On Mon, Oct 3, 2016 at 2:46 PM, Enis Söztutar <enis@apache.org> wrote:

> Is PB only coming via HDFS client or there are other usages that pollute
> our classpath? We are doing all this gymnastics just because upstream does
> not do shading is frustrating. Any idea for how likely we can push shaded
> hdfs to Hadoop? Should we focus on that instead rather than working around
> the problem?
>
> Enis
>
> On Mon, Oct 3, 2016 at 11:09 AM, Stack <stack@duboce.net> wrote:
>
> > On Mon, Oct 3, 2016 at 9:41 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > The precommit job uses compile-protobuf profile for verification. The
> > > absence of compile-protobuf profile in hbase-protocol-shaded module
> means
> > > precommit job would only invoke the existing compile-protobuf profile
> > > in hbase-protocol
> > > module.
> > >
> > w.r.t. path, when two protoc executables (corresponding to 2.5 and 3.x,
> > > respectively) are available, would maven know which one to pick for
> > > the hbase-protocol and hbase-protocol-shaded modules ?
> > >
> > >
> >
> > For the former, right, nothing changed.
> >
> > On the latter, it is a contrivance, a problem we do not have. The build
> > doesn't have to pick between pb 2.5 vs pb 3.1. We don't run protoc as
> part
> > of our build. It is done apart from the build by the developer and their
> > results are then checked-in. That continues as it was post-patch. We just
> > do 3.1 now instead of 2.5.
> >
> > I'll check in a bit of doc as part of this commit that hopefully will
> help
> > make this all clearer.
> >
> > St.Ack
> >
> >
> >
> >
> >
> > > Cheers
> > >
> > > On Mon, Oct 3, 2016 at 9:32 AM, Stack <stack@duboce.net> wrote:
> > >
> > > > On Mon, Oct 3, 2016 at 7:16 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Looks like compile-protobuf profile is not in
> > > > hbase-protocol-shaded/pom.xml
> > > > > (in HBASE-16264 branch)
> > > > >
> > > > >
> > > > Sorry. I don't get what you are saying here
> > > >
> > > > (The target in the new module is generate-sources. See the included
> > > README.
> > > > This step does more work now more than just generating protocs, hence
> > new
> > > > profile name.)
> > > >
> > > >
> > > >
> > > > > Seems precommit jobs should pass with the current formation.
> > > > >
> > > > >
> > > > Are you stating that this patch is likely to build? (Yes, the patch I
> > > > submitted builds).
> > > >
> > > >
> > > >
> > > > > In the future, if we add another profile for compiling proto3
> files,
> > we
> > > > > need to specify the path to proto3 compiler.
> > > > >
> > > > >
> > > >
> > > > > Please correct me if I am wrong.
> > > > >
> > > > >
> > > > I don't know what you are asking. Why do we have to specify 'paths'?
> We
> > > > don't have to currently (See the plugin we use generating protos now
> > from
> > > > hadoop). Maybe you are trying to distinguish the production of
> > > protobuf2.5
> > > > vs 3.1 protos but these are isolated by module....
> > > >
> > > >
> > > > I said I'd commit this morning but let me wait a while. There may be
> > some
> > > > more questions/objections and I'd like to have a clean build up on
> > > jenkins
> > > > here [1] before I commit (jenkins is being ornery).
> > > >
> > > > St.Ack
> > > > 1.
> > > > https://builds.apache.org/job/HBASE-16264/jdk=JDK%201.8%20(
> > > > latest),label=yahoo-not-h2/28/console
> > > >
> > > > Service Unavailable
> > > >
> > > > The server is temporarily unable to service your
> > > > request due to maintenance downtime or capacity
> > > > problems. Please try again later.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > On Mon, Oct 3, 2016 at 6:58 AM, Ted Yu <yuzhihong@gmail.com>
> wrote:
> > > > >
> > > > > > The protoc generated files (such as MasterProtos) are checked
> into
> > > > source
> > > > > > repo, right ?
> > > > > >
> > > > > > Do we need proto3 on the precommit image(s) ?
> > > > > >
> > > > > > On Mon, Oct 3, 2016 at 5:18 AM, 张铎 <palomino219@gmail.com>
> wrote:
> > > > > >
> > > > > >> Then I think we need to file an issue to change the protoc
> version
> > > > > >> installed in the precommit docker file to 3.x before the
merge.
> > > > > Otherwise
> > > > > >> the precommit build for protoc check maybe broken after
the
> > merge...
> > > > > >>
> > > > > >>
> > > > > >> 2016-10-03 1:18 GMT+08:00 Andrew Purtell <
> > andrew.purtell@gmail.com
> > > >:
> > > > > >>
> > > > > >> > I have 2.5 and 3.0 installed as /opt/protobuf-<version>,
and
> > have
> > > > bash
> > > > > >> > scripts that add the appropriate version's bin directory
to
> the
> > > > path.
> > > > > >> Not
> > > > > >> > particularly onerous as I also have to switch between
required
> > JDK
> > > > > >> > versions, so the scripts also set JAVA_HOME at and
add JDK bin
> > to
> > > > the
> > > > > >> path
> > > > > >> > for the required JDK for the build.
> > > > > >> >
> > > > > >> > Unlike with the scala compiler, which is after all
JVM
> bytecode
> > > at a
> > > > > >> > fundamental level, I don't think maven automation for
> automatic
> > > > > download
> > > > > >> > and execution is possible. protoc is a native binary.
> > > > > >> >
> > > > > >> > > On Oct 1, 2016, at 11:30 PM, 张铎 <palomino219@gmail.com>
> > wrote:
> > > > > >> > >
> > > > > >> > > Do we need to install protoc 3.0 manully before
building
> HBase
> > > or
> > > > > the
> > > > > >> > maven
> > > > > >> > > protobuf plugin will automatically download the
protoc
> > compiler?
> > > > > >> Maybe we
> > > > > >> > > need to install protoc 3.0 in the precommit docker
file.
> > > > > >> > >
> > > > > >> > > 2016-10-02 14:20 GMT+08:00 张铎 <palomino219@gmail.com>:
> > > > > >> > >
> > > > > >> > >>
> > > > > >> > >> 2016-10-02 0:50 GMT+08:00 Stack <stack@duboce.net>:
> > > > > >> > >>
> > > > > >> > >>>> On Sat, Oct 1, 2016 at 7:20 AM, 张铎
<
> palomino219@gmail.com>
> > > > > wrote:
> > > > > >> > >>>>
> > > > > >> > >>>> Can we setup a compatibility checker
job in jenkins?
> Start
> > a
> > > > > >> > >>> minicluster in
> > > > > >> > >>>> one process, and use a client in another
process to
> > > communicate
> > > > > >> with
> > > > > >> > it.
> > > > > >> > >>>> The version of the client should be
>= 0.98 and <= the
> > > version
> > > > of
> > > > > >> the
> > > > > >> > >>>> minicluster. Of course we need to
design the testing code
> > > > > >> carefully to
> > > > > >> > >>> make
> > > > > >> > >>>> sure that we have tested all the cases.
> > > > > >> > >>>>
> > > > > >> > >>>>
> > > > > >> > >>> +1. We need this up and running before
we put out an
> > > > hbase-2.0.0.
> > > > > I
> > > > > >> > know
> > > > > >> > >>> Matteo does this test manually on a regular
basis but a
> > > > > >> formalization
> > > > > >> > >>> would
> > > > > >> > >>> help. I can add an exercise of Coprocessor
Endpoints. I
> > > believe
> > > > > this
> > > > > >> > is on
> > > > > >> > >>> Dima's list of TODOs but will let him
speak for himself.
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>> And also I think we should make sure
that no proto3 only
> > > > feature
> > > > > is
> > > > > >> > >>>> introduced in our proto files until
branch-1 is dead.
> > Maybe a
> > > > > >> > precommit
> > > > > >> > >>>> check?
> > > > > >> > >>>>
> > > > > >> > >>>>
> > > > > >> > >>> I think you mean wire-format breaking
changes?  Agree. We
> > have
> > > > our
> > > > > >> PB3
> > > > > >> > set
> > > > > >> > >>> to 2.5 compat mode and yes, we can't move
on from this
> until
> > > we
> > > > > are
> > > > > >> in
> > > > > >> > a
> > > > > >> > >>> place where we can say no to 2.5 clients.
> > > > > >> > >>>
> > > > > >> > >> Yes, for example, pb2.5 does not support map
so we should
> not
> > > use
> > > > > >> map in
> > > > > >> > >> our proto files.
> > > > > >> > >>
> > > > > >> > >>>
> > > > > >> > >>> Making use of PB3isms cannot be avoided.
PB3.1 adds a
> native
> > > > > >> > replacement
> > > > > >> > >>> for our HBaseZeroCopyByteString/ByteStringer
hack. It
> also
> > > adds
> > > > > >> > 'unsafe'
> > > > > >> > >>> methods that we need to exploit if we
are to keep our
> > > read/write
> > > > > >> paths
> > > > > >> > >>> offheap.
> > > > > >> > >>>
> > > > > >> > >>> St.Ack
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>> Thanks.
> > > > > >> > >>>>
> > > > > >> > >>>> 2016-10-01 11:55 GMT+08:00 Sean Busbey
<
> > busbey@cloudera.com
> > > >:
> > > > > >> > >>>>
> > > > > >> > >>>>> have we experimentally confirmed
that wire compatibility
> > is
> > > > > >> > >>>>> maintained? I saw one mention
of expecting wire
> > > compatibility
> > > > to
> > > > > >> be
> > > > > >> > >>>>> fine, but nothing with someone
using e.g. the
> clusterdock
> > > work
> > > > > or
> > > > > >> > >>>>> something to mix servers / clients
or do replication.
> > > > > >> > >>>>>
> > > > > >> > >>>>>> On Fri, Sep 30, 2016 at 6:30
PM, Stack <
> stack@duboce.net
> > >
> > > > > wrote:
> > > > > >> > >>>>>> I intend to do a mass commit
late this weekend that
> moves
> > > us
> > > > on
> > > > > >> to a
> > > > > >> > >>>>> shaded
> > > > > >> > >>>>>> protobuf-3.1.0, either Sunday
night or Monday morning.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> If objection, please speak
up or if need more time for
> > > > > >> > >>>>>> consideration/review, just
shout.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> I want to merge the branch
HBASE-16264 into master (it
> is
> > > > > running
> > > > > >> > >>> here
> > > > > >> > >>>> up
> > > > > >> > >>>>>> on jenkins https://builds.apache.org/view
> > > > > >> /H-L/view/HBase/job/HBASE-
> > > > > >> > >>>>> 16264/).
> > > > > >> > >>>>>> The branch at HBASE-16264
has three significant
> > > > bodies-of-work
> > > > > >> that
> > > > > >> > >>>>>> unfortunately are tangled
and can only go in of a
> piece.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> * HBASE-16264 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16264
> > > > > >> >
> > > > > >> > >>> The
> > > > > >> > >>>>>> shading of our protobuf usage
so we can upgrade and/or
> > run
> > > > > with a
> > > > > >> > >>>> patched
> > > > > >> > >>>>>> protobuf WITHOUT breaking
REST, Spark, and in
> particular,
> > > > > >> > >>> Coprocessor
> > > > > >> > >>>>>> Endpoints.
> > > > > >> > >>>>>> * HBASE-16567 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16567
> > > > > >> >
> > > > > >> > >>> A
> > > > > >> > >>>>> move
> > > > > >> > >>>>>> up on to (shaded) protobuf-3.1.0
> > > > > >> > >>>>>> * HBASE-16741 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16741
> > > > > >> >
> > > > > >> > >>> An
> > > > > >> > >>>>>> amendment of our generate
protobufs step to include
> > shading
> > > > > and a
> > > > > >> > >>>>> bundling
> > > > > >> > >>>>>> of protobuf src (with a means
of calling a patch srcs
> > hook)
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Together we're talking about
40MB of change mostly made
> > of
> > > > the
> > > > > >> > >>> movement
> > > > > >> > >>>>> of
> > > > > >> > >>>>>> generated files or the application
of a pattern that
> > alters
> > > > > >> where we
> > > > > >> > >>>> get
> > > > > >> > >>>>>> imports from. When done, you
should notice no
> difference
> > > and
> > > > > >> should
> > > > > >> > >>> be
> > > > > >> > >>>>> able
> > > > > >> > >>>>>> to go about your business
as per usual. Upside is that
> we
> > > > will
> > > > > be
> > > > > >> > >>> able
> > > > > >> > >>>> to
> > > > > >> > >>>>>> avoid coming onheap doing
protobuf
> > > marshalling/unmarshalling
> > > > as
> > > > > >> > >>>> protobuf
> > > > > >> > >>>>>> 2.5.0 requires. Downside is
that we repeat a good
> portion
> > > of
> > > > > our
> > > > > >> > >>>> internal
> > > > > >> > >>>>>> protos, once non-shaded so
Coprocessor Endpoints can
> keep
> > > > > working
> > > > > >> > >>> and
> > > > > >> > >>>>> then
> > > > > >> > >>>>>> again as shaded for internal
use.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> I provide some more overview
below on the changes. See
> > the
> > > > > >> shading
> > > > > >> > >>> doc
> > > > > >> > >>>>>> here:
> > > > > >> > >>>>>> https://docs.google.com/document/d/
> > > 1H4NgLXQ9Y9KejwobddCqaVME
> > > > > >> DCGby
> > > > > >> > >>>>> DcXtdF5iAfDIEk/edit#
> > > > > >> > >>>>>> for more detail (Patches are
up on review board --
> except
> > > the
> > > > > >> latest
> > > > > >> > >>>>>> HBASE-16264 which is too big
for JIRA and RB). I am
> > > currently
> > > > > >> > >>> working
> > > > > >> > >>>> on
> > > > > >> > >>>>> a
> > > > > >> > >>>>>> devs chapter for the book
on protobuf going forward
> that
> > > will
> > > > > go
> > > > > >> in
> > > > > >> > >>> as
> > > > > >> > >>>>> part
> > > > > >> > >>>>>> of this patch.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Thanks,
> > > > > >> > >>>>>> St.Ack
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Items of note:
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> * Two new modules; one named
hbase-protocol-shaded that
> > is
> > > > used
> > > > > >> by
> > > > > >> > >>>> hbase
> > > > > >> > >>>>>> core. It has in it a shaded
(and later patched)
> protobuf.
> > > The
> > > > > >> other
> > > > > >> > >>> new
> > > > > >> > >>>>>> module is hbase-endpoint which
goes after hbase-server
> > and
> > > > has
> > > > > >> those
> > > > > >> > >>>>>> bundled endpoints that I was
able to break out of core
> > > (there
> > > > > >> are a
> > > > > >> > >>> few
> > > > > >> > >>>>>> that are hopelessly entangled
that need to be undone as
> > > CPEPs
> > > > > but
> > > > > >> > >>>>>> fortunately belong in core:
Auth, Access, MultiRow).
> > > > > >> > >>>>>> * I've tested running a branch-1
CPEP against a master
> > with
> > > > > these
> > > > > >> > >>>>> patches
> > > > > >> > >>>>>> in place and stuff like ACL
(A CPEP) run from the
> > branch-1
> > > > > shell
> > > > > >> > >>> work
> > > > > >> > >>>>>> against the branch-2 server.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>> On Mon, Aug 22, 2016 at
5:20 PM, Stack <
> > stack@duboce.net>
> > > > > >> wrote:
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>> This project goes on.
I updated HBASE-1563 "Shade
> > > protobuf"
> > > > > with
> > > > > >> > >>> some
> > > > > >> > >>>>> doc
> > > > > >> > >>>>>>> on a final approach. We
need to be able to refer to
> both
> > > > > shaded
> > > > > >> and
> > > > > >> > >>>>>>> non-shaded protobuf so
we can support sending HDFS
> > > > old-school
> > > > > pb
> > > > > >> > >>>>> Messages
> > > > > >> > >>>>>>> but also so Coprocessor
Endpoints keep working though
> > > > > internally
> > > > > >> > >>>>> protobufs
> > > > > >> > >>>>>>> have been relocated. Funny
you should ask, but yes,
> > there
> > > > are
> > > > > >> some
> > > > > >> > >>>>>>> downsides (as predicted
by contributors on the JIRA).
> > I'd
> > > be
> > > > > >> > >>>> interested
> > > > > >> > >>>>> to
> > > > > >> > >>>>>>> hear if they are too burdensome.
In particular, your
> IDE
> > > > > >> experience
> > > > > >> > >>>>> gets a
> > > > > >> > >>>>>>> little convoluted as you
will need to add to your
> build
> > > > path,
> > > > > a
> > > > > >> jar
> > > > > >> > >>>> with
> > > > > >> > >>>>>>> the relocated pbs. A pain.
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>> Thanks,
> > > > > >> > >>>>>>> St.Ack
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>> On Wed, Apr 13, 2016
at 6:09 AM, Stack <
> > stack@duboce.net
> > > >
> > > > > >> wrote:
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>> On Tue, Apr 12, 2016
at 9:26 PM, Sean Busbey <
> > > > > >> busbey@apache.org>
> > > > > >> > >>>>> wrote:
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>>>> On Tue, Apr
12, 2016 at 6:17 PM, Stack <
> > > stack@duboce.net
> > > > >
> > > > > >> > wrote:
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>> On an initial
pass, the only difficult part seems
> to
> > be
> > > > > >> > >>>> interaction
> > > > > >> > >>>>>>>>> with
> > > > > >> > >>>>>>>>>> HDFS in asyncwal
(might just pull in the HDFS
> > > messages).
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>
> > > > > >> > >>>>>>>>> I have some idea
how we can make this work either by
> > > > pushing
> > > > > >> > >>>> asyncwal
> > > > > >> > >>>>>>>>> upstream to HDFS
or through some maven tricks,
> > depending
> > > > on
> > > > > >> how
> > > > > >> > >>> much
> > > > > >> > >>>>>>>>> time we have.
> > > > > >> > >>>>>>>>>
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>> Maven tricks? Tell
us more. Here or drop a note up in
> > the
> > > > > >> issue.
> > > > > >> > >>>>>>>> Thanks Sean,
> > > > > >> > >>>>>>>> St.Ack
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>> --
> > > > > >> > >>>>> busbey
> > > > > >> > >>>>>
> > > > > >> > >>>>
> > > > > >> > >>>
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message