hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: [NOTICE] Merge of shaded protobuf 3.1.0 (WAS => [DISCUSS] Shade protobuf so we can move to a newer version)
Date Tue, 04 Oct 2016 03:31:37 GMT
On Mon, Oct 3, 2016 at 5:40 PM, 张铎 <palomino219@gmail.com> wrote:

> OK, this means we will still use pb2.5 for hbase-protocol? And
> hbase-protocol-shaded is maintained manually? I still think the precommit
> job should have the ability to check the hbase-protocol-shaded...
>
>
Yes. For now. Let me test some more. I think moving all protos to be pb3.1
will work but will do that after this patch lands (with its pb3.1 for
internal use and pb2.5 for CPEPs, REST, etc.). All on pb3.1 would make for
a cleaner story.



> I skimmed the branch, the proto files are placed both in hbase-protocol and
> hbase-protocol-shaded?



Yes.  A subset are duplicated. I expended some effort minimizing the set
but our proto files are way to granular. Files like Client.proto and
HBase.proto are way too big with too many concerns all collected together
inside them. It makes it tough drawing the line between internal proto use
and CPEP protos. In the absence, we have a larger overlap between protos
used internally and those exposed for CPEP use.

Going forward, I'll add to the doc that many small files is better than a
few big ones.



> The one in hbase-protocol is only used to support
> coprocessor? How can we sync them? Also manually? Or they do not need to be
> synced?
>
>
It'd be tough syncing. They are not exactly the same file given the build
into different locations. It is not the end-of-the world if they aren't
sync'd; the CPEP will keep working. Better is that we abandon
hbase-protocol and start up a new module with published protos that we will
honor with the annotation public.

This project has revealed a bunch of API surface previously was underground.

Let me file some issues.

Thanks Duo,

St.Ack



> Thanks.
>
> 2016-10-04 2:09 GMT+08:00 Stack <stack@duboce.net>:
>
> > On Mon, Oct 3, 2016 at 9:41 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > The precommit job uses compile-protobuf profile for verification. The
> > > absence of compile-protobuf profile in hbase-protocol-shaded module
> means
> > > precommit job would only invoke the existing compile-protobuf profile
> > > in hbase-protocol
> > > module.
> > >
> > w.r.t. path, when two protoc executables (corresponding to 2.5 and 3.x,
> > > respectively) are available, would maven know which one to pick for
> > > the hbase-protocol and hbase-protocol-shaded modules ?
> > >
> > >
> >
> > For the former, right, nothing changed.
> >
>
> > On the latter, it is a contrivance, a problem we do not have. The build
> > doesn't have to pick between pb 2.5 vs pb 3.1. We don't run protoc as
> part
> > of our build. It is done apart from the build by the developer and their
> > results are then checked-in. That continues as it was post-patch. We just
> > do 3.1 now instead of 2.5.
> >
> > I'll check in a bit of doc as part of this commit that hopefully will
> help
> > make this all clearer.
> >
> > St.Ack
> >
> >
> >
> >
> >
> > > Cheers
> > >
> > > On Mon, Oct 3, 2016 at 9:32 AM, Stack <stack@duboce.net> wrote:
> > >
> > > > On Mon, Oct 3, 2016 at 7:16 AM, Ted Yu <yuzhihong@gmail.com> wrote:
> > > >
> > > > > Looks like compile-protobuf profile is not in
> > > > hbase-protocol-shaded/pom.xml
> > > > > (in HBASE-16264 branch)
> > > > >
> > > > >
> > > > Sorry. I don't get what you are saying here
> > > >
> > > > (The target in the new module is generate-sources. See the included
> > > README.
> > > > This step does more work now more than just generating protocs, hence
> > new
> > > > profile name.)
> > > >
> > > >
> > > >
> > > > > Seems precommit jobs should pass with the current formation.
> > > > >
> > > > >
> > > > Are you stating that this patch is likely to build? (Yes, the patch I
> > > > submitted builds).
> > > >
> > > >
> > > >
> > > > > In the future, if we add another profile for compiling proto3
> files,
> > we
> > > > > need to specify the path to proto3 compiler.
> > > > >
> > > > >
> > > >
> > > > > Please correct me if I am wrong.
> > > > >
> > > > >
> > > > I don't know what you are asking. Why do we have to specify 'paths'?
> We
> > > > don't have to currently (See the plugin we use generating protos now
> > from
> > > > hadoop). Maybe you are trying to distinguish the production of
> > > protobuf2.5
> > > > vs 3.1 protos but these are isolated by module....
> > > >
> > > >
> > > > I said I'd commit this morning but let me wait a while. There may be
> > some
> > > > more questions/objections and I'd like to have a clean build up on
> > > jenkins
> > > > here [1] before I commit (jenkins is being ornery).
> > > >
> > > > St.Ack
> > > > 1.
> > > > https://builds.apache.org/job/HBASE-16264/jdk=JDK%201.8%20(
> > > > latest),label=yahoo-not-h2/28/console
> > > >
> > > > Service Unavailable
> > > >
> > > > The server is temporarily unable to service your
> > > > request due to maintenance downtime or capacity
> > > > problems. Please try again later.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > > On Mon, Oct 3, 2016 at 6:58 AM, Ted Yu <yuzhihong@gmail.com>
> wrote:
> > > > >
> > > > > > The protoc generated files (such as MasterProtos) are checked
> into
> > > > source
> > > > > > repo, right ?
> > > > > >
> > > > > > Do we need proto3 on the precommit image(s) ?
> > > > > >
> > > > > > On Mon, Oct 3, 2016 at 5:18 AM, 张铎 <palomino219@gmail.com>
> wrote:
> > > > > >
> > > > > >> Then I think we need to file an issue to change the protoc
> version
> > > > > >> installed in the precommit docker file to 3.x before the
merge.
> > > > > Otherwise
> > > > > >> the precommit build for protoc check maybe broken after
the
> > merge...
> > > > > >>
> > > > > >>
> > > > > >> 2016-10-03 1:18 GMT+08:00 Andrew Purtell <
> > andrew.purtell@gmail.com
> > > >:
> > > > > >>
> > > > > >> > I have 2.5 and 3.0 installed as /opt/protobuf-<version>,
and
> > have
> > > > bash
> > > > > >> > scripts that add the appropriate version's bin directory
to
> the
> > > > path.
> > > > > >> Not
> > > > > >> > particularly onerous as I also have to switch between
required
> > JDK
> > > > > >> > versions, so the scripts also set JAVA_HOME at and
add JDK bin
> > to
> > > > the
> > > > > >> path
> > > > > >> > for the required JDK for the build.
> > > > > >> >
> > > > > >> > Unlike with the scala compiler, which is after all
JVM
> bytecode
> > > at a
> > > > > >> > fundamental level, I don't think maven automation for
> automatic
> > > > > download
> > > > > >> > and execution is possible. protoc is a native binary.
> > > > > >> >
> > > > > >> > > On Oct 1, 2016, at 11:30 PM, 张铎 <palomino219@gmail.com>
> > wrote:
> > > > > >> > >
> > > > > >> > > Do we need to install protoc 3.0 manully before
building
> HBase
> > > or
> > > > > the
> > > > > >> > maven
> > > > > >> > > protobuf plugin will automatically download the
protoc
> > compiler?
> > > > > >> Maybe we
> > > > > >> > > need to install protoc 3.0 in the precommit docker
file.
> > > > > >> > >
> > > > > >> > > 2016-10-02 14:20 GMT+08:00 张铎 <palomino219@gmail.com>:
> > > > > >> > >
> > > > > >> > >>
> > > > > >> > >> 2016-10-02 0:50 GMT+08:00 Stack <stack@duboce.net>:
> > > > > >> > >>
> > > > > >> > >>>> On Sat, Oct 1, 2016 at 7:20 AM, 张铎
<
> palomino219@gmail.com>
> > > > > wrote:
> > > > > >> > >>>>
> > > > > >> > >>>> Can we setup a compatibility checker
job in jenkins?
> Start
> > a
> > > > > >> > >>> minicluster in
> > > > > >> > >>>> one process, and use a client in another
process to
> > > communicate
> > > > > >> with
> > > > > >> > it.
> > > > > >> > >>>> The version of the client should be
>= 0.98 and <= the
> > > version
> > > > of
> > > > > >> the
> > > > > >> > >>>> minicluster. Of course we need to
design the testing code
> > > > > >> carefully to
> > > > > >> > >>> make
> > > > > >> > >>>> sure that we have tested all the cases.
> > > > > >> > >>>>
> > > > > >> > >>>>
> > > > > >> > >>> +1. We need this up and running before
we put out an
> > > > hbase-2.0.0.
> > > > > I
> > > > > >> > know
> > > > > >> > >>> Matteo does this test manually on a regular
basis but a
> > > > > >> formalization
> > > > > >> > >>> would
> > > > > >> > >>> help. I can add an exercise of Coprocessor
Endpoints. I
> > > believe
> > > > > this
> > > > > >> > is on
> > > > > >> > >>> Dima's list of TODOs but will let him
speak for himself.
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>> And also I think we should make sure
that no proto3 only
> > > > feature
> > > > > is
> > > > > >> > >>>> introduced in our proto files until
branch-1 is dead.
> > Maybe a
> > > > > >> > precommit
> > > > > >> > >>>> check?
> > > > > >> > >>>>
> > > > > >> > >>>>
> > > > > >> > >>> I think you mean wire-format breaking
changes?  Agree. We
> > have
> > > > our
> > > > > >> PB3
> > > > > >> > set
> > > > > >> > >>> to 2.5 compat mode and yes, we can't move
on from this
> until
> > > we
> > > > > are
> > > > > >> in
> > > > > >> > a
> > > > > >> > >>> place where we can say no to 2.5 clients.
> > > > > >> > >>>
> > > > > >> > >> Yes, for example, pb2.5 does not support map
so we should
> not
> > > use
> > > > > >> map in
> > > > > >> > >> our proto files.
> > > > > >> > >>
> > > > > >> > >>>
> > > > > >> > >>> Making use of PB3isms cannot be avoided.
PB3.1 adds a
> native
> > > > > >> > replacement
> > > > > >> > >>> for our HBaseZeroCopyByteString/ByteStringer
hack. It
> also
> > > adds
> > > > > >> > 'unsafe'
> > > > > >> > >>> methods that we need to exploit if we
are to keep our
> > > read/write
> > > > > >> paths
> > > > > >> > >>> offheap.
> > > > > >> > >>>
> > > > > >> > >>> St.Ack
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>
> > > > > >> > >>>> Thanks.
> > > > > >> > >>>>
> > > > > >> > >>>> 2016-10-01 11:55 GMT+08:00 Sean Busbey
<
> > busbey@cloudera.com
> > > >:
> > > > > >> > >>>>
> > > > > >> > >>>>> have we experimentally confirmed
that wire compatibility
> > is
> > > > > >> > >>>>> maintained? I saw one mention
of expecting wire
> > > compatibility
> > > > to
> > > > > >> be
> > > > > >> > >>>>> fine, but nothing with someone
using e.g. the
> clusterdock
> > > work
> > > > > or
> > > > > >> > >>>>> something to mix servers / clients
or do replication.
> > > > > >> > >>>>>
> > > > > >> > >>>>>> On Fri, Sep 30, 2016 at 6:30
PM, Stack <
> stack@duboce.net
> > >
> > > > > wrote:
> > > > > >> > >>>>>> I intend to do a mass commit
late this weekend that
> moves
> > > us
> > > > on
> > > > > >> to a
> > > > > >> > >>>>> shaded
> > > > > >> > >>>>>> protobuf-3.1.0, either Sunday
night or Monday morning.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> If objection, please speak
up or if need more time for
> > > > > >> > >>>>>> consideration/review, just
shout.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> I want to merge the branch
HBASE-16264 into master (it
> is
> > > > > running
> > > > > >> > >>> here
> > > > > >> > >>>> up
> > > > > >> > >>>>>> on jenkins https://builds.apache.org/view
> > > > > >> /H-L/view/HBase/job/HBASE-
> > > > > >> > >>>>> 16264/).
> > > > > >> > >>>>>> The branch at HBASE-16264
has three significant
> > > > bodies-of-work
> > > > > >> that
> > > > > >> > >>>>>> unfortunately are tangled
and can only go in of a
> piece.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> * HBASE-16264 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16264
> > > > > >> >
> > > > > >> > >>> The
> > > > > >> > >>>>>> shading of our protobuf usage
so we can upgrade and/or
> > run
> > > > > with a
> > > > > >> > >>>> patched
> > > > > >> > >>>>>> protobuf WITHOUT breaking
REST, Spark, and in
> particular,
> > > > > >> > >>> Coprocessor
> > > > > >> > >>>>>> Endpoints.
> > > > > >> > >>>>>> * HBASE-16567 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16567
> > > > > >> >
> > > > > >> > >>> A
> > > > > >> > >>>>> move
> > > > > >> > >>>>>> up on to (shaded) protobuf-3.1.0
> > > > > >> > >>>>>> * HBASE-16741 <https://issues.apache.org/
> > > > > jira/browse/HBASE-16741
> > > > > >> >
> > > > > >> > >>> An
> > > > > >> > >>>>>> amendment of our generate
protobufs step to include
> > shading
> > > > > and a
> > > > > >> > >>>>> bundling
> > > > > >> > >>>>>> of protobuf src (with a means
of calling a patch srcs
> > hook)
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Together we're talking about
40MB of change mostly made
> > of
> > > > the
> > > > > >> > >>> movement
> > > > > >> > >>>>> of
> > > > > >> > >>>>>> generated files or the application
of a pattern that
> > alters
> > > > > >> where we
> > > > > >> > >>>> get
> > > > > >> > >>>>>> imports from. When done, you
should notice no
> difference
> > > and
> > > > > >> should
> > > > > >> > >>> be
> > > > > >> > >>>>> able
> > > > > >> > >>>>>> to go about your business
as per usual. Upside is that
> we
> > > > will
> > > > > be
> > > > > >> > >>> able
> > > > > >> > >>>> to
> > > > > >> > >>>>>> avoid coming onheap doing
protobuf
> > > marshalling/unmarshalling
> > > > as
> > > > > >> > >>>> protobuf
> > > > > >> > >>>>>> 2.5.0 requires. Downside is
that we repeat a good
> portion
> > > of
> > > > > our
> > > > > >> > >>>> internal
> > > > > >> > >>>>>> protos, once non-shaded so
Coprocessor Endpoints can
> keep
> > > > > working
> > > > > >> > >>> and
> > > > > >> > >>>>> then
> > > > > >> > >>>>>> again as shaded for internal
use.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> I provide some more overview
below on the changes. See
> > the
> > > > > >> shading
> > > > > >> > >>> doc
> > > > > >> > >>>>>> here:
> > > > > >> > >>>>>> https://docs.google.com/document/d/
> > > 1H4NgLXQ9Y9KejwobddCqaVME
> > > > > >> DCGby
> > > > > >> > >>>>> DcXtdF5iAfDIEk/edit#
> > > > > >> > >>>>>> for more detail (Patches are
up on review board --
> except
> > > the
> > > > > >> latest
> > > > > >> > >>>>>> HBASE-16264 which is too big
for JIRA and RB). I am
> > > currently
> > > > > >> > >>> working
> > > > > >> > >>>> on
> > > > > >> > >>>>> a
> > > > > >> > >>>>>> devs chapter for the book
on protobuf going forward
> that
> > > will
> > > > > go
> > > > > >> in
> > > > > >> > >>> as
> > > > > >> > >>>>> part
> > > > > >> > >>>>>> of this patch.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Thanks,
> > > > > >> > >>>>>> St.Ack
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> Items of note:
> > > > > >> > >>>>>>
> > > > > >> > >>>>>> * Two new modules; one named
hbase-protocol-shaded that
> > is
> > > > used
> > > > > >> by
> > > > > >> > >>>> hbase
> > > > > >> > >>>>>> core. It has in it a shaded
(and later patched)
> protobuf.
> > > The
> > > > > >> other
> > > > > >> > >>> new
> > > > > >> > >>>>>> module is hbase-endpoint which
goes after hbase-server
> > and
> > > > has
> > > > > >> those
> > > > > >> > >>>>>> bundled endpoints that I was
able to break out of core
> > > (there
> > > > > >> are a
> > > > > >> > >>> few
> > > > > >> > >>>>>> that are hopelessly entangled
that need to be undone as
> > > CPEPs
> > > > > but
> > > > > >> > >>>>>> fortunately belong in core:
Auth, Access, MultiRow).
> > > > > >> > >>>>>> * I've tested running a branch-1
CPEP against a master
> > with
> > > > > these
> > > > > >> > >>>>> patches
> > > > > >> > >>>>>> in place and stuff like ACL
(A CPEP) run from the
> > branch-1
> > > > > shell
> > > > > >> > >>> work
> > > > > >> > >>>>>> against the branch-2 server.
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>
> > > > > >> > >>>>>>> On Mon, Aug 22, 2016 at
5:20 PM, Stack <
> > stack@duboce.net>
> > > > > >> wrote:
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>> This project goes on.
I updated HBASE-1563 "Shade
> > > protobuf"
> > > > > with
> > > > > >> > >>> some
> > > > > >> > >>>>> doc
> > > > > >> > >>>>>>> on a final approach. We
need to be able to refer to
> both
> > > > > shaded
> > > > > >> and
> > > > > >> > >>>>>>> non-shaded protobuf so
we can support sending HDFS
> > > > old-school
> > > > > pb
> > > > > >> > >>>>> Messages
> > > > > >> > >>>>>>> but also so Coprocessor
Endpoints keep working though
> > > > > internally
> > > > > >> > >>>>> protobufs
> > > > > >> > >>>>>>> have been relocated. Funny
you should ask, but yes,
> > there
> > > > are
> > > > > >> some
> > > > > >> > >>>>>>> downsides (as predicted
by contributors on the JIRA).
> > I'd
> > > be
> > > > > >> > >>>> interested
> > > > > >> > >>>>> to
> > > > > >> > >>>>>>> hear if they are too burdensome.
In particular, your
> IDE
> > > > > >> experience
> > > > > >> > >>>>> gets a
> > > > > >> > >>>>>>> little convoluted as you
will need to add to your
> build
> > > > path,
> > > > > a
> > > > > >> jar
> > > > > >> > >>>> with
> > > > > >> > >>>>>>> the relocated pbs. A pain.
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>> Thanks,
> > > > > >> > >>>>>>> St.Ack
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>> On Wed, Apr 13, 2016
at 6:09 AM, Stack <
> > stack@duboce.net
> > > >
> > > > > >> wrote:
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>> On Tue, Apr 12, 2016
at 9:26 PM, Sean Busbey <
> > > > > >> busbey@apache.org>
> > > > > >> > >>>>> wrote:
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>>>> On Tue, Apr
12, 2016 at 6:17 PM, Stack <
> > > stack@duboce.net
> > > > >
> > > > > >> > wrote:
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>> On an initial
pass, the only difficult part seems
> to
> > be
> > > > > >> > >>>> interaction
> > > > > >> > >>>>>>>>> with
> > > > > >> > >>>>>>>>>> HDFS in asyncwal
(might just pull in the HDFS
> > > messages).
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>>
> > > > > >> > >>>>>>>>>
> > > > > >> > >>>>>>>>> I have some idea
how we can make this work either by
> > > > pushing
> > > > > >> > >>>> asyncwal
> > > > > >> > >>>>>>>>> upstream to HDFS
or through some maven tricks,
> > depending
> > > > on
> > > > > >> how
> > > > > >> > >>> much
> > > > > >> > >>>>>>>>> time we have.
> > > > > >> > >>>>>>>>>
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>> Maven tricks? Tell
us more. Here or drop a note up in
> > the
> > > > > >> issue.
> > > > > >> > >>>>>>>> Thanks Sean,
> > > > > >> > >>>>>>>> St.Ack
> > > > > >> > >>>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>>
> > > > > >> > >>>>> --
> > > > > >> > >>>>> busbey
> > > > > >> > >>>>>
> > > > > >> > >>>>
> > > > > >> > >>>
> > > > > >> > >>
> > > > > >> > >>
> > > > > >> >
> > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message