hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: [NOTICE] Merge of shaded protobuf 3.1.0 (WAS => [DISCUSS] Shade protobuf so we can move to a newer version)
Date Tue, 04 Oct 2016 05:18:19 GMT
I just pushed the below set of patches. I'll be keeping an eye out on the
build.  Hopefully not too many teething issues.
St.Ack

On Mon, Oct 3, 2016 at 8:31 PM, Stack <stack@duboce.net> wrote:

> On Mon, Oct 3, 2016 at 5:40 PM, 张铎 <palomino219@gmail.com> wrote:
>
>> OK, this means we will still use pb2.5 for hbase-protocol? And
>> hbase-protocol-shaded is maintained manually? I still think the precommit
>> job should have the ability to check the hbase-protocol-shaded...
>>
>>
> Yes. For now. Let me test some more. I think moving all protos to be pb3.1
> will work but will do that after this patch lands (with its pb3.1 for
> internal use and pb2.5 for CPEPs, REST, etc.). All on pb3.1 would make for
> a cleaner story.
>
>
>
>> I skimmed the branch, the proto files are placed both in hbase-protocol
>> and
>> hbase-protocol-shaded?
>
>
>
> Yes.  A subset are duplicated. I expended some effort minimizing the set
> but our proto files are way to granular. Files like Client.proto and
> HBase.proto are way too big with too many concerns all collected together
> inside them. It makes it tough drawing the line between internal proto use
> and CPEP protos. In the absence, we have a larger overlap between protos
> used internally and those exposed for CPEP use.
>
> Going forward, I'll add to the doc that many small files is better than a
> few big ones.
>
>
>
>> The one in hbase-protocol is only used to support
>> coprocessor? How can we sync them? Also manually? Or they do not need to
>> be
>> synced?
>>
>>
> It'd be tough syncing. They are not exactly the same file given the build
> into different locations. It is not the end-of-the world if they aren't
> sync'd; the CPEP will keep working. Better is that we abandon
> hbase-protocol and start up a new module with published protos that we will
> honor with the annotation public.
>
> This project has revealed a bunch of API surface previously was
> underground.
>
> Let me file some issues.
>
> Thanks Duo,
>
> St.Ack
>
>
>
>> Thanks.
>>
>> 2016-10-04 2:09 GMT+08:00 Stack <stack@duboce.net>:
>>
>> > On Mon, Oct 3, 2016 at 9:41 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> >
>> > > The precommit job uses compile-protobuf profile for verification. The
>> > > absence of compile-protobuf profile in hbase-protocol-shaded module
>> means
>> > > precommit job would only invoke the existing compile-protobuf profile
>> > > in hbase-protocol
>> > > module.
>> > >
>> > w.r.t. path, when two protoc executables (corresponding to 2.5 and 3.x,
>> > > respectively) are available, would maven know which one to pick for
>> > > the hbase-protocol and hbase-protocol-shaded modules ?
>> > >
>> > >
>> >
>> > For the former, right, nothing changed.
>> >
>>
>> > On the latter, it is a contrivance, a problem we do not have. The build
>> > doesn't have to pick between pb 2.5 vs pb 3.1. We don't run protoc as
>> part
>> > of our build. It is done apart from the build by the developer and their
>> > results are then checked-in. That continues as it was post-patch. We
>> just
>> > do 3.1 now instead of 2.5.
>> >
>> > I'll check in a bit of doc as part of this commit that hopefully will
>> help
>> > make this all clearer.
>> >
>> > St.Ack
>> >
>> >
>> >
>> >
>> >
>> > > Cheers
>> > >
>> > > On Mon, Oct 3, 2016 at 9:32 AM, Stack <stack@duboce.net> wrote:
>> > >
>> > > > On Mon, Oct 3, 2016 at 7:16 AM, Ted Yu <yuzhihong@gmail.com>
wrote:
>> > > >
>> > > > > Looks like compile-protobuf profile is not in
>> > > > hbase-protocol-shaded/pom.xml
>> > > > > (in HBASE-16264 branch)
>> > > > >
>> > > > >
>> > > > Sorry. I don't get what you are saying here
>> > > >
>> > > > (The target in the new module is generate-sources. See the included
>> > > README.
>> > > > This step does more work now more than just generating protocs,
>> hence
>> > new
>> > > > profile name.)
>> > > >
>> > > >
>> > > >
>> > > > > Seems precommit jobs should pass with the current formation.
>> > > > >
>> > > > >
>> > > > Are you stating that this patch is likely to build? (Yes, the patch
>> I
>> > > > submitted builds).
>> > > >
>> > > >
>> > > >
>> > > > > In the future, if we add another profile for compiling proto3
>> files,
>> > we
>> > > > > need to specify the path to proto3 compiler.
>> > > > >
>> > > > >
>> > > >
>> > > > > Please correct me if I am wrong.
>> > > > >
>> > > > >
>> > > > I don't know what you are asking. Why do we have to specify
>> 'paths'? We
>> > > > don't have to currently (See the plugin we use generating protos now
>> > from
>> > > > hadoop). Maybe you are trying to distinguish the production of
>> > > protobuf2.5
>> > > > vs 3.1 protos but these are isolated by module....
>> > > >
>> > > >
>> > > > I said I'd commit this morning but let me wait a while. There may
be
>> > some
>> > > > more questions/objections and I'd like to have a clean build up on
>> > > jenkins
>> > > > here [1] before I commit (jenkins is being ornery).
>> > > >
>> > > > St.Ack
>> > > > 1.
>> > > > https://builds.apache.org/job/HBASE-16264/jdk=JDK%201.8%20(
>> > > > latest),label=yahoo-not-h2/28/console
>> > > >
>> > > > Service Unavailable
>> > > >
>> > > > The server is temporarily unable to service your
>> > > > request due to maintenance downtime or capacity
>> > > > problems. Please try again later.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > > On Mon, Oct 3, 2016 at 6:58 AM, Ted Yu <yuzhihong@gmail.com>
>> wrote:
>> > > > >
>> > > > > > The protoc generated files (such as MasterProtos) are checked
>> into
>> > > > source
>> > > > > > repo, right ?
>> > > > > >
>> > > > > > Do we need proto3 on the precommit image(s) ?
>> > > > > >
>> > > > > > On Mon, Oct 3, 2016 at 5:18 AM, 张铎 <palomino219@gmail.com>
>> wrote:
>> > > > > >
>> > > > > >> Then I think we need to file an issue to change the
protoc
>> version
>> > > > > >> installed in the precommit docker file to 3.x before
the merge.
>> > > > > Otherwise
>> > > > > >> the precommit build for protoc check maybe broken after
the
>> > merge...
>> > > > > >>
>> > > > > >>
>> > > > > >> 2016-10-03 1:18 GMT+08:00 Andrew Purtell <
>> > andrew.purtell@gmail.com
>> > > >:
>> > > > > >>
>> > > > > >> > I have 2.5 and 3.0 installed as /opt/protobuf-<version>,
and
>> > have
>> > > > bash
>> > > > > >> > scripts that add the appropriate version's bin
directory to
>> the
>> > > > path.
>> > > > > >> Not
>> > > > > >> > particularly onerous as I also have to switch between
>> required
>> > JDK
>> > > > > >> > versions, so the scripts also set JAVA_HOME at
and add JDK
>> bin
>> > to
>> > > > the
>> > > > > >> path
>> > > > > >> > for the required JDK for the build.
>> > > > > >> >
>> > > > > >> > Unlike with the scala compiler, which is after
all JVM
>> bytecode
>> > > at a
>> > > > > >> > fundamental level, I don't think maven automation
for
>> automatic
>> > > > > download
>> > > > > >> > and execution is possible. protoc is a native binary.
>> > > > > >> >
>> > > > > >> > > On Oct 1, 2016, at 11:30 PM, 张铎 <palomino219@gmail.com>
>> > wrote:
>> > > > > >> > >
>> > > > > >> > > Do we need to install protoc 3.0 manully before
building
>> HBase
>> > > or
>> > > > > the
>> > > > > >> > maven
>> > > > > >> > > protobuf plugin will automatically download
the protoc
>> > compiler?
>> > > > > >> Maybe we
>> > > > > >> > > need to install protoc 3.0 in the precommit
docker file.
>> > > > > >> > >
>> > > > > >> > > 2016-10-02 14:20 GMT+08:00 张铎 <palomino219@gmail.com>:
>> > > > > >> > >
>> > > > > >> > >>
>> > > > > >> > >> 2016-10-02 0:50 GMT+08:00 Stack <stack@duboce.net>:
>> > > > > >> > >>
>> > > > > >> > >>>> On Sat, Oct 1, 2016 at 7:20 AM,
张铎 <
>> palomino219@gmail.com>
>> > > > > wrote:
>> > > > > >> > >>>>
>> > > > > >> > >>>> Can we setup a compatibility checker
job in jenkins?
>> Start
>> > a
>> > > > > >> > >>> minicluster in
>> > > > > >> > >>>> one process, and use a client
in another process to
>> > > communicate
>> > > > > >> with
>> > > > > >> > it.
>> > > > > >> > >>>> The version of the client should
be >= 0.98 and <= the
>> > > version
>> > > > of
>> > > > > >> the
>> > > > > >> > >>>> minicluster. Of course we need
to design the testing
>> code
>> > > > > >> carefully to
>> > > > > >> > >>> make
>> > > > > >> > >>>> sure that we have tested all the
cases.
>> > > > > >> > >>>>
>> > > > > >> > >>>>
>> > > > > >> > >>> +1. We need this up and running before
we put out an
>> > > > hbase-2.0.0.
>> > > > > I
>> > > > > >> > know
>> > > > > >> > >>> Matteo does this test manually on
a regular basis but a
>> > > > > >> formalization
>> > > > > >> > >>> would
>> > > > > >> > >>> help. I can add an exercise of Coprocessor
Endpoints. I
>> > > believe
>> > > > > this
>> > > > > >> > is on
>> > > > > >> > >>> Dima's list of TODOs but will let
him speak for himself.
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>> And also I think we should make
sure that no proto3 only
>> > > > feature
>> > > > > is
>> > > > > >> > >>>> introduced in our proto files
until branch-1 is dead.
>> > Maybe a
>> > > > > >> > precommit
>> > > > > >> > >>>> check?
>> > > > > >> > >>>>
>> > > > > >> > >>>>
>> > > > > >> > >>> I think you mean wire-format breaking
changes?  Agree. We
>> > have
>> > > > our
>> > > > > >> PB3
>> > > > > >> > set
>> > > > > >> > >>> to 2.5 compat mode and yes, we can't
move on from this
>> until
>> > > we
>> > > > > are
>> > > > > >> in
>> > > > > >> > a
>> > > > > >> > >>> place where we can say no to 2.5 clients.
>> > > > > >> > >>>
>> > > > > >> > >> Yes, for example, pb2.5 does not support
map so we should
>> not
>> > > use
>> > > > > >> map in
>> > > > > >> > >> our proto files.
>> > > > > >> > >>
>> > > > > >> > >>>
>> > > > > >> > >>> Making use of PB3isms cannot be avoided.
PB3.1 adds a
>> native
>> > > > > >> > replacement
>> > > > > >> > >>> for our HBaseZeroCopyByteString/ByteStringer
hack. It
>> also
>> > > adds
>> > > > > >> > 'unsafe'
>> > > > > >> > >>> methods that we need to exploit if
we are to keep our
>> > > read/write
>> > > > > >> paths
>> > > > > >> > >>> offheap.
>> > > > > >> > >>>
>> > > > > >> > >>> St.Ack
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>
>> > > > > >> > >>>> Thanks.
>> > > > > >> > >>>>
>> > > > > >> > >>>> 2016-10-01 11:55 GMT+08:00 Sean
Busbey <
>> > busbey@cloudera.com
>> > > >:
>> > > > > >> > >>>>
>> > > > > >> > >>>>> have we experimentally confirmed
that wire
>> compatibility
>> > is
>> > > > > >> > >>>>> maintained? I saw one mention
of expecting wire
>> > > compatibility
>> > > > to
>> > > > > >> be
>> > > > > >> > >>>>> fine, but nothing with someone
using e.g. the
>> clusterdock
>> > > work
>> > > > > or
>> > > > > >> > >>>>> something to mix servers /
clients or do replication.
>> > > > > >> > >>>>>
>> > > > > >> > >>>>>> On Fri, Sep 30, 2016 at
6:30 PM, Stack <
>> stack@duboce.net
>> > >
>> > > > > wrote:
>> > > > > >> > >>>>>> I intend to do a mass
commit late this weekend that
>> moves
>> > > us
>> > > > on
>> > > > > >> to a
>> > > > > >> > >>>>> shaded
>> > > > > >> > >>>>>> protobuf-3.1.0, either
Sunday night or Monday morning.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> If objection, please speak
up or if need more time for
>> > > > > >> > >>>>>> consideration/review,
just shout.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> I want to merge the branch
HBASE-16264 into master
>> (it is
>> > > > > running
>> > > > > >> > >>> here
>> > > > > >> > >>>> up
>> > > > > >> > >>>>>> on jenkins https://builds.apache.org/view
>> > > > > >> /H-L/view/HBase/job/HBASE-
>> > > > > >> > >>>>> 16264/).
>> > > > > >> > >>>>>> The branch at HBASE-16264
has three significant
>> > > > bodies-of-work
>> > > > > >> that
>> > > > > >> > >>>>>> unfortunately are tangled
and can only go in of a
>> piece.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> * HBASE-16264 <https://issues.apache.org/
>> > > > > jira/browse/HBASE-16264
>> > > > > >> >
>> > > > > >> > >>> The
>> > > > > >> > >>>>>> shading of our protobuf
usage so we can upgrade and/or
>> > run
>> > > > > with a
>> > > > > >> > >>>> patched
>> > > > > >> > >>>>>> protobuf WITHOUT breaking
REST, Spark, and in
>> particular,
>> > > > > >> > >>> Coprocessor
>> > > > > >> > >>>>>> Endpoints.
>> > > > > >> > >>>>>> * HBASE-16567 <https://issues.apache.org/
>> > > > > jira/browse/HBASE-16567
>> > > > > >> >
>> > > > > >> > >>> A
>> > > > > >> > >>>>> move
>> > > > > >> > >>>>>> up on to (shaded) protobuf-3.1.0
>> > > > > >> > >>>>>> * HBASE-16741 <https://issues.apache.org/
>> > > > > jira/browse/HBASE-16741
>> > > > > >> >
>> > > > > >> > >>> An
>> > > > > >> > >>>>>> amendment of our generate
protobufs step to include
>> > shading
>> > > > > and a
>> > > > > >> > >>>>> bundling
>> > > > > >> > >>>>>> of protobuf src (with
a means of calling a patch srcs
>> > hook)
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> Together we're talking
about 40MB of change mostly
>> made
>> > of
>> > > > the
>> > > > > >> > >>> movement
>> > > > > >> > >>>>> of
>> > > > > >> > >>>>>> generated files or the
application of a pattern that
>> > alters
>> > > > > >> where we
>> > > > > >> > >>>> get
>> > > > > >> > >>>>>> imports from. When done,
you should notice no
>> difference
>> > > and
>> > > > > >> should
>> > > > > >> > >>> be
>> > > > > >> > >>>>> able
>> > > > > >> > >>>>>> to go about your business
as per usual. Upside is
>> that we
>> > > > will
>> > > > > be
>> > > > > >> > >>> able
>> > > > > >> > >>>> to
>> > > > > >> > >>>>>> avoid coming onheap doing
protobuf
>> > > marshalling/unmarshalling
>> > > > as
>> > > > > >> > >>>> protobuf
>> > > > > >> > >>>>>> 2.5.0 requires. Downside
is that we repeat a good
>> portion
>> > > of
>> > > > > our
>> > > > > >> > >>>> internal
>> > > > > >> > >>>>>> protos, once non-shaded
so Coprocessor Endpoints can
>> keep
>> > > > > working
>> > > > > >> > >>> and
>> > > > > >> > >>>>> then
>> > > > > >> > >>>>>> again as shaded for internal
use.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> I provide some more overview
below on the changes. See
>> > the
>> > > > > >> shading
>> > > > > >> > >>> doc
>> > > > > >> > >>>>>> here:
>> > > > > >> > >>>>>> https://docs.google.com/document/d/
>> > > 1H4NgLXQ9Y9KejwobddCqaVME
>> > > > > >> DCGby
>> > > > > >> > >>>>> DcXtdF5iAfDIEk/edit#
>> > > > > >> > >>>>>> for more detail (Patches
are up on review board --
>> except
>> > > the
>> > > > > >> latest
>> > > > > >> > >>>>>> HBASE-16264 which is too
big for JIRA and RB). I am
>> > > currently
>> > > > > >> > >>> working
>> > > > > >> > >>>> on
>> > > > > >> > >>>>> a
>> > > > > >> > >>>>>> devs chapter for the book
on protobuf going forward
>> that
>> > > will
>> > > > > go
>> > > > > >> in
>> > > > > >> > >>> as
>> > > > > >> > >>>>> part
>> > > > > >> > >>>>>> of this patch.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> Thanks,
>> > > > > >> > >>>>>> St.Ack
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> Items of note:
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>> * Two new modules; one
named hbase-protocol-shaded
>> that
>> > is
>> > > > used
>> > > > > >> by
>> > > > > >> > >>>> hbase
>> > > > > >> > >>>>>> core. It has in it a shaded
(and later patched)
>> protobuf.
>> > > The
>> > > > > >> other
>> > > > > >> > >>> new
>> > > > > >> > >>>>>> module is hbase-endpoint
which goes after hbase-server
>> > and
>> > > > has
>> > > > > >> those
>> > > > > >> > >>>>>> bundled endpoints that
I was able to break out of core
>> > > (there
>> > > > > >> are a
>> > > > > >> > >>> few
>> > > > > >> > >>>>>> that are hopelessly entangled
that need to be undone
>> as
>> > > CPEPs
>> > > > > but
>> > > > > >> > >>>>>> fortunately belong in
core: Auth, Access, MultiRow).
>> > > > > >> > >>>>>> * I've tested running
a branch-1 CPEP against a master
>> > with
>> > > > > these
>> > > > > >> > >>>>> patches
>> > > > > >> > >>>>>> in place and stuff like
ACL (A CPEP) run from the
>> > branch-1
>> > > > > shell
>> > > > > >> > >>> work
>> > > > > >> > >>>>>> against the branch-2 server.
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>>
>> > > > > >> > >>>>>>> On Mon, Aug 22, 2016
at 5:20 PM, Stack <
>> > stack@duboce.net>
>> > > > > >> wrote:
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>>> This project goes
on. I updated HBASE-1563 "Shade
>> > > protobuf"
>> > > > > with
>> > > > > >> > >>> some
>> > > > > >> > >>>>> doc
>> > > > > >> > >>>>>>> on a final approach.
We need to be able to refer to
>> both
>> > > > > shaded
>> > > > > >> and
>> > > > > >> > >>>>>>> non-shaded protobuf
so we can support sending HDFS
>> > > > old-school
>> > > > > pb
>> > > > > >> > >>>>> Messages
>> > > > > >> > >>>>>>> but also so Coprocessor
Endpoints keep working though
>> > > > > internally
>> > > > > >> > >>>>> protobufs
>> > > > > >> > >>>>>>> have been relocated.
Funny you should ask, but yes,
>> > there
>> > > > are
>> > > > > >> some
>> > > > > >> > >>>>>>> downsides (as predicted
by contributors on the JIRA).
>> > I'd
>> > > be
>> > > > > >> > >>>> interested
>> > > > > >> > >>>>> to
>> > > > > >> > >>>>>>> hear if they are too
burdensome. In particular, your
>> IDE
>> > > > > >> experience
>> > > > > >> > >>>>> gets a
>> > > > > >> > >>>>>>> little convoluted
as you will need to add to your
>> build
>> > > > path,
>> > > > > a
>> > > > > >> jar
>> > > > > >> > >>>> with
>> > > > > >> > >>>>>>> the relocated pbs.
A pain.
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>>> Thanks,
>> > > > > >> > >>>>>>> St.Ack
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>>>> On Wed, Apr 13,
2016 at 6:09 AM, Stack <
>> > stack@duboce.net
>> > > >
>> > > > > >> wrote:
>> > > > > >> > >>>>>>>>
>> > > > > >> > >>>>>>>> On Tue, Apr 12,
2016 at 9:26 PM, Sean Busbey <
>> > > > > >> busbey@apache.org>
>> > > > > >> > >>>>> wrote:
>> > > > > >> > >>>>>>>>
>> > > > > >> > >>>>>>>>>> On Tue,
Apr 12, 2016 at 6:17 PM, Stack <
>> > > stack@duboce.net
>> > > > >
>> > > > > >> > wrote:
>> > > > > >> > >>>>>>>>>>
>> > > > > >> > >>>>>>>>>>
>> > > > > >> > >>>>>>>>>> On an
initial pass, the only difficult part seems
>> to
>> > be
>> > > > > >> > >>>> interaction
>> > > > > >> > >>>>>>>>> with
>> > > > > >> > >>>>>>>>>> HDFS in
asyncwal (might just pull in the HDFS
>> > > messages).
>> > > > > >> > >>>>>>>>>>
>> > > > > >> > >>>>>>>>>>
>> > > > > >> > >>>>>>>>>
>> > > > > >> > >>>>>>>>> I have some
idea how we can make this work either
>> by
>> > > > pushing
>> > > > > >> > >>>> asyncwal
>> > > > > >> > >>>>>>>>> upstream to
HDFS or through some maven tricks,
>> > depending
>> > > > on
>> > > > > >> how
>> > > > > >> > >>> much
>> > > > > >> > >>>>>>>>> time we have.
>> > > > > >> > >>>>>>>>>
>> > > > > >> > >>>>>>>>
>> > > > > >> > >>>>>>>> Maven tricks?
Tell us more. Here or drop a note up
>> in
>> > the
>> > > > > >> issue.
>> > > > > >> > >>>>>>>> Thanks Sean,
>> > > > > >> > >>>>>>>> St.Ack
>> > > > > >> > >>>>>>>>
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>>>
>> > > > > >> > >>>>>
>> > > > > >> > >>>>>
>> > > > > >> > >>>>>
>> > > > > >> > >>>>> --
>> > > > > >> > >>>>> busbey
>> > > > > >> > >>>>>
>> > > > > >> > >>>>
>> > > > > >> > >>>
>> > > > > >> > >>
>> > > > > >> > >>
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message