hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15638) Shade protobuf
Date Tue, 04 Oct 2016 05:19:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15544383#comment-15544383 ]

stack commented on HBASE-15638:

I just pushed the below set of patches:

   HBASE-15638 Shade protobuf
    Which includes

        HBASE-16742 Add chapter for devs on how we do protobufs going forward

        HBASE-16741 Amend the generate protobufs out-of-band build step
        to include shade, pulling in protobuf source and a hook for patching protobuf

        Removed ByteStringer from hbase-protocol-shaded. Use the protobuf-3.1.0
        trick directly instead. Makes stuff cleaner. All under 'shaded' dir is
        now generated.

        HBASE-16567 Upgrade to protobuf-3.1.x
        Regenerate all protos in this module with protoc3.
        Redo ByteStringer to use new pb3.1.0 unsafebytesutil
        instead of HBaseZeroCopyByteString

        HBASE-16264 Figure how to deal with endpoints and shaded pb
        Shade our protobufs.
        Do it in a manner that makes it so we can still have in our API references to
        com.google.protobuf (and in REST). The c.g.p in API is for Coprocessor Endpoints (CPEP)

                This patch is Tactic #4 from Shading Doc attached to the referenced issue.
                Figuring an approach took a while because we have Coprocessor Endpoints
                mixed in with the core of HBase that are tough to untangle (FIX).

                Tactic #4 (the fourth attempt at addressing this issue) is to COPY all but
                the CPEP .proto files currently in hbase-protocol to a new module named
                hbase-protocol-shaded. Generate .protos again in the new location and
                then relocate/shade the generated files. Let CPEPs keep on with the
                old references at com.google.protobuf.* and
                org.apache.hadoop.hbase.protobuf.* but change the hbase core so all
                instead refer to the relocated files in their new location at

                Let the new module also shade protobufs themselves and change hbase
                core to pick up this shaded protobuf rather than directly reference

                This approach allows us to explicitly refer to either the shaded or
                non-shaded version of a protobuf class in any particular context (though
                usually context dictates one or the other). Core runs on shaded protobuf.
                CPEPs continue to use whatever is on the classpath with
                com.google.protobuf.* which is pb2.5.0 for the near future at least.

                See above cited doc for follow-ons and downsides. In short, IDEs will complain
                about not being able to find the shaded protobufs since shading happens at
                build time; will fix by checking in all generated classes and relocated protobuf
                in a follow-on. Also, CPEPs currently suffer an extra copy as messages are
                marshalled from non-shaded to shaded. To fix. Finally, our .protos are duplicated:
                once shaded, and once not. Pain, but how else to reveal our protos to CPEPs or a
                C++ client that wants to talk with HBase AND shade protobuf?


                Add a new hbase-protocol-shaded module. It is a copy of hbase-protocol
                with everything relocated from o.a.h.h. to o.a.h.h.shaded. The new module
                also includes the relocated pb. It does not include CPEPs. They stay in
                their old location.
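
To make the relocation idea concrete, here is a minimal sketch of shading viewed as a prefix-rewriting rule on class names. It is only an illustration, not the build configuration the patch uses. The o.a.h.h. to o.a.h.h.shaded rule comes from the text above; the target package for the relocated protobuf library (`org.example.shaded.com.google.protobuf`) and the class name `RelocationSketch` are hypothetical, chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RelocationSketch {
    // Prefix rules, checked in insertion order.
    private static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        // From the commit message: o.a.h.h. is relocated to o.a.h.h.shaded.
        RULES.put("org.apache.hadoop.hbase.", "org.apache.hadoop.hbase.shaded.");
        // ASSUMPTION: hypothetical target package for the relocated protobuf library.
        RULES.put("com.google.protobuf.", "org.example.shaded.com.google.protobuf.");
    }

    // Rewrites a fully qualified class name using the first matching prefix rule.
    // Names that match no rule are returned unchanged.
    static String relocate(String className) {
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            if (className.startsWith(rule.getKey())) {
                return rule.getValue() + className.substring(rule.getKey().length());
            }
        }
        return className;
    }

    public static void main(String[] args) {
        System.out.println(relocate("org.apache.hadoop.hbase.protobuf.generated.ClientProtos"));
        System.out.println(relocate("com.google.protobuf.ByteString"));
        System.out.println(relocate("java.lang.String"));
    }
}
```

The sketch does not guard against relocating a name twice, and it says nothing about which classes are excluded; in the approach described above, the CPEP classes are simply never copied into the shaded module, so they keep their original names.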

> Shade protobuf
> --------------
>                 Key: HBASE-15638
>                 URL: https://issues.apache.org/jira/browse/HBASE-15638
>             Project: HBase
>          Issue Type: Bug
>          Components: Protobufs
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>         Attachments: 15638v2.patch, HBASE-15638.master.001.patch, HBASE-15638.master.002.patch,
HBASE-15638.master.003 (1).patch, HBASE-15638.master.003 (1).patch, HBASE-15638.master.003
(1).patch, HBASE-15638.master.003.patch, HBASE-15638.master.003.patch, HBASE-15638.master.004.patch,
HBASE-15638.master.005.patch, HBASE-15638.master.006.patch, HBASE-15638.master.007.patch,
HBASE-15638.master.007.patch, HBASE-15638.master.008.patch, HBASE-15638.master.009.patch,
> We need to change our protobuf. Currently it is pb2.5.0. As is, protobufs expect all
buffers to be on-heap byte arrays. It does not have facility for dealing in ByteBuffers and
off-heap ByteBuffers in particular. This fact frustrates the off-heaping-of-the-write-path
project as marshalling/unmarshalling of protobufs involves a copy on-heap first.
> So, we need to patch our protobuf so it supports off-heap ByteBuffers. To ensure we pick
up the patched protobuf always, we need to relocate/shade our protobuf and adjust all protobuf
references accordingly.
> Given as we have protobufs in our public facing API, Coprocessor Endpoints -- which use
protobuf Service to describe new API -- a blind relocation/shading of com.google.protobuf.*
will break our API for CoProcessor EndPoints (CPEP) in particular. For example, in the Table
Interface, to invoke a method on a registered CPEP, we have:
> {code}<T extends com.google.protobuf.Service,R> Map<byte[],R> coprocessorService(
>     Class<T> service, byte[] startKey, byte[] endKey,
>     org.apache.hadoop.hbase.client.coprocessor.Batch.Call<T,R> callable)
> throws com.google.protobuf.ServiceException, Throwable{code}
> This issue is how we intend to shade protobuf for hbase-2.0.0 while preserving our API
as is so CPEPs continue to work on the new hbase.
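
The on-heap copy that the issue description complains about can be illustrated with plain java.nio, without any protobuf classes. This is a sketch under assumptions: `consume` is a hypothetical stand-in for any API that accepts only `byte[]`, and `toHeapArray` is an illustrative helper, not HBase or protobuf code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OffHeapCopySketch {
    // Hypothetical stand-in for an API that only accepts on-heap byte arrays.
    static int consume(byte[] bytes) {
        return bytes.length;
    }

    // Returns the buffer's remaining bytes as a byte[].
    // A heap buffer that exactly wraps its array is handed over without copying;
    // any other buffer, including every direct (off-heap) buffer, must be copied.
    static byte[] toHeapArray(ByteBuffer buf) {
        if (buf.hasArray() && buf.arrayOffset() == 0 && buf.position() == 0
                && buf.remaining() == buf.array().length) {
            return buf.array();
        }
        byte[] copy = new byte[buf.remaining()];
        buf.duplicate().get(copy); // duplicate() leaves the caller's position untouched
        return copy;
    }

    public static void main(String[] args) {
        byte[] payload = "row-key".getBytes(StandardCharsets.UTF_8);

        ByteBuffer onHeap = ByteBuffer.wrap(payload);
        ByteBuffer offHeap = ByteBuffer.allocateDirect(payload.length);
        offHeap.put(payload).flip();

        System.out.println("on-heap shares array: " + (toHeapArray(onHeap) == payload));
        System.out.println("off-heap is direct: " + offHeap.isDirect());
        System.out.println("bytes consumed from off-heap: " + consume(toHeapArray(offHeap)));
    }
}
```

Every pass of an off-heap buffer through a `byte[]`-only API pays for an allocation and a copy like the one in `toHeapArray`, which is the cost a ByteBuffer-aware protobuf is meant to remove.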

This message was sent by Atlassian JIRA
