hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: Looking for input on an alpha-4 thorny item
Date Fri, 06 Oct 2017 16:10:01 GMT
On Wed, Oct 4, 2017 at 3:51 PM, Andrew Purtell <apurtell@apache.org> wrote:

> I think it is fine to rebrand these interfaces as for coprocessors and tag
> them LP(COPROC):
>     Region (use HRegion in internals)
>     Store (use HStore in internals)
>     MasterServices (use HMaster in internals)
>     RegionServerServices (use HRegionServer in internals)
> and pare them down. This is not a complete list, just a handful of
> examples.
Will do. Just to say that it means a bunch of refactoring undoing reliance
on Interfaces (We have too many services and managers!)

> Some specific points of feedback I have had:
> In Region, it would be good for CPs to be able to schedule async flushes
> and compactions, poll or wait for completion of a specific request, or wait
> for all pending flushes and compactions to complete. There is a Phoenix use
> case for this in indexing.
The means are in place. Will make sure it happens.

> Security coprocessors use MasterServices to create their system tables.
> Maybe this can be replaced by using the normal admin API for same.
Yes. Working on this too.

> When removing access to internal services, consider if there are client API
> equivalents that the CP can use, and if embedded calls to such client APIs
> from the coprocessor context would be a good idea. CP invocation of an
> internal service is simply an in-process method call. That's good and bad,
> right? The bad part, direct access, is the thing we want to restrict, the
> motivation for this work (in part). But the good thing is it avoids a lot
> of the fat client logic unnecessary for all-in-process service invocation,
> which might even not work correctly. Removing everything is as drastic as
> allowing CPs access to everything. It could be fine to drastically pare
> down, but please consider it.
The short-circuit of RPC (and PB) we have where when a Client is on the
target host, the invocation is direct helps here. Moving the
RegionServerGroup feature over to the new cut-down Coprocessor API has
turned up some holes in our Admin Interface -- i.e. there is no getServers
method unless you go via ClusterReport -- that I'm backfilling so RSGroups
can use Admin Interface instead of now-removed CP API. But also, have had
to add back a few items to MasterServices in a more constrained form.

> Some changes have been proposed that removes access to metrics (e.g.
> RegionMetrics, MasterMetrics). Right now coprocessors can bypass core
> function and replace it. Until and unless we remove the bypass semantic
> (under discussion) we should continue to allow CPs access to metrics
> objects so they can update metrics as expected by admins and users when
> replacing functionality (via bypass). Metrics are a public facing API. I
> agree this is kind of dodgy. I believe we should remove the bypass
> semantic. Once that is done, coprocessors can only mix in additional
> functionality. No more cause to touch core metrics. They can export their
> own metrics if so desired.
> ​​
Will be back to ping you on this one. Yeah, I think it dodgy CPs updating
internal metrics. Will look at the radical removal of bypass too... Will be

Thanks Andrew,

> On Wed, Oct 4, 2017 at 3:15 PM, Stack <stack@duboce.net> wrote:
> ​​
> A bunch of us are making good progress on the next alpha release,
> > hbase-2.0.0-alpha-4. The theme for this release is "Fixing the
> Coprocessor
> > API", mostly undoing access accidentally granted Coprocessors. I'm
> talking
> > out loud about a particularly awkward item here rather than in a comment
> up
> > in JIRA so the airing sees a broader audience. Interested in any opinions
> > or input you might have.
> >
> > TL;DR MasterService/RegionServerService and Region, etc., Interfaces
> were
> > overloaded serving two, disparate roles; a load of refactoring has to be
> > done to undo the damage. Suggestions for how to avoid making same mistake
> > in future?
> >
> > I'm working on "HBASE-12260 MasterServices - remove from coprocessor API
> > (Discuss)". MasterServices started out as a subset of the Master
> > functionality. The idea back then was that certain Services and Managers
> > could make do w/ less-than-full-access to the HMaster process. If so, we
> > could test the Service and Manager without having to standup a full
> HMaster
> > instance (This usually required our putting up a cluster too). If
> > MasterServices had but a few methods, a Mock would be easy, making
> testing
> > easier still. We did the same thing on the RegionServer side where we had
> > RegionServerServices.
> >
> > MasterServices (and RegionServerServices) were also exposed to
> > Coprocessors. The idea was that CPs could ask-for particular Master
> > functions via MasterService. In this role MasterServices was meant to
> > constrain what CPs had access to.
> >
> > MasterServices therefore had two consumers; one internal, the other not.
> >
> > With time, MasterServices got fat as internal Services and Managers
> needed
> > more of HMaster. Everytime we added to MasterServices, CPs got access.
> >
> > On survey as part of the recent HBASE-12260 work, it turns out that the
> > bulk of the methods in MasterServices are actually annotated
> > InterfaceAudience.Private; i.e. for internal use only, not for
> > Coprocessors. A brutal purge of Private audience objects, makes for a
> > MasterServices we can pass Coprocessors but Coprocessors now have much
> less
> > facility available (for parts, there are alternatives; Andy review
> suggests
> > that CPs are severely crimped if this patch goes in). But MasterServices
> > can no longer be used for its original purpose, passing Services and
> > Managers a subset of HMaster. The latter brings on a substantial
> refactor.
> >
> > Another example of the double-role problem outlined above was found by
> Duo
> > and Anoop in the RegionServer Coprocessor refactor salt mine. They hit a
> > similar tangle. There was the RegionServerServices <=> MasterServices
> case
> > but also the exposure of HRegion internals. In this latter, Region was
> > introduced by Andy EXPLICITLY as a subset of HRegion facility FOR
> > Coprocessors. Subsequently, we all confused his original intent and went
> > ahead and thought of Region (as opposed to HRegion) as an Interface for
> > HRegion and plumbed it throughout the code base in place of explicit
> > HRegion references. As Region picked up functionality, Coprocessors
> gained
> > more access.
> >
> > The refactoring pattern that has emerged out of the RegionServer-side
> > refactoring (which is ahead of the Master-work), is that we move to use
> the
> > HRegion implementation everywhere internally, undoing use of Region
> > Interface; Region Interface picks up a "FOR COPROCESSORS ONLY" stamp. I'm
> > following suit on the Master side moving to use HMaster in place of
> > MasterServices in all launched Services and Managers.
> >
> > How do we avoid this mistake in future? Should we rename Region as
> > CoprocessorRegion so it more plain that its consumer is Coprocessors?
> Ditto
> > on MasterServices?
> >
> > Thanks,
> > S
> >
> --
> Best regards,
> Andrew
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message