hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Looking for input on an alpha-4 thorny item
Date Wed, 04 Oct 2017 22:15:44 GMT
A bunch of us are making good progress on the next alpha release,
hbase-2.0.0-alpha-4. The theme for this release is "Fixing the Coprocessor
API", mostly undoing access accidentally granted Coprocessors. I'm talking
out loud about a particularly awkward item here rather than in a comment up
in JIRA so the airing sees a broader audience. Interested in any opinions
or input you might have.

TL;DR MasterService/RegionServerService and Region, etc., Interfaces were
overloaded serving two, disparate roles; a load of refactoring has to be
done to undo the damage. Suggestions for how to avoid making same mistake
in future?

I'm working on "HBASE-12260 MasterServices - remove from coprocessor API
(Discuss)". MasterServices started out as a subset of the Master
functionality. The idea back then was that certain Services and Managers
could make do w/ less-than-full-access to the HMaster process. If so, we
could test the Service and Manager without having to standup a full HMaster
instance (This usually required our putting up a cluster too). If
MasterServices had but a few methods, a Mock would be easy, making testing
easier still. We did the same thing on the RegionServer side where we had

MasterServices (and RegionServerServices) were also exposed to
Coprocessors. The idea was that CPs could ask-for particular Master
functions via MasterService. In this role MasterServices was meant to
constrain what CPs had access to.

MasterServices therefore had two consumers; one internal, the other not.

With time, MasterServices got fat as internal Services and Managers needed
more of HMaster. Everytime we added to MasterServices, CPs got access.

On survey as part of the recent HBASE-12260 work, it turns out that the
bulk of the methods in MasterServices are actually annotated
InterfaceAudience.Private; i.e. for internal use only, not for
Coprocessors. A brutal purge of Private audience objects, makes for a
MasterServices we can pass Coprocessors but Coprocessors now have much less
facility available (for parts, there are alternatives; Andy review suggests
that CPs are severely crimped if this patch goes in). But MasterServices
can no longer be used for its original purpose, passing Services and
Managers a subset of HMaster. The latter brings on a substantial refactor.

Another example of the double-role problem outlined above was found by Duo
and Anoop in the RegionServer Coprocessor refactor salt mine. They hit a
similar tangle. There was the RegionServerServices <=> MasterServices case
but also the exposure of HRegion internals. In this latter, Region was
introduced by Andy EXPLICITLY as a subset of HRegion facility FOR
Coprocessors. Subsequently, we all confused his original intent and went
ahead and thought of Region (as opposed to HRegion) as an Interface for
HRegion and plumbed it throughout the code base in place of explicit
HRegion references. As Region picked up functionality, Coprocessors gained
more access.

The refactoring pattern that has emerged out of the RegionServer-side
refactoring (which is ahead of the Master-work), is that we move to use the
HRegion implementation everywhere internally, undoing use of Region
Interface; Region Interface picks up a "FOR COPROCESSORS ONLY" stamp. I'm
following suit on the Master side moving to use HMaster in place of
MasterServices in all launched Services and Managers.

How do we avoid this mistake in future? Should we rename Region as
CoprocessorRegion so it more plain that its consumer is Coprocessors? Ditto
on MasterServices?


  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message