hbase-dev mailing list archives

From Todd Lipcon <t...@cloudera.com>
Subject Re: DISCUSSION: 1.0.0
Date Wed, 12 Feb 2014 21:22:39 GMT
On Wed, Feb 12, 2014 at 12:46 PM, Gary Helmling <ghelmling@gmail.com> wrote:

> >
> > 'Repurpose' might not be the way I would put it.
> >
> > Coprocessors were and are a means for internal server extension by mixin.
> > The original problem we solved was needing to subclass HRegionServer and
> > other classes to extend core HBase functions, but having more than one
> > otherwise orthogonal extension that users want to use. Now we can mix in
> > multiple extensions with a framework that has some simple rules for
> > cooperation between the extensions.
> >
> > We return to the earlier state of affairs with modules. Sure, we can plug
> > in an alternate behavior with a module that subclasses and extends the
> > default, say flush strategy, but we can't then instantiate multiple
> > modules into the same slot, both subclassing the same base but doing
> > different things.
> >
> >
> I agree the ability to compose coprocessors in order to extend behavior is
> a key capability that we should not throw out.
> I think the current Observer APIs could probably do with a bit of
> reorganization to make them a little more accessible and comprehensible.  I
> think there is also an emerging need to see if we can define some subset of
> these APIs that we can stabilize for easier public consumption, while
> keeping the rest of the APIs free to evolve as needed as HBase internals
> change (since these are an extension mechanism for internal behaviors).
> I'm not sure we've really seen enough commonality emerge yet to say what
> those APIs are though.  We could try to define the public subset as those
> involved in client requests, but flush and compaction, for example, can
> also be triggered by client requests.  And my own use of coprocessor APIs
> lately has been focused on overriding the flush and compaction behaviors,
> not on client requests.
> I think the best place to start is by breaking up some of the current APIs,
> grouping them around behaviors or areas of functionality.  Whether we call
> some of these "coprocessors" and others "plugins" is a question of
> branding.  I do think it's important to figure out which we can stabilize
> and offer longer term contracts for.  But whatever we call them, I strongly
> agree that we should maintain the "mixin" / composition approach and not
> return to a simple fixed inheritance scheme.
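
[Editor's sketch] The contrast drawn in the quoted text, between a single pluggable "slot" filled by subclassing and a host that composes several independent extensions, can be illustrated roughly as follows. This is a language-neutral toy in Python, not HBase code: every class and method name here (FlushObserver, SingleSlotHost, ObserverHost, pre_flush) is hypothetical and chosen only for illustration.

```python
# Hypothetical sketch only: none of these names are HBase APIs.

class FlushObserver:
    """Base hook interface; the default hook does nothing."""
    def pre_flush(self, region, log):
        pass

class AuditFlush(FlushObserver):
    """One extension: records an audit entry before a flush."""
    def pre_flush(self, region, log):
        log.append(f"audit:{region}")

class MetricsFlush(FlushObserver):
    """An unrelated extension: records a metrics entry before a flush."""
    def pre_flush(self, region, log):
        log.append(f"metrics:{region}")

class SingleSlotHost:
    """Module-style plug point: one slot, so the last install wins."""
    def __init__(self):
        self.module = FlushObserver()

    def install(self, module):
        self.module = module  # replaces whatever was installed before

    def flush(self, region):
        log = []
        self.module.pre_flush(region, log)
        return log

class ObserverHost:
    """Mixin-style host: every registered observer runs, in order."""
    def __init__(self):
        self.observers = []

    def register(self, observer):
        self.observers.append(observer)

    def flush(self, region):
        log = []
        for observer in self.observers:
            observer.pre_flush(region, log)
        return log

# Two orthogonal extensions installed into each kind of host.
slot = SingleSlotHost()
slot.install(AuditFlush())
slot.install(MetricsFlush())

host = ObserverHost()
host.register(AuditFlush())
host.register(MetricsFlush())

print(slot.flush("r1"))
print(host.flush("r1"))
```

With the single slot, only the second extension takes effect; with the composing host, both run. The ordering rule used here (registration order) is an assumption of the sketch, standing in for whatever cooperation rules a real framework defines.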

I've always considered coprocessors to be the "kernel modules" of the HBase
world. They give you way more power than user-space programming, but come
with the cautions that if you make a mistake, you'll crash your whole
system or trigger unexpected behavior.

Given that, I don't think we should really be spending too much effort on
coprocessor API stability. If we make this a requirement, it can hamper the
ability of the HBase core developers to make good changes which really
improve the system. I don't think we're at the level of maturity as a
project where this is the right tradeoff, as of yet.

For what it's worth, the Linux kernel module API is also not
stable/compatible between versions. This document is a good read:

I do think we should seek to keep the interfaces stable through *patch*
level releases -- a bug fix shouldn't break a coprocessor API. But between
minor releases that add new features, it seems like an unnecessary

Todd Lipcon
Software Engineer, Cloudera
