hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrew Purtell <apurt...@apache.org>
Subject Re: [DISCUSSION] Removing the bypass semantic from the Coprocessor APIs
Date Wed, 11 Oct 2017 19:10:52 GMT
> On Phoenix Increment by-pass, an ornery item is that Phoenix wants to use
its long encoding writing Increments. Not sure how we'd do that,
selectively.

If we can handle the rest of the trouble that you observed:

1) Lack of recognition and identification of when the key value to
increment doesn't exist
2) Lack of the ability to set the timestamp of the updated key value.

then they might be able to make it work. Perhaps a conversion from HBase
native to Phoenix LONG encoding when processing results, in the wrapping
scanner, informed by schema metadata.

Or if we are keeping the bypass semantic in select places but implementing
it with something other than today's bypass() API (please) this would be
another candidate for where to keep it. Duo suggests keeping the semantic
in all of the basic RPC preXXX hooks for query and mutation. We could redo
those APIs to skip normal processing based on a return value or exception
but otherwise drop bypass from all the others. It will clean up areas of
confusion, e.g. can I bypass splits or flushes or not? Or what about this
arcane hook in compaction? Or [insert some deep hook here]? The answer
would be: only RPC hooks will early out, and only if you return this value,
or throw that exception.


On Wed, Oct 11, 2017 at 11:56 AM, Stack <stack@duboce.net> wrote:

> The YARN Timeline Server has the FlowRunCoprocessor. It does bypass when
> user does a Get returning instead the result of its own (Flow) Scan result.
> Not sure how we'd do alternative here; Timeline Server is keeping Tags
> internally.
>
>
> On Wed, Oct 11, 2017 at 10:59 AM, Andrew Purtell <apurtell@apache.org>
> wrote:
>
> > Rather than continue to support a weird bypass() which works in some
> places
> > and not in others, perhaps we can substitute it with an exception? So if
> > the coprocessor throws this exception in the pre hook then where it is
> > allowed we catch it and do the right thing, and where it is not allowed
> we
> > don't catch it and the server aborts. This will at least improve the
> silent
> > bypass() failure problem. I also don't like, in retrospect, that calling
> > this environment method has magic side effects. Everyone understands how
> > exceptions work, so it will be clearer.
> >
> >
> We could do that though throw and catch of exceptions would be costly.
>
> What about the Duo suggestion? Purge bypass flag and replace it w/ preXXX
> in a few select methods returning a boolean on whether bypass? Would that
> work? (Would have to figure metrics still).
>
>
>
> > In any case we should try to address the Tephra and Phoenix cases brought
> > up in this discussion. They look like we can find alternatives. Shall I
> > file JIRAs to follow up?
> >
> >
> >
> On Phoenix Increment by-pass, an ornery item is that Phoenix wants to use
> its long encoding writing Increments. Not sure how we'd do that,
> selectively.
>
> St.Ack
>
>
>
> > On Wed, Oct 11, 2017 at 6:00 AM, 张铎(Duo Zhang) <palomino219@gmail.com>
> > wrote:
> >
> > > These examples are great.
> > >
> > > And I think for normal region operations such as get, put, delete,
> > > checkAndXXX, increment, it is OK to bypass the real operation after
> > preXXX
> > > as the semantic is clear enough. Instead of calling env.bypass, maybe
> > just
> > > let these preXXX methods return a boolean is enough to tell the HBase
> > > framework that we have already done the real operation so just give up
> > and
> > > return?
> > >
> > > Thanks.
> > >
> > > 2017-10-11 3:19 GMT+08:00 Gary Helmling <ghelmling@gmail.com>:
> > >
> > > > The Tephra TransactionProcessor CP makes use of bypass() in
> preDelete()
> > > to
> > > > override handling of delete tombstones in a transactional way:
> > > > https://github.com/apache/incubator-tephra/blob/master/
> > > > tephra-hbase-compat-1.3/src/main/java/org/apache/tephra/
> > > hbase/coprocessor/
> > > > TransactionProcessor.java#L244
> > > >
> > > > The CDAP IncrementHandler CP also makes use of bypass() in preGetOp()
> > and
> > > > preIncrementAfterRRowLock() to provide a transaction implementation
> of
> > > > readless increments:
> > > > https://github.com/caskdata/cdap/blob/develop/cdap-hbase-
> > > > compat-1.1/src/main/java/co/cask/cdap/data2/increment/
> > > > hbase11/IncrementHandler.java#L121
> > > >
> > > > What would be the alternate approach for these applications?  In both
> > > cases
> > > > they need to impose their own semantics on the underlying KeyValue
> > > > storage.  Is there a different way this can be done?
> > > >
> > > >
> > > > On Tue, Oct 10, 2017 at 11:58 AM Anoop John <anoop.hbase@gmail.com>
> > > wrote:
> > > >
> > > > > Wrap core scanners is different right?  That can be done in post
> > > > > hooks.  I have seen many use cases for this..  Its the question abt
> > > > > the pre hooks where we have not yet created the core object (like
> > > > > scanner).  The CP pre code itself doing the work of object creation
> > > > > and so the core code is been bypassed.    Well the wrapping thing
> can
> > > > > be done in pre hook also. First create the core object by CP code
> > > > > itself and then do the wrapped object and return.. I have seen in
> one
> > > > > jira issue where the usage was this way..   The wrapping can be
> done
> > > > > in post also in such cases I believe.
> > > > >
> > > > > -Anoop-
> > > > >
> > > > > On Wed, Oct 11, 2017 at 12:23 AM, Andrew Purtell <
> > apurtell@apache.org>
> > > > > wrote:
> > > > > > I think we should continue to support overriding function by
> object
> > > > > > inheritance. I didn't mention this and am not proposing more
than
> > > > > removing
> > > > > > the bypass() sematic. No more no less. Phoenix absolutely depends
> > on
> > > > > being
> > > > > > able to wrap core scanners and return the wrappers.
> > > > > >
> > > > > >
> > > > > > On Tue, Oct 10, 2017 at 11:50 AM, Anoop John <
> > anoop.hbase@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > >> When we say bypass the core code, it can be done today not
only
> by
> > > > > >> calling bypass but by returning a not null object for some
of
> the
> > > pre
> > > > > >> hooks.  Like preScannerOpen() if it return a scanner object,
we
> > will
> > > > > >> avoid the remaining core code execution for creation of
the
> > > > > >> scanner(s).  So this proposal include this aspect also and
> remove
> > > any
> > > > > >> possible way of bypassing the core code by the CP hook code
> > > execution
> > > > > >> ?   Am +1.
> > > > > >>
> > > > > >> -Anoop-
> > > > > >>
> > > > > >> On Tue, Oct 10, 2017 at 11:40 PM, Andrew Purtell <
> > > apurtell@apache.org
> > > > >
> > > > > >> wrote:
> > > > > >> > The coprocessor API provides an environment method,
bypass(),
> > that
> > > > > when
> > > > > >> > called from a preXXX hook will cause the core code
to skip all
> > > > > remaining
> > > > > >> > processing. This capability was introduced on HBASE-3348.
> Since
> > > this
> > > > > >> time I
> > > > > >> > think we are more enlightened about the complications
of this
> > > > feature.
> > > > > >> (Or,
> > > > > >> > anyway, speaking for myself:)
> > > > > >> >
> > > > > >> > Not all hooks provide the bypass semantic. Where this
is the
> > case
> > > > the
> > > > > >> > javadoc for the hook says so, but it can be missed.
If you
> call
> > > > > bypass()
> > > > > >> in
> > > > > >> > a hook where it is not supported it is a no-op. This
can lead
> > to a
> > > > > poor
> > > > > >> > developer experience.
> > > > > >> >
> > > > > >> > Where bypass is supported what is being bypassed is
all of the
> > > core
> > > > > code
> > > > > >> > implementing the remainder of the operation. In order
to
> > > understand
> > > > > what
> > > > > >> > calling bypass() will skip, a coprocessor implementer
should
> > read
> > > > and
> > > > > >> > understand all of the remaining code and its nuances.
> Although I
> > > > think
> > > > > >> this
> > > > > >> > is good practice for coprocessor developers in general,
it
> > > demands a
> > > > > >> lot. I
> > > > > >> > think it would provide a much better developer experience
if
> we
> > > > didn't
> > > > > >> > allow bypass, even though it means - in theory - a
coprocessor
> > > would
> > > > > be a
> > > > > >> > lot more limited in some ways than before. What is
skipped is
> > > > > extremely
> > > > > >> > version dependent. That core code will vary, perhaps
> > > significantly,
> > > > > even
> > > > > >> > between point releases. We do not provide the promise
of
> > > consistent
> > > > > >> > behavior even between point releases for the bypass
semantic.
> To
> > > > > achieve
> > > > > >> > that we could not change any code between hook points.
> Therefore
> > > the
> > > > > >> > coprocessor implementer becomes an HBase core developer
in
> > > practice
> > > > as
> > > > > >> soon
> > > > > >> > as they rely on bypass(). Every release of HBase may
break the
> > > > > assumption
> > > > > >> > that the replacement for the bypassed code takes care
of all
> > > > necessary
> > > > > >> > skipped concerns. Because those concerns can change
at any
> > point,
> > > > > such an
> > > > > >> > assumption is never safe.
> > > > > >> >
> > > > > >> > I say "in theory" because I would be surprised if anyone
is
> > > relying
> > > > on
> > > > > >> the
> > > > > >> > bypass for the above reason. I seem to recall that
Phoenix
> might
> > > use
> > > > > it
> > > > > >> in
> > > > > >> > one place to promote a normal mutation into an atomic
> operation,
> > > by
> > > > > >> > substituting one for the other, but if so that objective
could
> > be
> > > > > >> > reimplemented using their new locking manager.
> > > > > >> >
> > > > > >> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message