hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From st...@duboce.net
Subject Re: Review Request: HBase-2001: Coprocessors: Colocate user code with regions
Date Wed, 06 Oct 2010 06:10:58 GMT

This is an automatically generated e-mail. To reply, visit:

I went through about 2/3rds of the classes particular to hbase-2001 and made some comments.
  As stated in the below, this functionality is pretty amazing. IMO, it could do with a few
more rounds of new-patch/review before ready for commit.


    I took a look at the package-info.html.  Very nice doc.  One thought though was that the
batch methods do not seem to be instrumented.  Are they?  The bulk of inserts are done by
multiput now.
    Maybe link to the wiki page when you say this in package-info.html '....implement role-based
access control for HBase'
    Fix this 'These code will be triggerd with existing...'
    BaseRegionObserver as the name of the class that implements BOTH Coprocessor and RegionObserver
with sensible defaults seems off... it'd make sense as the name of an implemenation of RegionObserver
but not of both.  Is there a better name to give it -- even BaseRegionObserverCoprocessor?
 Unless BaseObserver already implements Coprocessor?
    Should this also say that methods can be new also?  '...i.e., you can specify new passed
parameters and return types for a method. '
    CommandTarget is a strange name for an host of arbitrary user-designed methods.  Can we
come up w/ something more telling?   Notions that come to mind are Substrate, Platform --
i.e. stuff you build up on.
    Minor.. fix '...the actually implemention class running...'
    Fix this '...How is the client side example of calling...'
    The example is missing a bit of code that would help along its illustration.... a few
comments would help too.... but this is a minor criticism.  Not important.  I get the gist
(Folks interested in CP need to start with this page -- it makes grokking the code the easier).
    This page would seem to indicate CPs can be chained.  Am I reading that wrong?  (See 'Load
from configuration')  Over in Gary review, he was saying on CP per region only.
    Usually attribute names are upper-cased.  Here we have 'Coprocessor$1' (that $1is intentional
    This functionality, if its working, is amazing.


    Fix this line.


    What is this result if its a pre call?  Whats in it? Is it empty?  Would a pre call add
extras to the Result or something?


    This would seem to disagree with the package-info doc.  Is that intentional?  Doc is stale?


    Is this per region?  Multiple CPs per region?


    In previous review I think I asked where's the chance to reject or amend each key that
passes through the compaction?  (This was probably answered already -- need to check back
in reviews)
    What if I wanted to stop the compaction?  Thats too intrusive?  Same for split and flush?


    Missing license




    How does this work if its a pre call, how can it return a result?


    Should this method be passed the result of actual call to getClosestRowBefore?




    Whats missing from this interface?
    Should this be done via reflection somehow given as folks will probably be adding and
removing methods over time?  It'll be a bit of a pain keeping the two in sync otherwise?


    Should say something here no?  I could imagine that fellas could be pulling their hair
out trying to figure why their CP didn't load yet nothing printed in log.


    So any CP can amend Result.  I've yet to look at the amended HRegion?  It'll adorn the
passed in Result rather than create one of its own?


    Its intentional that this went away?


    Oh, I see whats going on... fair enough.


    CoprocessorHost is a good name for what it does.


    If the pre call put something into Result, this will overwrite or lose it?




    NOt important but I'd swap these.


    Ok, looks like pre can adorn get results... its being handled properly here.

- stack

On 2010-10-05 18:22:16, Mingjie Lai wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/876/
> -----------------------------------------------------------
> (Updated 2010-10-05 18:22:16)
> Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.
> Summary
> -------
> The diff actually contains 2 seperate patches: HBase-2001 and the one for (HBASE-2002+HBASE-2321).
The reason is that HBase-2001's CommandTarget relies on HBASE-2002 + HBASE-2321 which patches
are still under review. I have to include Gary's HBASE-2002, HBASE-2321 with this diff, since
reviewboard is so powerful :) and it disallow my diff to be based on some unchecked in patch.

> Eventually the patch here should be committed after 2001 and 2321. I will make another
patch after they got checked in. 
> Both HBase-2001 and the dynamic RPC stuff are quite big patches. Total number of lines
are more than 7k. I turned back and forth, but still don't have a good idea to create the
patch in order to reduce the review pain. However right now I'm putting the whole patch for
all the 3 issues. Here the list of file which are only related to coprocessor:
> src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
> src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
> src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
> src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
> src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
> src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java
> src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java
> src/main/resources/hbase-default.xml
> src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java
> src/test/java/org/apache/hadoop/hbase/coprocessor/TestCommandTarget.java
> src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
> src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
> src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java
> ==========================
> (Here is a brief description. Please find much more details at the package-info.java
in the diff. I also post the package-info.html to https://issues.apache.org/jira/browse/HBASE-2001
as an attachment.)
> Coprocessors are code that runs in-process on each region server. Regions contain references
to the coprocessor implementation classes associated with them. Coprocessor classes will be
loaded either from local jars on the region server's classpath or via the HDFS classloader.
> Multiple types of coprocessors are provided to provide sufficient flexibility for potential
use cases. Right now there are:
> * Coprocessor: provides region lifecycle management hooks, e.g., region open/close/split/flush/compact
> * RegionObserver: provides hook for monitor table operations from client side, such as
table get/put/scan/delete, etc.
> * CommandTarget: provides on demand triggers for any arbitrary function executed at a
region. One use case is column aggregation at region server.
> Coprocessor:
> A coprocessor is required to implement Coprocessor interface so that coprocessor framework
can manage it internally.
> Another design goal of this interface is to provide simple features for making coprocessors
useful, while exposing no more internal state or control actions of the region server than
necessary and not exposing them directly. 
> RegionObserver
> If the coprocessor implements the RegionObserver interface it can observe and mediate
client actions on the region. 
> CommandTarget:
> Coprocessor and RegionObserver provide certain hooks for injecting user code running
at each region. These code will be triggerd with existing HTable and HBaseAdmin operations
at the certain hook points.
> Through CommandTarget and dynamic RPC protocol, you can define your own interface communicated
between client and region server, i.e., you can specify new passed parameters and return types
for a method. And the new CommandTarget methods can be triggered by calling client side dynamic
RPC functions -- HTable.exec(...). 
> Coprocess loading
> A customized coprocessor can be loaded by two different ways, by configuration, or by
HTableDescriptor for a newly created table.
> (Currently we don't really have an on demand coprocessor loading machanism for opened
regions. However it should be easy to create a dedicated CommandTarget for coprocessor loading)

> This addresses bug HBase-2001.
>     http://issues.apache.org/jira/browse/HBase-2001
> Diffs
> -----
>   src/main/java/org/apache/hadoop/hbase/client/Action.java 556ea81 
>   src/main/java/org/apache/hadoop/hbase/client/Batch.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Exec.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/ExecResult.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/HConnection.java 65f7618 
>   src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 866ba92 
>   src/main/java/org/apache/hadoop/hbase/client/HTable.java 0dbf263 
>   src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 74593bf 
>   src/main/java/org/apache/hadoop/hbase/client/MultiAction.java c6ea838 
>   src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java 91bd04b 
>   src/main/java/org/apache/hadoop/hbase/client/RowRange.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Scan.java 29b3cb0 
>   src/main/java/org/apache/hadoop/hbase/coprocessor/BaseCommandTarget.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserver.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 83f623d 
>   src/main/java/org/apache/hadoop/hbase/ipc/ConnectionHeader.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java e23a629 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java e4c356d 
>   src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java ee5dd8f 
>   src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/master/HMaster.java fb1e834 
>   src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java PRE-CREATION

>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 5f829e4 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 89f499a 
>   src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 4ba63d8

>   src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 9aaf7c3 
>   src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java 01475d5

>   src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java d4166cf 
>   src/main/resources/hbase-default.xml 5fafe65 
>   src/test/java/org/apache/hadoop/hbase/coprocessor/TestClassloading.java PRE-CREATION

>   src/test/java/org/apache/hadoop/hbase/coprocessor/TestCommandTarget.java PRE-CREATION

>   src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java PRE-CREATION

>   src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
>   src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java PRE-CREATION

>   src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java PRE-CREATION

> Diff: http://review.cloudera.org/r/876/diff
> Testing
> -------
> Thanks,
> Mingjie

View raw message