hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2001) Coprocessors: Colocate user code with regions
Date Mon, 08 Nov 2010 22:44:23 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929784#action_12929784
] 

HBase Review Board commented on HBASE-2001:
-------------------------------------------

Message from: "Mingjie Lai" <mjlai09@gmail.com>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/876/
-----------------------------------------------------------

(Updated 2010-11-08 14:41:41.575886)


Review request for hbase, stack, Andrew Purtell, and Jonathan Gray.


Changes
-------

Changes:
- Addressed Stack's comments.
  - Renamed CommandTarget to Endpoint (still up to decide whether it's better or not)
  - Improved package-info. 

- Refined some of RegionObserver method signatures, i.e., preGet(), etc. 
- Added hooks for HTable.increment(Increment).


Summary (updated)
-------

The diff actually contains 2 seperate patches: HBase-2001 and the one for (HBASE-2002+HBASE-2321).
The reason is that HBase-2001's CommandTarget relies on HBASE-2002 + HBASE-2321 which patches
are still under review. I have to include Gary's HBASE-2002, HBASE-2321 with this diff, since
reviewboard is so powerful :) and it disallow my diff to be based on some unchecked in patch.


Eventually the patch here should be committed after 2001 and 2321. I will make another patch
after they got checked in. 

Both HBase-2001 and the dynamic RPC stuff are quite big patches. Total number of lines are
more than 7k. I turned back and forth, but still don't have a good idea to create the patch
in order to reduce the review pain. However right now I'm putting the whole patch for all
the 3 issues. Here the list of file which are only related to coprocessor:

src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java
src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java
src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java
src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java
src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java
src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java
src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java


==========================

(Here is a brief description. Please find much more details at the package-info.java in the
diff. I also post the package-info.html to https://issues.apache.org/jira/browse/HBASE-2001
as an attachment.)


Coprocessors are code that runs in-process on each region server. Regions contain references
to the coprocessor implementation classes associated with them. Coprocessor classes will be
loaded either from local jars on the region server's classpath or via the HDFS classloader.

Multiple types of coprocessors are provided to provide sufficient flexibility for potential
use cases. Right now there are:

* Coprocessor: provides region lifecycle management hooks, e.g., region open/close/split/flush/compact
operations.
* RegionObserver: provides hook for monitor table operations from client side, such as table
get/put/scan/delete, etc.
* Endpoint: provides on demand triggers for any arbitrary function executed at a region. One
use case is column aggregation at region server.

Coprocessor:
A coprocessor is required to implement Coprocessor interface so that coprocessor framework
can manage it internally.

Another design goal of this interface is to provide simple features for making coprocessors
useful, while exposing no more internal state or control actions of the region server than
necessary and not exposing them directly. 

RegionObserver
If the coprocessor implements the RegionObserver interface it can observe and mediate client
actions on the region. 

Endpoint:
Coprocessor and RegionObserver provide certain hooks for injecting user code running at each
region. These code will be triggerd with existing HTable and HBaseAdmin operations at the
certain hook points.

Through Endpoint and dynamic RPC protocol, you can define your own interface communicated
between client and region server, i.e., you can create a new method, specify passed parameters
and return types for the method. And the new Endpoint methods can be triggered by calling
client side dynamic RPC functions -- HTable.exec(...). 

Coprocess loading
A customized coprocessor can be loaded by two different ways, by configuration, or by HTableDescriptor
for a newly created table.

(Currently we don't really have an on demand coprocessor loading machanism for opened regions.
However it should be easy to create a dedicated CommandTarget for coprocessor loading) 


This addresses bug HBase-2001.
    http://issues.apache.org/jira/browse/HBase-2001


Diffs (updated)
-----

  src/main/java/org/apache/hadoop/hbase/client/Action.java 556ea81 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 3235a9b 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 1748a4f 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 69dc916 
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java ade3e21 
  src/main/java/org/apache/hadoop/hbase/client/MultiAction.java c6ea838 
  src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java 1ac7671 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/Batch.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/Exec.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/ExecResult.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/client/coprocessor/package-info.java PRE-CREATION

  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseEndpointCoprocessor.java PRE-CREATION

  src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java PRE-CREATION

  src/main/java/org/apache/hadoop/hbase/coprocessor/Coprocessor.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java PRE-CREATION

  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorException.java PRE-CREATION

  src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/coprocessor/package-info.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 30dd877 
  src/main/java/org/apache/hadoop/hbase/ipc/ConnectionHeader.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java f8d8af7 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java 4f4828b 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java e4c356d 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java cf30cf9 
  src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java 209c291 
  src/main/java/org/apache/hadoop/hbase/regionserver/CoprocessorHost.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java dd2955d 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java ab183e5 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java 1309f93 
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java 1bcde8c 
  src/main/java/org/apache/hadoop/hbase/regionserver/handler/OpenRegionHandler.java ace7997

  src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java 5f05079 
  src/main/resources/hbase-default.xml 5fafe65 
  src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationEndpoint.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/ColumnAggregationProtocol.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorEndpoint.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java PRE-CREATION

  src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java PRE-CREATION


Diff: http://review.cloudera.org/r/876/diff


Testing
-------


Thanks,

Mingjie




> Coprocessors: Colocate user code with regions
> ---------------------------------------------
>
>                 Key: HBASE-2001
>                 URL: https://issues.apache.org/jira/browse/HBASE-2001
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: Andrew Purtell
>            Assignee: Mingjie Lai
>             Fix For: 0.92.0
>
>         Attachments: asm-transformations.pdf, HBASE-2001-RegionObserver-2.patch, HBASE-2001-RegionObserver.patch,
HBASE-2001.patch.gz, packge-info.html, packge-info.html
>
>
> Support user code that runs run next to each region in table. As regions split and move,
coprocessor code should automatically  move also.
> Use classloader which looks on HDFS.
> Associate a list of classes to load with each table. Put this in HRI so it inherits from
table but can be changed on a per region basis (so then those region specific changes can
inherited by daughters). 
> Not completely arbitrary code, should require implementation of an interface with callbacks
for:
> * Open
> * Close
> * Split
> * Compact
> * (Multi)get and scanner next()
> * (Multi)put
> * (Multi)delete
> Add method to HRegionInterface for invoking coprocessor methods and retrieving results.
 
> Add methods in o.a.h.h.regionserver or subpackage which implement convenience functions
for coprocessor methods and consistent/controlled access to internals: store access, threading,
persistent and ephemeral state, scratch storage, etc. 
> GitHub: http://github.com/mlai/hbase/tree/0.90_coprocessor

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message