hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gary Helmling" <ghelml...@gmail.com>
Subject Re: Review Request: [HBASE-2321] [HBASE-2002] Add support for per-region dynamically registered RPC endpoints for coprocessors and allow configurable RPC client/server implementations
Date Wed, 06 Oct 2010 07:32:11 GMT


> On 2010-10-04 23:09:49, stack wrote:
> > src/main/java/org/apache/hadoop/hbase/client/Exec.java, line 65
> > <http://review.cloudera.org/r/816/diff/9/?file=13681#file13681line65>
> >
> >     This is fair I suppose if only one coprocessor per region (Is that true)?
> 
> Gary Helmling wrote:
>     Correct, only a single handler can be registered per CoprocessorProtocol subclass
per region.
> 
> stack wrote:
>     Is that how we want it to always be?  Chaining processors would just be a nightmare?

Sorry to not be clear.  Coprocessors and RegionObservers can be chained in a single region,
no problem.  

What I was referring to is HRegion.registerProtocol(), which registers a CoprocessorProtocol
subclass implementation for handling Exec calls.  This allows only one handler (instance)
to be registered per CoprocessorProtocol subclass (ex. PingProtocol, CountRowProtocol, etc).
 Without this limitation, I'm not sure how you'd differentiate return values of different
handlers.

But Coprocessors/RegionObservers are not required to implement CoprocessorProtocol, so it's
not a direct restriction on them.


> On 2010-10-04 23:09:49, stack wrote:
> > src/main/java/org/apache/hadoop/hbase/client/Exec.java, line 35
> > <http://review.cloudera.org/r/816/diff/9/?file=13681#file13681line35>
> >
> >     ... against a Coprocessor?
> >     
> >     Maybe add above?
> >
> 
> Gary Helmling wrote:
>     Yes, could rename CoprocessorExec for clarity.  It's fairly generic but there's no
other usage.
>     
>     I guess in the naming here and elsewhere, I was envisioning Coprocessors as the sort
of stored procedures of HBase.  A basic functionality -- execute this user code -- with coprocessors
as the implementation.  So I took a general approach to naming the client interface.  It seemed
to fit in with the basic operations: Get, Scan, Put, Delete, Exec.
>     
>     But if this is overly general and confusing, I have no problem renaming this and
the other client bits with a "Coprocessor" prefix.
>     
>     Will definitely improve javadoc here as well.
> 
> stack wrote:
>     Ok. That explains a bunch (regards naming).
>     
>     Can you think of a place where Exec would be used anywhere else?  It doesn't seem
tied to coprocessors if I recall.  You could say, in javadoc, "used for example by coprocessors"
>     
>     Batch though I think doesn't fit your Get, Scan, Put, Delete, Exec set because it
strikes me as a Batch of Gets/Scans/Puts... which it is not.  I'm bad w/ names or would suggest
something.  Would be a shame to give it all Coprocessor prefix if later it could be used for
other things.

Agreed on Batch not fitting in and Exec isn't really generalizable as it stands, since it
references 
Class<? extends CoprocessorProtocol> as the protocol interface.

I'm leaning towards prefixing all with Coprocessor: CoprocessorBatch, CoprocessorExec, CoprocessorExecResult...
 A subpackage seems a bit clunky, though, despite the nicety of being able to add a package-info
doc.

Will have a patch with revised naming tomorrow.


> On 2010-10-04 23:09:49, stack wrote:
> > src/main/java/org/apache/hadoop/hbase/client/Batch.java, line 50
> > <http://review.cloudera.org/r/816/diff/9/?file=13680#file13680line50>
> >
> >     cool
> >     
> >     But, this method's name is 'returning'?
> >     
> >     So, it executes the 'method' of 'protocol' and returns the hosting  'Call' whose
invocation has already run?
> >     
> >     Should it be 'execute' or 'executeCall' or 'invokeCall', etc.
> 
> Gary Helmling wrote:
>     This just returns a Batch.Call instance, whose call() method will invoke the specified
CoprocessorProtocol method.  So it returns a Batch.Call that returns the method result.  At
this point the remote invocation has not yet happened.  That won't occur until down in HConnectionManager.HConnectionImplementation.processExecs().
 Reached through passing the Batch.Call instance to HTable.exec(...).
>     
>     Yeah, "returning()" is a little generic too.  I could rename to forMethod() or callingMethod()?
> 
> stack wrote:
>     forExec?  I don't remember if the javadoc ties this to HTable.exec... it probably
does.  If it don't, it should.  Maybe even an 'example'?
>     
>     Good naming and where it is located in the hierarchy will help folks grok how to
use it all.

I'll tackling renaming this with the classes.  Good point on the javadoc, I'll cross-ref with
HTable.exec() and include example usage.


- Gary


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/816/#review1414
-----------------------------------------------------------


On 2010-10-04 16:28:39, Gary Helmling wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://review.cloudera.org/r/816/
> -----------------------------------------------------------
> 
> (Updated 2010-10-04 16:28:39)
> 
> 
> Review request for hbase, Andrew Purtell and Jonathan Gray.
> 
> 
> Summary
> -------
> 
> This is really two separate patches in one, though with some overlapping changes.  If
necessary I can split them apart for separate review.  Please let me know if that would make
review easier.
> 
> Part 1:
> ==============
> Port over of HADOOP-6422 to the HBase RPC code.  The goal of this change is to allow
alternate RPC client/server implementations to be enabled through a simple configuration change.
 Ultimately I would like to use this to allow secure RPC to be enabled through configuration,
while not blocking normal (current) RPC operation on non-secure Hadoop versions.
> 
> This portion of the patch abstracts out two interfaces from the RPC code:
> 
> RpcEngine: HBaseRPC uses this to obtain proxy instances for client calls and server instances
for HMaster and HRegionServer
> RpcServer: this allows differing RPC server implementations, breaking the dependency
on HBaseServer
> 
> The bulk of the current code from HBaseRPC is moved into WritableRpcEngine and is unchanged
other than the interface requirements.  So the current call path remains the same, other than
the HBaseRPC.getProtocolEngine() abstraction.
> 
> 
> Part 2:
> ===============
> The remaining changes provide server-side hooks for registering new RPC protocols/handlers
(per-region to support coprocessors), and client side hooks to support dynamic execution of
the registered protocols.
> 
> The new RPC protocol actions are constrained to org.apache.hadoop.hbase.ipc.CoprocessorProtocol
implementations (which extends VersionedProtocol) to prevent arbitrary execution of methods
against HMasterInterface, HRegionInterface, etc.
> 
> For protocol handler registration, HRegionServer provides a new method:
> 
>   public <T extends CoprocessorProtocol> boolean registerProtocol(
>       byte[] region, Class<T> protocol, T handler)
> 
> which builds a Map of region name to protocol instances for dispatching client calls.
> 
> 
> Client invocations are performed through HTable, which adds the following methods:
> 
> 
>   public <T extends CoprocessorProtocol> T proxy(Class<T> protocol, Row row)
> 
> This directly returns a proxy instance to the CoprocessorProtocol implementation registered
for the region serving row "row".  Any method calls will be proxied to the region's server
and invoked using the map of registered region name -> handler instances.
> 
> Calls directed against multiple rows are a bit more complicated.  They are supported
with the methods:
> 
>   public <T extends CoprocessorProtocol, R> void exec(
>       Class<T> protocol, List<? extends Row> rows,
>       BatchCall<T,R> callable, BatchCallback<R> callback)
> 
>   public <T extends CoprocessorProtocol, R> void exec(
>       Class<T> protocol, RowRange range,
>       BatchCall<T,R> callable, BatchCallback<R> callback)
> 
> where BatchCall and BatchCallback are simple interfaces defining the methods to be called
and a callback instance to be invoked for each result.
> 
> For the sample CoprocessorProtocol interface:
> 
>   interface PingProtocol extends CoprocessorProtocol {
>     public String ping();
>     public String hello(String name);
>   }
> 
> a client invocation might look like:
> 
>     final Map<byte[],R> results = new TreeMap<byte[],R>(...)
>     List<Row> rows = ...
>     table.exec(PingProtocol.class, rows,
>         new HTable.BatchCall<PingProtocol,String>() {
>           public String call(PingProtocol instance) {
>             return instance.ping();
>           }
>         },
>         new BatchCallback<R>(){
>           public void update(byte[] region, byte[] row, R value) {
>             results.put(region, value);
>           }
>         });
> 
> The BatchCall.call() method will be invoked for each row in the passed in list, and the
BatchCallback.update() method will be invoked for each return value.  However, currently the
PingProtocol.ping() invocation will result in a separate RPC call per row, which is less that
ideal.
> 
> Support is in place to make use of the HRegionServer.multi() invocations for batched
RPC (see the org.apache.hadoop.hbase.client.Exec class), but this does not mesh well with
the current client-side interface.
> 
> In addition to standard code review, I'd appreciate any thoughts on the client interactions
in particular, and whether they would meet some of the anticipated uses of coprocessors.
> 
> 
> This addresses bug HBASE-2002.
>     http://issues.apache.org/jira/browse/HBASE-2002
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/hadoop/hbase/client/Action.java 556ea81 
>   src/main/java/org/apache/hadoop/hbase/client/Batch.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Exec.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/ExecResult.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/HConnection.java 65f7618 
>   src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 866ba92 
>   src/main/java/org/apache/hadoop/hbase/client/HTable.java 0dbf263 
>   src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 74593bf 
>   src/main/java/org/apache/hadoop/hbase/client/MultiAction.java c6ea838 
>   src/main/java/org/apache/hadoop/hbase/client/MultiResponse.java 91bd04b 
>   src/main/java/org/apache/hadoop/hbase/client/RowRange.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/client/Scan.java 29b3cb0 
>   src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 83f623d 
>   src/main/java/org/apache/hadoop/hbase/ipc/ConnectionHeader.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/CoprocessorProtocol.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseClient.java 2b5eeb6 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseRPC.java e23a629 
>   src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java e4c356d 
>   src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java ee5dd8f 
>   src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcEngine.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/ipc/WritableRpcEngine.java PRE-CREATION 
>   src/main/java/org/apache/hadoop/hbase/master/HMaster.java fb1e834 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 0a4fbce 
>   src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 89f499a 
>   src/main/java/org/apache/hadoop/hbase/rest/client/RemoteHTable.java d4166cf 
>   src/main/resources/hbase-default.xml 5fafe65 
>   src/test/java/org/apache/hadoop/hbase/regionserver/TestServerCustomProtocol.java PRE-CREATION

> 
> Diff: http://review.cloudera.org/r/816/diff
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gary
> 
>


Mime
View raw message