hbase-user mailing list archives

From Gary Helmling <ghelml...@gmail.com>
Subject Re: Coprocessor Endpoints
Date Tue, 15 Mar 2011 17:11:55 GMT
Hi Jason,

That's basically correct.  To export your own RPC methods from a
coprocessor, you:

1) Define an interface containing the RPC methods.  This interface must
extend CoprocessorProtocol (which only requires you to implement
getProtocolVersion())

2) Implement the defined RPC interface in your coprocessor

3) To call the RPC methods, you have 3 options:

T HTable.coprocessorProxy(Class<T> protocol, byte[] row)

Map<byte[],R> HTable.coprocessorExec(Class<T> protocol, byte[] startRow,
byte[] endRow, Batch.Call<T,R> callable)

void HTable.coprocessorExec(Class<T> protocol, byte[] start, byte[] end,
Batch.Call<T,R> call, Batch.Callback<R> callback)


The first returns a proxy instance of T to call methods against the
coprocessor instance in the region containing the row key given by "byte[]
row", as you described.  So the row key is only used to look up the region.
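
To make the lookup concrete, here is a minimal sketch of how a row key
selects a region.  It is a stand-in model and does not use any HBase
classes: the RegionLookupSketch class, the region names and the start keys
are all made up for illustration, and it compares Strings where HBase
compares byte[] keys.

```java
import java.util.TreeMap;

// Stand-in model only: none of this is HBase code.
class RegionLookupSketch {
    // Hypothetical table layout: region start key -> region name.
    // The first region starts at the empty key.
    static final TreeMap<String, String> REGIONS = new TreeMap<>();
    static {
        REGIONS.put("", "region-1");
        REGIONS.put("g", "region-2");
        REGIONS.put("p", "region-3");
    }

    // A row belongs to the region with the greatest start key <= row.
    static String regionFor(String row) {
        return REGIONS.floorEntry(row).getValue();
    }

    public static void main(String[] args) {
        System.out.println(regionFor("apple")); // region-1
        System.out.println(regionFor("grape")); // region-2
        System.out.println(regionFor("zebra")); // region-3
    }
}
```

The row does not need to exist in the table; it only has to fall inside
the key range of the region you want to reach.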

The last two are a bit trickier.  They will issue RPC calls in parallel to
all of the regions in the range starting with the region containing
"startRow" and ending with the region containing "endRow" (again using the
row keys for the region lookups).  However, to call the RPC methods, you
still need a proxy instance of T.  Instead of being returned directly, the
proxy is passed to an implementation of Batch.Call<T,R> that you provide,
where T is the protocol interface and R is the type returned from your
Batch.Call implementation's call(T instance) method.

For each region in the range startRow to endRow, the HBase client will call
your instance of Batch.Call.call(T proxy) passing a proxy instance connected
to that region.

So in TestCoprocessorEndpoint, we define a simple RPC protocol:

public interface ColumnAggregationProtocol extends CoprocessorProtocol {
  public long sum(byte[] family, byte[] qualifier) throws IOException;
}


Then we invoke it using HTable.coprocessorExec() with an anonymous
Batch.Call instance like so:

new Batch.Call<ColumnAggregationProtocol, Long>() {
    public Long call(ColumnAggregationProtocol instance)
        throws IOException {
        return instance.sum(TEST_FAMILY, TEST_QUALIFIER);
    }
}

We then get back a Map<byte[],Long> from coprocessorExec() containing the
result of each call() invocation, keyed by the region name.
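
If it helps to see the whole flow in one place, the following is a rough
model of the fan-out in plain Java.  It is only a sketch under several
assumptions: FanOutSketch, SumProtocol, Call and Region are stand-ins
written for this example (they are not the HBase types), the regions and
their per-region sums are invented, keys are Strings rather than byte[],
and the calls run sequentially where the real client issues them in
parallel over RPC.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Stand-in model only: these are not the HBase classes.
class FanOutSketch {
    // Same shape as a protocol interface, minus the RPC layer.
    interface SumProtocol {
        long sum();
    }

    // Same shape as Batch.Call<T,R>.
    interface Call<T, R> {
        R call(T instance);
    }

    // Hypothetical region: a name plus a local endpoint instance.
    static final class Region {
        final String name;
        final SumProtocol endpoint;

        Region(String name, long localSum) {
            this.name = name;
            this.endpoint = () -> localSum;
        }
    }

    // Hypothetical table layout: region start key -> region.
    static final TreeMap<String, Region> REGIONS = new TreeMap<>();
    static {
        REGIONS.put("", new Region("region-1", 10L));
        REGIONS.put("g", new Region("region-2", 20L));
        REGIONS.put("p", new Region("region-3", 30L));
    }

    // Invokes the callable once per region between the region holding
    // startRow and the region holding endRow, keyed by region name.
    static <R> Map<String, R> coprocessorExec(
            String startRow, String endRow, Call<SumProtocol, R> call) {
        String first = REGIONS.floorKey(startRow);
        String last = REGIONS.floorKey(endRow);
        Map<String, R> results = new LinkedHashMap<>();
        for (Region region : REGIONS.subMap(first, true, last, true).values()) {
            results.put(region.name, call.call(region.endpoint));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Long> sums = coprocessorExec("a", "h", p -> p.sum());
        System.out.println(sums); // {region-1=10, region-2=20}
    }
}
```

Note that each map entry holds one region's partial result; combining
them (for example, adding up the per-region sums) is left to the caller.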

Take a look at the Batch.forMethod() helper, which will generate a
Batch.Call instance for you if you just want to invoke a single method
across multiple regions with coprocessorExec().
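
To give a feel for what such a helper does, here is a sketch of one
plausible approach: look the method up by name with reflection and wrap
the invocation in a Call.  This is an assumption about the general
technique, not the HBase source.  ForMethodSketch, SumProtocol, Call and
this forMethod are stand-ins written for the example, and the argument
type lookup is simplified (it would not handle null or primitive
parameters).

```java
import java.lang.reflect.Method;

// Stand-in model only: these are not the HBase classes.
class ForMethodSketch {
    interface SumProtocol {
        long sum(String family, String qualifier);
    }

    // Same shape as Batch.Call<T,R>, without checked exceptions.
    interface Call<T, R> {
        R call(T instance);
    }

    // Builds a Call that invokes the named method with fixed arguments.
    @SuppressWarnings("unchecked")
    static <T, R> Call<T, R> forMethod(
            Class<T> protocol, String name, Object... args) {
        // Simplification: derive parameter types from the argument values.
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        final Method method;
        try {
            method = protocol.getMethod(name, types);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
        return instance -> {
            try {
                return (R) method.invoke(instance, args);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Call failed: " + name, e);
            }
        };
    }

    public static void main(String[] args) {
        Call<SumProtocol, Long> call =
            forMethod(SumProtocol.class, "sum", "family", "qualifier");
        SumProtocol endpoint = (family, qualifier) -> 42L; // made-up endpoint
        System.out.println(call.call(endpoint)); // 42
    }
}
```

The generated Call does the same job as the anonymous class shown
earlier, so it saves boilerplate when all you need is one method call
per region.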


--gh



On Tue, Mar 15, 2011 at 9:15 AM, Jason Rutherglen <
jason.rutherglen@gmail.com> wrote:

> I'm taking a look at TestCoprocessorEndpoint for example, in trying to
> figure out how the Coprocessor RPC works.  I think
> HTable.coprocessorProxy should be used?  Which will return an
> interface that when called performs the network marshaling etc.  The
> purpose of the row byte[] in coprocessorProxy is to find the correct
> region to make the call to?
>
