hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9291) Enable client to setAttribute that is sent once to each region server
Date Thu, 02 Jan 2014 20:33:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860731#comment-13860731 ]

Andrew Purtell commented on HBASE-9291:

First let me clarify my second suggestion above: we could hang a map accessible to all CPs
in the RS off of RegionServerServices, as was done for the region level in HBASE-6505.
Then the client would provide the (large) state as an attribute on only the first mutation
sent to each regionserver. (More on this below.) The CP would observe the attribute and apply
it to RS-level shared state. Then the mutation and subsequent mutations could be processed
referring to the updated RS-level state. 
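
A minimal sketch of the server-side half of this idea, written as a plain-Java simulation rather than against the HBase coprocessor API. Every name in it (`RsSharedStateSketch`, `FakeMutation`, `FakeRegionServer`, the two attribute keys) is hypothetical. It also assumes something the comment does not state: that each mutation carries a small state id, so that later mutations can find the state installed by the first one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Plain-Java simulation only: no HBase classes are used and every name is hypothetical.
class RsSharedStateSketch {
    static final String STATE_ID_ATTR = "sketch.stateId"; // small, on every mutation (assumption)
    static final String STATE_ATTR = "sketch.state";      // large, only on the first mutation per RS

    // Stand-in for a mutation that carries client-set attributes.
    static final class FakeMutation {
        final Map<String, String> attributes = new HashMap<>();

        FakeMutation(String stateId, String largeStateOrNull) {
            attributes.put(STATE_ID_ATTR, stateId);
            if (largeStateOrNull != null) {
                attributes.put(STATE_ATTR, largeStateOrNull);
            }
        }
    }

    // Stand-in for one region server with a map shared by all coprocessors on it.
    static final class FakeRegionServer {
        final ConcurrentMap<String, String> sharedState = new ConcurrentHashMap<>();

        // Pre-mutation hook: install the state if it was shipped, then resolve it.
        // Returns null when this server never received the state.
        String resolveState(FakeMutation m) {
            String id = m.attributes.get(STATE_ID_ATTR);
            String shipped = m.attributes.get(STATE_ATTR);
            if (shipped != null) {
                // putIfAbsent keeps a duplicate transfer of the same state harmless.
                sharedState.putIfAbsent(id, shipped);
            }
            return sharedState.get(id);
        }
    }

    public static void main(String[] args) {
        FakeRegionServer rs = new FakeRegionServer();
        // First mutation carries the large state; the second one only carries the id.
        System.out.println(rs.resolveState(new FakeMutation("job-1", "index-metadata")));
        System.out.println(rs.resolveState(new FakeMutation("job-1", null)));
        // A server that never saw the first mutation cannot resolve the state.
        System.out.println(new FakeRegionServer().resolveState(new FakeMutation("job-1", null)));
    }
}
```

The null return in the last case is the situation a real coprocessor would have to handle somehow, for example by failing the mutation so the client can resend the state; that recovery path is not modeled here.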

The client side is tricky.

bq. We could try to figure out which Put is the "first one" for each region, but what if a
split occurs after we figure this out – this seems too brittle.

If we introduce a new client API to the effect of "send one RPC to each RS", then this amounts
to a modified coprocessor endpoint execution, but with an invocation target that is a singleton
on each RS, and it should be subject to the same security considerations. Passing an attribute
on the first put to an RS sidesteps the need for EXEC grants (HBASE-6104) on any endpoint invocation
target, which sounds like the goal you are after.

Whether an endpoint invocation or a mutation, we have the same issue that the local knowledge
of cluster state can at any point be stale. Live servers can come and go, and regions can
move around, and there is no transactional state update protocol running between clients and
servers for updating this information. Even if there were, cluster topology can change
mid-flight. A "send one RPC to each RS" API could miss a newly onlined server that came up after
the call(s) started and yet opened some relevant regions asynchronously. 

Whether trying to figure out which put is the first for an RS, or selecting keys for a set
of coprocessor endpoints such that you only invoke one per RS, or using a new "send one RPC
to each RS" API, on the server you'd have to handle the same set of issues, right? There could
be 0, 1, or ~2 large data transfers per RS:
- 0 if a new server is onlined and regions are assigned to it after the put or "send one RPC to
each RS" calls have started
- 1 if the cluster topology is unchanged over the entire client action
- ~2 if a region is moved or split, or even in the case of one-RPC-per-server if there is
an RPC retry because the server-side success response failed to reach the client
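
The three cases can be reproduced with a toy model. The sketch below is a plain-Java simulation, not HBase client code, and all names in it are hypothetical. It assumes the client decides which put is "first" for a server from its cached region locations, while each put is actually delivered to whichever server hosts the region at that moment; the retry machinery of a real client is ignored, and every region in the put list is assumed to appear in both maps.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of stale client-side location knowledge; every name is hypothetical.
class TransferCountSketch {

    // Builds a region -> server map from alternating region and server names.
    static Map<String, String> locations(String... pairs) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            m.put(pairs[i], pairs[i + 1]);
        }
        return m;
    }

    // Counts how many times the large state arrives at each server that actually
    // hosts a region. The client attaches the state to the first put it believes
    // is headed to each server, based on its cached view.
    static Map<String, Integer> countTransfers(List<String> putRegions,
                                               Map<String, String> cached,
                                               Map<String, String> actual) {
        Map<String, Integer> transfers = new TreeMap<>();
        for (String server : actual.values()) {
            transfers.put(server, 0);
        }
        Set<String> serversAlreadySent = new HashSet<>();
        for (String region : putRegions) {
            // Set.add returns true only the first time a server is seen.
            boolean attachState = serversAlreadySent.add(cached.get(region));
            if (attachState) {
                transfers.merge(actual.get(region), 1, Integer::sum);
            }
        }
        return transfers;
    }

    public static void main(String[] args) {
        // Topology unchanged: one transfer per server.
        Map<String, String> stable = locations("r1", "A", "r2", "A", "r3", "B");
        System.out.println(countTransfers(Arrays.asList("r1", "r2", "r3"), stable, stable));
        // r2 was reassigned to a new server C: C receives puts but no state.
        System.out.println(countTransfers(Arrays.asList("r1", "r2"),
                locations("r1", "A", "r2", "A"), locations("r1", "A", "r2", "C")));
        // r1 moved from A to B: B receives the state twice.
        System.out.println(countTransfers(Arrays.asList("r1", "r2"),
                locations("r1", "A", "r2", "B"), locations("r1", "B", "r2", "B")));
    }
}
```

Whatever the client-side mechanism, the server side has to tolerate both a missing transfer and a duplicate one.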

I wouldn't use the word brittle. "Messy" is better. It always is.

> Enable client to setAttribute that is sent once to each region server
> ---------------------------------------------------------------------
>                 Key: HBASE-9291
>                 URL: https://issues.apache.org/jira/browse/HBASE-9291
>             Project: HBase
>          Issue Type: New Feature
>          Components: IPC/RPC
>            Reporter: James Taylor
> Currently a Scan and Mutation allow the client to set its own attributes that get passed
through the RPC layer and are accessible from a coprocessor. This is very handy, but breaks
down if the amount of information is large, since this information ends up being sent again
and again to every region. Clients can work around this with an endpoint "pre" and "post"
coprocessor invocation that:
> 1) sends the information and caches it on the region server in the "pre" invocation
> 2) invokes the Scan or sends the batch of Mutations, and then
> 3) removes it in the "post" invocation.
> In this case, the client is forced to identify all region servers (ideally, all region
servers that will be involved in the Scan/Mutation), make extra RPC calls, manage the caching
of the information on the region server, age out the information (in case the client dies
before step (3) that clears the cached information), and deal with the possibility of
a split occurring while this operation is in progress.
> Instead, it'd be much better if an attribute could be identified as a "region server"
attribute in OperationWithAttributes and the HBase RPC layer would take care of doing the
> The use cases where the above is necessary in Phoenix include:
> 1) Hash joins, where the results of the smaller side of a join scan are packaged up and
sent to each region server, and
> 2) Secondary indexing, where the metadata of knowing a) which column family/column qualifier
pairs and b) which part of the row key contributes to which indexes are sent to each region
server that will process a batched put.
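
The "pre"/"post" workaround quoted in the issue description above implies a region-server-side cache that ages out entries when a client dies before its "post" call. A minimal plain-Java sketch of such a cache follows; the class and method names are hypothetical, the time-to-live policy is an assumption, and the current time is passed in explicitly so the behaviour is easy to follow.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical cache for client state on a region server; not part of HBase.
class AgingStateCacheSketch {

    private static final class Entry {
        final String state;
        final long expiresAtMillis;

        Entry(String state, long expiresAtMillis) {
            this.state = state;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    AgingStateCacheSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Step 1, the "pre" invocation: cache the client's state under an id.
    void pre(String id, String state, long nowMillis) {
        entries.put(id, new Entry(state, nowMillis + ttlMillis));
    }

    // Step 2, during the Scan or Mutations: look the state up, dropping it if expired.
    String lookup(String id, long nowMillis) {
        Entry e = entries.get(id);
        if (e == null) {
            return null;
        }
        if (nowMillis >= e.expiresAtMillis) {
            // Age-out path: the client never sent "post" in time.
            entries.remove(id, e);
            return null;
        }
        return e.state;
    }

    // Step 3, the "post" invocation: the client is done, so clear the state.
    void post(String id) {
        entries.remove(id);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        AgingStateCacheSketch cache = new AgingStateCacheSketch(1000L);
        cache.pre("join-42", "hash-table-bytes", 0L);
        System.out.println(cache.lookup("join-42", 500L));   // still cached
        System.out.println(cache.lookup("join-42", 1500L));  // aged out
    }
}
```

This only covers the caching and age-out bookkeeping; identifying the region servers and coping with splits, the other burdens listed in the description, remain with the client.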

This message was sent by Atlassian JIRA
