phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-4619) Process transactional updates to local index on server-side
Date Mon, 19 Feb 2018 21:12:00 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-4619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor updated PHOENIX-4619:
----------------------------------
    Description: 
For local indexes, we'll want to continue to process updates on the server side. After PHOENIX-4278,
updates even for local indexes are occurring on the client side. The reason to keep them on the
server is that we know the updates to the index table will be a local write, so we can generate
the write on the server side. Having a separate RPC and sending the updates across the wire would
be tremendously inefficient. On top of that, we need the region boundary information, which we
already have in the coprocessor but would need to retrieve on the client side (with a likely race
condition too if a split occurs after we retrieve it).
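
The split race can be illustrated with a minimal sketch. It assumes a local index write depends on the start key of the region that currently holds the data row (the buildUpdateMutation call quoted further down takes the region start and end keys). RegionBoundarySketch and regionStartFor are hypothetical names, and plain strings stand in for byte[] row keys; none of this is Phoenix or HBase API.

```java
import java.util.TreeSet;

// Hypothetical illustration only; not Phoenix or HBase code.
final class RegionBoundarySketch {
    // Returns the start key of the region that currently holds rowKey,
    // given the sorted start keys of all regions ("" is the first region).
    static String regionStartFor(TreeSet<String> regionStartKeys, String rowKey) {
        return regionStartKeys.floor(rowKey);
    }

    public static void main(String[] args) {
        TreeSet<String> startKeys = new TreeSet<>();
        startKeys.add("");
        startKeys.add("m");

        // A client looks up the boundary for row "t" and caches it.
        String cached = regionStartFor(startKeys, "t");

        // The region starting at "m" then splits at "r".
        startKeys.add("r");

        // The server always sees the current boundary; the client's copy is stale.
        String current = regionStartFor(startKeys, "t");
        System.out.println("client cached: " + cached + ", server sees: " + current);
    }
}
```

Under that assumption, a client-side calculation would use the stale boundary, while the coprocessor always has the current one.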

To fix this, we need to modify PhoenixTxnIndexMutationGenerator so that it can be used on
the server side as well. The main change will be to alter this method signature to pass through
an IndexMaintainer instead of a PTable (which isn't available on the server side):
{code}
    public List<Mutation> getIndexUpdates(final PTable table, PTable index,
            List<Mutation> dataMutations) throws IOException, SQLException {
{code}
I think this can be changed to the following instead and be used on both the client and server side:
{code}
    public List<Mutation> getIndexUpdates(final IndexMaintainer maintainer, byte[] dataTableName,
            List<Mutation> dataMutations) throws IOException, SQLException {
{code}
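
As a rough sketch of why this shape works on both sides: the generator only needs something that can turn a data row into an index row, so the caller can obtain that object however is convenient in its environment. MaintainerSketch and IndexUpdateSketch below are hypothetical stand-ins, with byte[] row keys in place of Mutation objects; they are not the Phoenix classes and do not reflect their real methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the role IndexMaintainer plays; not the Phoenix class.
interface MaintainerSketch {
    byte[] indexRowKey(byte[] dataRowKey);
}

final class IndexUpdateSketch {
    // Same parameter shape as the proposed signature. dataTableName is carried
    // through only to mirror that signature; this sketch does not use it.
    static List<byte[]> getIndexUpdates(MaintainerSketch maintainer, byte[] dataTableName,
            List<byte[]> dataRowKeys) {
        List<byte[]> indexRowKeys = new ArrayList<>(dataRowKeys.size());
        for (byte[] rowKey : dataRowKeys) {
            // No table metadata is consulted here, only the maintainer.
            indexRowKeys.add(maintainer.indexRowKey(rowKey));
        }
        return indexRowKeys;
    }
}
```

Nothing in the sketch depends on table metadata, which is the property the signature change is after.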

We can tweak the code that makes PhoenixTransactionalIndexer a no-op for clients >= 4.14
so that it still executes if the index is a local index. The one downside is that if there's a mix
of local and global indexes on the same table, the index update calculation will be done twice.
I think having a mix of index types would be rare, though, and we should advise against it.
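
The intended gating might look roughly like the following. This is a sketch of the rule described above, not the actual PhoenixTransactionalIndexer code; the class name, method names, and the representation of the client version as two ints are all assumptions made for illustration.

```java
// Hypothetical illustration of the gating rule; not Phoenix code.
final class TxnIndexerGateSketch {
    // Clients at 4.14 or later generate index updates themselves (per PHOENIX-4278).
    static boolean clientGeneratesIndexUpdates(int clientMajor, int clientMinor) {
        return clientMajor > 4 || (clientMajor == 4 && clientMinor >= 14);
    }

    // The server-side indexer runs for older clients, and for local indexes
    // regardless of client version.
    static boolean serverShouldProcess(int clientMajor, int clientMinor, boolean isLocalIndex) {
        return isLocalIndex || !clientGeneratesIndexUpdates(clientMajor, clientMinor);
    }
}
```

With a 4.14 client and a table carrying both index types, the client computes the global index updates and the server computes the local ones, which is the double calculation mentioned above.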

There's also this code in UngroupedAggregateRegionObserver which needs to be updated to write
shadow cells for Omid:
{code}
                        } else if (buildLocalIndex) {
                            for (IndexMaintainer maintainer : indexMaintainers) {
                                if (!results.isEmpty()) {
                                    result.getKey(ptr);
                                    ValueGetter valueGetter =
                                            maintainer.createGetterFromKeyValues(
                                                ImmutableBytesPtr.copyBytesIfNecessary(ptr),
                                                results);
                                    Put put = maintainer.buildUpdateMutation(kvBuilder,
                                        valueGetter, ptr, results.get(0).getTimestamp(),
                                        env.getRegion().getRegionInfo().getStartKey(),
                                        env.getRegion().getRegionInfo().getEndKey());
                                    indexMutations.add(put);
                                }
                            }
                            result.setKeyValues(results);
{code}
This is the code that builds a local index initially (unlike the global index code path, which
executes an UPSERT SELECT on the client side to do this initial population).

  was:
For local indexes, we'll want to continue to process updates on the server side. After PHOENIX-4278,
updates even for local indexes are occurring on the client side. The reason to keep them on the
server is that we know the updates to the index table will be a local write, so we can generate
the write on the server side. Having a separate RPC and sending the updates across the wire would
be tremendously inefficient. On top of that, we need the region boundary information, which we
already have in the coprocessor but would need to retrieve on the client side (with a likely race
condition too if a split occurs after we retrieve it).

To fix this, we need to modify PhoenixTxnIndexMutationGenerator so that it can be used on
the server side as well. The main change will be to alter this method signature to pass through
an IndexMaintainer instead of a PTable (which isn't available on the server side):
{code}
    public List<Mutation> getIndexUpdates(final PTable table, PTable index,
            List<Mutation> dataMutations) throws IOException, SQLException {
{code}
I think this can be changed to the following instead and be used on both the client and server side:
{code}
    public List<Mutation> getIndexUpdates(final IndexMaintainer maintainer, byte[] dataTableName,
            List<Mutation> dataMutations) throws IOException, SQLException {
{code}

We can tweak the code that makes PhoenixTransactionalIndexer a no-op for clients >= 4.14
so that it still executes if the index is a local index. The one downside is that if there's a mix
of local and global indexes on the same table, the index update calculation will be done twice.
I think having a mix of index types would be rare, though, and we should advise against it.


> Process transactional updates to local index on server-side
> -----------------------------------------------------------
>
>                 Key: PHOENIX-4619
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4619
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>            Priority: Major
>



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
