phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4130) Avoid server retries for mutable indexes
Date Thu, 07 Dec 2017 01:59:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281234#comment-16281234 ]

James Taylor commented on PHOENIX-4130:
---------------------------------------

Thanks for the patch, [~vincentpoon]. I like this approach. A few comments/questions:
- Once the index is put in a PENDING_ACTIVE state, who puts it back into an ACTIVE state if
the client retries succeed? Are you counting on the partial index rebuilder in MetaDataRegionObserver
to do that? The downside of that is that we'd replay some edits that have already been processed,
but if we expect this to be rare perhaps that's ok. An alternative would be for Phoenix to
handle the client retries and set the index status back to ACTIVE if they succeed. I'm not
positive, but we may be able to detect from the client that retries were done and then we
could potentially set the index back to ACTIVE and clear the INDEX_DISABLE_TIMESTAMP (FYI,
see below for a util function that does this).
- One good side effect of putting the index to a PENDING_ACTIVE state is that if the client
were to crash during the retries, the partial index rebuilder would do the replay and eventual
disable if the replay took too long. If this were to happen, though, do you think the lag
time is too long to continue to use the index even though it's out of sync with the client
(up to 45 minutes while previously it would have been ~15 seconds)? Another (though more complex)
alternative would be to invent another state like PENDING_DISABLE which would act the same
as PENDING_ACTIVE except on the client we could treat the index as disabled if the index stays
in this state longer than a configurable amount of time (i.e. the same amount of time we were
willing to retry on the server side). This would be a much shorter window in this case than
the default 45 minutes of keeping the index active until the partial index rebuilder gives
up and marks it as permanently disabled.
- One corner case that's not handled is when we're updating indexes within the region server
(RS). This can occur as an optimization when auto commit is on and a DML statement is operating
on a single table (e.g. DELETE FROM T). In this case, the commits occur in UngroupedAggregateRegionObserver.commit(),
unfortunately outside of the abstraction we have in MutationState.
- We'll want to document that the QueryServices.INDEX_FAILURE_DISABLE_INDEX config property
needs to be set both on the client and server side as we're relying on that to determine both
the client behaviour and the server behaviour (where before it was only used on the server
side). Not a huge deal, but if we can prevent needing to check it client side (maybe based
on information thrown back in the exception?), then that'd be an improvement IMHO.
- You might want to consider using the IndexUtil.updateIndexState(conn, fullIndexTableName,
PIndexState.DISABLE, 0L) method to disable the index instead of using the one you wrote. If
you use yours, then you'd want to make sure to put the schema name, table name, and index
name in double quotes in case they're case sensitive (as otherwise you'd get a TableNotFoundException).
{code}
+    private void disableIndex(String dataTableFullName, String indexName) throws SQLException {
+        logger.info(
+            "Disabling index after hitting max number of index write retries: " + indexName);
+        String disableIndexDDL = "ALTER INDEX %s ON %s DISABLE";
+        connection.createStatement().execute(String.format(disableIndexDDL,
+            SchemaUtil.getTableNameFromFullName(indexName), dataTableFullName));
+    }
{code}
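
To illustrate the quoting point in the last comment, here is a minimal sketch that builds the same ALTER INDEX statement with each name part wrapped in double quotes. The class and helper names (QuotedIndexDdl, quote, quoteFullName, disableIndexDdl) are hypothetical and not part of the patch or of Phoenix. It assumes the index name is already stripped of its schema and that the names themselves contain no dots or double quotes.

```java
public class QuotedIndexDdl {
    // Hypothetical helper: wraps one identifier in double quotes so its case is preserved.
    // Assumes the identifier itself contains no double quotes.
    public static String quote(String identifier) {
        return "\"" + identifier + "\"";
    }

    // Hypothetical helper: quotes "SCHEMA.TABLE" as "SCHEMA"."TABLE".
    // Assumes at most one dot, separating schema from table.
    public static String quoteFullName(String fullName) {
        int dot = fullName.indexOf('.');
        if (dot < 0) {
            return quote(fullName);
        }
        return quote(fullName.substring(0, dot)) + "." + quote(fullName.substring(dot + 1));
    }

    // Builds the DISABLE statement with every name part quoted.
    public static String disableIndexDdl(String dataTableFullName, String indexName) {
        return String.format("ALTER INDEX %s ON %s DISABLE",
            quote(indexName), quoteFullName(dataTableFullName));
    }

    public static void main(String[] args) {
        // Mixed-case names would presumably not resolve if left unquoted.
        System.out.println(disableIndexDdl("mySchema.myTable", "myIndex"));
    }
}
```

Using IndexUtil.updateIndexState as suggested above would avoid building DDL strings altogether.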

> Avoid server retries for mutable indexes
> ----------------------------------------
>
>                 Key: PHOENIX-4130
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4130
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Lars Hofhansl
>            Assignee: Vincent Poon
>             Fix For: 4.14.0
>
>         Attachments: PHOENIX-4130.v1.master.patch
>
>
> Had some discussions with [~jamestaylor], [~samarthjain], and [~vincentpoon], during
> which I suggested that we can possibly eliminate retry loops happening at the server that
> cause the handler threads to be stuck potentially for quite a while (at least multiple seconds
> to ride over common scenarios like splits).
> Instead we can do the retries at the Phoenix client.
> So:
> # The index updates are not retried on the server. (retries = 0)
> # A failed index update would set the failed index timestamp but leave the index enabled.
> # Now that the handler thread is done, it throws an appropriate exception back to the client.
> # The Phoenix client can now retry. When those retries fail, the index is disabled (if
> the policy dictates that) and the exception is thrown back to the caller.
> No more waiting is needed on the server, so handler threads are freed immediately.
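
The client-side flow described in the steps above could look roughly like the sketch below. It is not the patch's code: ClientIndexRetrySketch, Commit, IndexDisabler, and commitWithRetries are hypothetical names used only for illustration, and a real client would presumably also check that the failure is an index write failure and back off between attempts.

```java
import java.sql.SQLException;

public class ClientIndexRetrySketch {
    /** Hypothetical stand-in for a client commit that may fail on an index write. */
    public interface Commit {
        void run() throws SQLException;
    }

    /** Hypothetical stand-in for whatever mechanism disables the index. */
    public interface IndexDisabler {
        void disable() throws SQLException;
    }

    /**
     * Retries the commit on the client. Once the retries are exhausted, disables the
     * index if the policy says so, then rethrows the last failure to the caller.
     * Returns the number of retries that were needed before the commit succeeded.
     */
    public static int commitWithRetries(Commit commit, int maxRetries,
            boolean disableIndexOnFailure, IndexDisabler disabler) throws SQLException {
        int retries = 0;
        while (true) {
            try {
                commit.run();
                return retries;
            } catch (SQLException e) {
                if (retries >= maxRetries) {
                    if (disableIndexOnFailure) {
                        disabler.disable();
                    }
                    throw e;
                }
                retries++;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        int[] calls = {0};
        // Simulated commit: fails twice, then succeeds.
        int retries = commitWithRetries(() -> {
            if (calls[0]++ < 2) {
                throw new SQLException("simulated index write failure");
            }
        }, 3, true, () -> System.out.println("index disabled"));
        System.out.println("retries needed: " + retries);
    }
}
```

Because the loop runs on the client, the server handler thread returns right after the first failure.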



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
