hbase-issues mailing list archives

From "Matteo Bertozzi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16507) Procedure v2 - Force DDL operation to always roll forward
Date Thu, 01 Sep 2016 16:58:20 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455992#comment-15455992 ]

Matteo Bertozzi commented on HBASE-16507:
-----------------------------------------

The whole point of the proc-v2 work is to remove conflict situations via a single writer/coordinator.

In this DDL case, rollback is not really being removed because of technical difficulties,
aside from the fact that some implementations (enable/disable/modify) are wrong and can end up
causing corruption. The point is that DDLs are operations that will always be able to
complete. There may be transient failures along the way (e.g. zk/hdfs/meta hiccups), but as
soon as those are resolved and the master is fully operational again, the operation will be
able to complete.

At the moment we roll back on transient failures, which leaves users confused about why the
operation failed when a simple retry succeeds. That ends up in support calls and requests
asking why the system can't handle it itself, since it was just a transient failure.

RS puts behave with a roll-forward approach like the one in this patch. Once the operation
is written to the WAL, we know that it will complete (at some point) no matter how many
transient failures we get.
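
As a rough illustration of the roll-forward idea above, here is a minimal Python sketch. It is
not HBase code: `TransientError`, `run_roll_forward`, `flaky` and the step names are hypothetical,
and a plain list stands in for the WAL. The retry cap exists only so the sketch terminates; a
real coordinator would presumably keep retrying with backoff.

```python
class TransientError(Exception):
    """Hypothetical stand-in for a temporary zk/hdfs/meta hiccup."""


def run_roll_forward(steps, log, max_attempts_per_step=100):
    """Run each step until it succeeds; never undo a completed step.

    `log` plays the role of a write-ahead log: once a step name is
    recorded there, the step is not executed again (e.g. after a restart).
    """
    for name, action in steps:
        if name in log:
            continue  # already completed earlier, skip it
        for _ in range(max_attempts_per_step):
            try:
                action()
            except TransientError:
                continue  # retry the same step instead of rolling back
            log.append(name)  # record completion, then move forward
            break
        else:
            # Only reached when every attempt failed (cap is for the sketch).
            raise RuntimeError(f"step {name!r} still failing after retries")
    return log


def flaky(failures):
    """Build an action that fails `failures` times, then succeeds."""
    remaining = [failures]

    def action():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError("temporary outage")

    return action
```

For example, `run_roll_forward([("prepare", flaky(0)), ("update_meta", flaky(3))], [])` retries
the second step through its three failures and records both steps as done, without ever
undoing the first.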

Rollbacks in a traditional database are triggered by situations where the operation is stuck
due to conflicts and will never be able to complete. In this case our DDLs will always
be able to complete.

The only case where rollback can apply to DDLs here is when the user requests an abort
of the operation. But DDLs are too short in duration for the user to be able to abort them
once started. Note that in the attached patch the user is able to roll back the DDL operation
up to the first prepare step, which is the most likely case where you want to abort, e.g.
your operation is stuck behind another operation that is making slow progress and you decide
to abort it.
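
The abort rule described above can be sketched as a small state check. This is a hypothetical
Python model, not the patch itself: the `State` names and the `DdlOperation` class are invented
for illustration, and it assumes an abort is accepted only while the prepare step has not yet
completed.

```python
from enum import Enum


class State(Enum):
    QUEUED = 0    # waiting, e.g. stuck behind a slower operation
    PREPARED = 1  # first prepare step done; from here on, roll forward only
    DONE = 2
    ABORTED = 3


class DdlOperation:
    """Hypothetical model of a DDL that can be aborted only before prepare."""

    def __init__(self):
        self.state = State.QUEUED

    def prepare(self):
        self.state = State.PREPARED

    def finish(self):
        self.state = State.DONE

    def abort(self):
        """Return True if the abort was accepted, False if it is too late."""
        if self.state is State.QUEUED:
            self.state = State.ABORTED
            return True
        return False  # past the prepare step: the operation will complete
```

An operation still in `QUEUED` accepts `abort()`, while one that has already run `prepare()`
rejects it and stays on the roll-forward path.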

Long operations like assignment and snapshot/backups will have rollback support, because they
are long and because, for "technical difficulties", they may end up stuck in conflicts.

> Procedure v2 - Force DDL operation to always roll forward
> ---------------------------------------------------------
>
>                 Key: HBASE-16507
>                 URL: https://issues.apache.org/jira/browse/HBASE-16507
>             Project: HBase
>          Issue Type: Sub-task
>          Components: master, proc-v2
>    Affects Versions: 2.0.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-16507-v0.patch, HBASE-16507-v1.patch
>
>
> Having rollback for DDLs was a bad idea. 
> and it turns out to be an unexpected behavior for the user. 
> DDLs only have transient errors (e.g. zk, hdfs, meta down)
> if we abort/rollback on a transient failure the user will get a failure,
> and it is not clear why the user needs to retry the command when the system can do that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
