hive-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-20967) Handle alter events when replicate to cluster with hive.strict.managed.tables enabled.
Date Wed, 08 May 2019 04:11:01 GMT

     [ https://issues.apache.org/jira/browse/HIVE-20967?focusedWorklogId=238980&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-238980 ]

ASF GitHub Bot logged work on HIVE-20967:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/May/19 04:10
            Start Date: 08/May/19 04:10
    Worklog Time Spent: 10m 
      Work Description: maheshk114 commented on pull request #613: HIVE-20967
URL: https://github.com/apache/hive/pull/613#discussion_r281906966
 
 

 ##########
 File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 ##########
 @@ -805,22 +811,89 @@ public Partition alterPartition(RawStore msdb, Warehouse wh, String catName, Str
     return oldParts;
   }
 
-  private void checkTableTypeConversion(Database db, Table oldTbl, Table newTbl)
+  // Validate changes to partition's location to protect against errors on migration during
+  // replication
+  private void blockPartitionLocationChangesOnReplSource(Database db, Table tbl,
+                                                         EnvironmentContext ec)
+          throws InvalidOperationException {
+    // If the database is not replication source, nothing to do
+    if (!ReplChangeManager.isSourceOfReplication(db)) {
+      return;
+    }
+
+    // For now validate only the changes when strict managed tables is false. That's when there's
+    // scope for migration during replication, at least for now.
+    if (conf.getBoolean(MetastoreConf.ConfVars.STRICT_MANAGED_TABLES.getHiveName(), false)) {
+      return;
+    }
+
+    // Do not allow changing location of a managed table as as alter event doesn't capture the
+    // new files list. So, it may cause data inconsistency.
+    boolean isChangingLocation = false;
+    if (ec.isSetProperties()) {
+      String alterType = ec.getProperties().get(ALTER_TABLE_OPERATION_TYPE);
+      if (alterType != null && alterType.equalsIgnoreCase(ALTERLOCATION)) {
+        isChangingLocation = true;
+      }
+    }
+    if (isChangingLocation &&
+            tbl.getTableType().equalsIgnoreCase(TableType.MANAGED_TABLE.name())) {
+      throw new InvalidOperationException("Cannot change location of a managed table " +
+              TableName.getQualified(tbl.getCatName(),
+                      tbl.getDbName(), tbl.getTableName()) + " as it is enabled for replication.");
+    }
+  }
+
+  // Validate changes to a table to protect against errors on migration during replication.
+  private void validateTableChangesOnReplSource(Database db, Table oldTbl, Table newTbl,
+                                                EnvironmentContext ec)
           throws InvalidOperationException {
-    // If the given DB is enabled for replication and strict managed is false, then table type cannot be changed.
-    // This is to avoid migration scenarios which causes Managed ACID table to be converted to external at replica.
-    // As ACID tables cannot be converted to external table and vice versa, we need to restrict this conversion at
-    // primary as well.
-    // Currently, table type conversion is allowed only between managed and external table types.
-    // But, to be future proof, any table type conversion is restricted on a replication enabled DB.
-    if (!conf.getBoolean(MetastoreConf.ConfVars.STRICT_MANAGED_TABLES.getHiveName(), false)
-        && !oldTbl.getTableType().equalsIgnoreCase(newTbl.getTableType())
-        && ReplChangeManager.isSourceOfReplication(db)) {
+    // If the database is not replication source, nothing to do
+    if (!ReplChangeManager.isSourceOfReplication(db)) {
+      return;
+    }
+
+    // Do not allow changing location of a managed table as as alter event doesn't capture the
+    // new files list. So, it may cause data inconsistency. We do this whether or not strict
+    // managed is true on the source cluster.
+    if (ec.isSetProperties()) {
+        String alterType = ec.getProperties().get(ALTER_TABLE_OPERATION_TYPE);
+        if (alterType != null && alterType.equalsIgnoreCase(ALTERLOCATION) &&
+            oldTbl.getTableType().equalsIgnoreCase(TableType.MANAGED_TABLE.name())) {
+          throw new InvalidOperationException("Cannot change location of a managed table " +
+                  TableName.getQualified(oldTbl.getCatName(),
+                          oldTbl.getDbName(), oldTbl.getTableName()) + " as it is enabled for replication.");
+        }
+    }
+
+    // Rest of the changes are need validation only when strict managed tables is false. That's
 
 Review comment:
   "changes are need" to "changes need"
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 238980)
    Time Spent: 1h  (was: 50m)

> Handle alter events when replicate to cluster with hive.strict.managed.tables enabled.
> --------------------------------------------------------------------------------------
>
>                 Key: HIVE-20967
>                 URL: https://issues.apache.org/jira/browse/HIVE-20967
>             Project: Hive
>          Issue Type: Sub-task
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: Ashutosh Bapat
>            Priority: Minor
>              Labels: DR, pull-request-available
>         Attachments: HIVE-20967.01.patch, HIVE-20967.03.patch, HIVE-20967.03.patch, HIVE-20967.04.patch, HIVE-21678.02.patch
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Some of the events from Hive2 may cause conflicts when applied in Hive3 (hive.strict.managed.tables=true), so they need to be handled properly.
> 1. Alter table to convert non-acid to acid.
>  - Do not allow this conversion on the source of replication if hive.strict.managed.tables is false.
> 2. Alter table or partition that changes the location.
>  - For managed tables at source, the table location shouldn't be changed for a non-partitioned table, and the partition location shouldn't be changed for a partitioned table, because the alter event doesn't capture the new files list and this may cause data inconsistency. So, if the database is enabled for replication at source, alter location on managed tables should be blocked.
>  - For external partitioned tables, if the location is changed at source, the location should be changed for the table and for any partitions which reside within the table location, but not for partitions which are outside the table location. (Maybe we just need the test.)
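
The external-table rule in the last bullet can be illustrated with a small sketch. This is not Hive code: the class ExternalLocationRebase and its methods are hypothetical, locations are treated as plain path strings rather than Hadoop Path objects, and the sketch only assumes that "within the table location" means the partition path sits at or below the old table directory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: plain-string paths, no Hive or Hadoop classes.
class ExternalLocationRebase {

  // Drop trailing slashes so "/wh/t/" and "/wh/t" compare equal.
  static String normalize(String path) {
    String p = path;
    while (p.length() > 1 && p.endsWith("/")) {
      p = p.substring(0, p.length() - 1);
    }
    return p;
  }

  // True if child is the parent directory itself or lies somewhere below it.
  static boolean isWithin(String parent, String child) {
    return child.equals(parent) || child.startsWith(parent + "/");
  }

  // Partitions under the old table location follow the table to its new
  // location; partitions stored elsewhere keep their current location.
  static List<String> rebasePartitions(String oldTblLoc, String newTblLoc,
                                       List<String> partLocs) {
    String oldRoot = normalize(oldTblLoc);
    String newRoot = normalize(newTblLoc);
    List<String> result = new ArrayList<>();
    for (String loc : partLocs) {
      String part = normalize(loc);
      if (isWithin(oldRoot, part)) {
        result.add(newRoot + part.substring(oldRoot.length()));
      } else {
        result.add(part);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> moved = rebasePartitions("/warehouse/ext/t1", "/data/new/t1",
        Arrays.asList("/warehouse/ext/t1/ds=2019-05-07", "/archive/t1/ds=2018-01-01"));
    System.out.println(moved);
  }
}
```

The prefix check requires a path separator after the old table location, so a sibling directory such as /warehouse/ext/t10 is not mistaken for a child of /warehouse/ext/t1.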



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
