hive-issues mailing list archives

From "Ashutosh Bapat (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-21893) Handle concurrent write + drop when ACID tables are getting bootstrapped.
Date Fri, 05 Jul 2019 10:36:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-21893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879131#comment-16879131
] 

Ashutosh Bapat commented on HIVE-21893:
---------------------------------------

[~sankarh], these two issues can happen even in the case of a normal bootstrap for a new policy,
not just in the bootstrap that runs during the incremental phase. In any case, here's my analysis
of the problematic cases.

The key point here is the following comment in org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask#getValidTxnListForReplDump():
{code:java}
// Key design point for REPL DUMP is to not have any txns older than current txn in which
// dump runs. This is needed to ensure that Repl dump doesn't copy any data files written by
// any open txns mainly for streaming ingest case where one delta file shall have data from
// committed/aborted/open txns. It may also have data inconsistency if the on-going txns
// doesn't have corresponding open/write events captured which means, catch-up incremental
// phase won't be able to replicate those txns. So, the logic is to wait for the given amount
// of time to see if all open txns < current txn is getting aborted/committed. If not, then
// we forcefully abort those txns just like AcidHouseKeeperService.{code}
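The wait-then-force-abort behaviour described in that comment can be sketched as follows. This is a minimal, self-contained simulation for discussion only; the class and method names (ReplDumpTxnFilter, abortOldOpenTxns, the TxnState enum) are hypothetical stand-ins, not the real Hive TxnStore API:

{code:java}
import java.util.*;

// Hypothetical simulation of the design point above: after the configured
// wait period expires, any txn still OPEN and older than the REPL DUMP's
// own txn is forcefully aborted, mirroring what AcidHouseKeeperService
// does for timed-out transactions. Not real Hive code.
class ReplDumpTxnFilter {
    enum TxnState { OPEN, COMMITTED, ABORTED }

    // Aborts every txn that is still OPEN and older than currentTxnId,
    // and returns the set of txn ids that were aborted.
    static Set<Long> abortOldOpenTxns(Map<Long, TxnState> txns, long currentTxnId) {
        Set<Long> aborted = new TreeSet<>();
        for (Map.Entry<Long, TxnState> e : txns.entrySet()) {
            if (e.getKey() < currentTxnId && e.getValue() == TxnState.OPEN) {
                e.setValue(TxnState.ABORTED);  // force-abort, as the comment says
                aborted.add(e.getKey());
            }
        }
        return aborted;
    }
}
{code}

After this step the REPL DUMP snapshot contains no open txns older than its own, which is what makes the bootstrapped data self-consistent.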
 

 Case 1
{quote}If Step-11 happens between Step-1 and Step-2. Also, Step-13 completes before we forcefully
abort Tx2 from REPL DUMP thread T1. Also, assume Step-14 is done after bootstrap is completed.
In this case, bootstrap would replicate the data/writeId written by Tx2. But, the next incremental
cycle would also replicate the open_txn, allocate_writeid and commit_txn events which would
duplicate the data.
{quote}
If step-11 happens between step-1 and step-2, that by itself can cause multiple problems, because
the open transaction event is replayed twice (once during bootstrap and once during the next
incremental), causing the writeIds on the target to go out of sync with the source. A better
solution would be to combine setLastReplIdForDump() and openTransaction() in Driver.compile() for
the REPL DUMP case. We should let openTransaction() return the eventId of the open transaction
event of the REPL DUMP, and set that eventId as the lastReplIdForDump. The next incremental
dump will then start from the events following this open transaction event.
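The proposal above can be sketched with a toy event log. Everything here is illustrative (EventLog, openTransactionForReplDump are made-up names, not Hive APIs); the point is only that opening the dump transaction atomically yields the id of its own OPEN_TXN event, which then serves as lastReplIdForDump:

{code:java}
import java.util.*;

// Hypothetical sketch: opening the REPL DUMP transaction returns the
// notification-log id of its own OPEN_TXN event. The next incremental
// dump starts strictly after that id, so the open-txn event can never
// be replayed twice. Not real Hive code.
class ReplDumpBootstrap {
    static final class EventLog {
        private final List<String> events = new ArrayList<>();

        // Appends an event and returns its 1-based event id.
        long append(String event) { events.add(event); return events.size(); }

        // Events strictly after the given event id (the incremental window).
        List<String> since(long eventId) {
            return events.subList((int) eventId, events.size());
        }
    }

    // Combines openTransaction() and setLastReplIdForDump(): the returned
    // event id is used directly as lastReplIdForDump.
    static long openTransactionForReplDump(EventLog log) {
        return log.append("OPEN_TXN(replDump)");
    }
}
{code}

Because the OPEN_TXN event id and the dump snapshot boundary are now the same value, there is no window in which a concurrent open-txn event can land between them.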

With that we prohibit step-11 from happening between step-1 and step-2, so step-11 can
happen either after step-2 or before step-1.
 # If it happens after step-2, it will not be recorded in the snapshot taken by REPL DUMP, so
the changes within that transaction will not be replicated during bootstrap. The next
incremental will replicate the events.

 # If step-11 happens before step-1 and the transaction commits before we start the dump, its
changes will be replicated during bootstrap, since that transaction is visible to the REPL DUMP
transaction. If the alloc_writeId event is idempotent for a given transaction on the source,
then once the open transaction event has been replicated as part of bootstrap, the same writeId
will be allocated however many times the alloc_writeId event is replayed, keeping the writeIds
on source and target in sync. Any files written will be marked with the same writeId, so copying
them multiple times will not duplicate data. So there is no correctness issue in this case
either.
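The idempotency argument in point 2 can be made concrete with a small sketch. This WriteIdAllocator is a made-up illustration of the property being assumed (not Hive's TxnHandler): replaying ALLOC_WRITEID for the same (txnId, table) pair must always yield the same writeId:

{code:java}
import java.util.*;

// Hypothetical sketch of an idempotent ALLOC_WRITEID replay: the first
// allocation for a (txnId, table) pair is remembered, and every replay
// of the same event returns the cached writeId instead of allocating a
// new one. Files re-copied under the same writeId cannot duplicate data.
class WriteIdAllocator {
    private final Map<String, Long> allocated = new HashMap<>();   // (txn:table) -> writeId
    private final Map<String, Long> nextWriteId = new HashMap<>(); // table -> high-water mark

    long allocWriteId(long txnId, String table) {
        String key = txnId + ":" + table;
        // computeIfAbsent makes the allocation idempotent per (txn, table).
        return allocated.computeIfAbsent(key,
            k -> nextWriteId.merge(table, 1L, Long::sum));
    }
}
{code}

If the target-side allocation behaves like this, replaying alloc_writeId any number of times is harmless, which is exactly what keeps bootstrap plus catch-up incremental consistent.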

Case 2
{quote}If Step-11 to Step-14 in Thread T2 happens after Step-1 in REPL DUMP thread T1. In
this case, table is not bootstrapped but the corresponding open_txn, allocate_writeid, commit_txn
and drop events would be replicated in next cycle. During next cycle, REPL LOAD would fail
on commitTxn event as table is dropped or event is missing.
{quote}
If step-11 to step-14 happen before step-1, they will be covered by the bootstrap itself and
will not appear in the incremental. I think you meant that step-14 happens before step-4, so
the table is not bootstrapped, but any events after the open transaction are part of the next
incremental.

This case is covered by test org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTables#testAcidTablesBootstrapWithConcurrentDropTable().

In this case, the ALTER TABLE event created by the INSERT operation is converted to a CreateTable
on the target, so at the time of commit the table is visible; it is then removed by the subsequent
drop event. So there is no correctness issue here as well.
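The replay order for Case 2 can be sketched as below. The IncrementalReplayer class and the event names are illustrative only, sketching the behaviour described above under the assumption that the target converts the ALTER TABLE event into a CreateTable before the commit is applied:

{code:java}
import java.util.*;

// Hypothetical replay sketch for Case 2: the ALTER_TABLE event produced
// by the INSERT is converted to a CreateTable on the target, so the later
// COMMIT_TXN finds the table present; the subsequent DROP event removes
// it. Not real Hive REPL LOAD code.
class IncrementalReplayer {
    private final Set<String> tables = new HashSet<>();

    void apply(String event, String table) {
        switch (event) {
            case "ALTER_TABLE":   // converted to CreateTable on the target
            case "CREATE_TABLE":
                tables.add(table);
                break;
            case "COMMIT_TXN":
                if (!tables.contains(table))
                    throw new IllegalStateException("table missing: " + table);
                break;
            case "DROP_TABLE":
                tables.remove(table);
                break;
        }
    }

    boolean exists(String table) { return tables.contains(table); }
}
{code}

With this conversion the COMMIT_TXN never observes a missing table, which is why the test above passes.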

> Handle concurrent write + drop when ACID tables are getting bootstrapped.
> -------------------------------------------------------------------------
>
>                 Key: HIVE-21893
>                 URL: https://issues.apache.org/jira/browse/HIVE-21893
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: DR, Replication
>
> ACID tables will be bootstrapped during incremental phase in couple of cases. 
> 1. hive.repl.bootstrap.acid.tables is set to true in WITH clause of REPL DUMP.
> 2. If replication policy is changed using REPLACE clause in REPL DUMP where the ACID table is matching new policy but not old policy.
> REPL DUMP performed below sequence of operations. Let's say Thread (T1)
> 1. Get Last Repl ID (lastId)
> 2. Open Transaction (Tx1)
> 3. Dump events until lastId.
> 4. Get the list of tables in the given DB.
> 5. If table matches current policy, then bootstrap dump it.
> Let's say, concurrently another thread (let's say T2) is running as follows.
> 11. Open Transaction (Tx2).
> 12. Insert into ACID table Tbl1.
> 13. Commit Transaction (Tx2)
> 14. Drop table (Tbl1) --> Not necessarily same thread, may be from different thread as well.
> *Problematic Use-cases:*
> 1. If Step-11 happens between Step-1 and Step-2. Also, Step-13 completes before we forcefully abort Tx2 from REPL DUMP thread T1. Also, assume Step-14 is done after bootstrap is completed. In this case, bootstrap would replicate the data/writeId written by Tx2. But, the next incremental cycle would also replicate the open_txn, allocate_writeid and commit_txn events which would duplicate the data.
> 2. If Step-11 to Step-14 in Thread T2 happens after Step-1 in REPL DUMP thread T1. In this case, table is not bootstrapped but the corresponding open_txn, allocate_writeid, commit_txn and drop events would be replicated in next cycle. During next cycle, REPL LOAD would fail on commitTxn event as table is dropped or event is missing.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
