kudu-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From a...@apache.org
Subject [1/2] kudu git commit: KUDU-1678: Race during abort of pending operations during raft shutdown
Date Wed, 24 Oct 2018 17:55:03 GMT
Repository: kudu
Updated Branches:
  refs/heads/master b22138783 -> fff0da840


KUDU-1678: Race during abort of pending operations during raft shutdown

When a tablet replica is shutting down, the following race can occur:

0: The replica receives an ALTER_SCHEMA op adding the column 'foo'.
1: The replica receives a WRITE_OP inserting a row with column 'foo'
   present.
2: The replica starts to abort its pending operations because it is
   shutting down. The ALTER_SCHEMA is aborted.
3: Before the WRITE_OP can be aborted, it replicates.
4: The tablet server crashes as a result, with a message like:

F1023 20:49:18.088703  1409 transaction_driver.cc:382] T 3fe5651e31d8486780d0e480f8748ead
P 12e38846d8c648df8a11b9d8da2ad407 S R-NP Ts 6309182492379934720: Cannot cancel transactions
that have already replicated: Invalid argument: Client provided column c587 INT32 NOT NULL
not present in tablet

The tablet server will crash with this same message every time it tries
to bootstrap. Other tablet servers hosting replicas may crash if they
replicated the bad write.

The problem is that transactions need to be aborted in reverse order of
their index, since later transactions may depend on earlier ones.

This bug reproduced about 1% of the time in alter_table-randomized-test
when run in DEBUG mode with 16 stress threads: I saw it 10 times in 1000
runs. With this change, I saw 0 failures in 1000 runs.

Change-Id: Idde75bd1fe966a1a3d53aa1e5de6a01a48ff1103
Reviewed-on: http://gerrit.cloudera.org:8080/11759
Reviewed-by: Alexey Serbin <aserbin@cloudera.com>
Tested-by: Will Berkeley <wdberkeley@gmail.com>


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/d1f8c23b
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/d1f8c23b
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/d1f8c23b

Branch: refs/heads/master
Commit: d1f8c23b45492c3f1f49bf8cac2ee4d7e32f116f
Parents: b221387
Author: Will Berkeley <wdberkeley@gmail.org>
Authored: Tue Oct 23 14:12:10 2018 -0700
Committer: Will Berkeley <wdberkeley@gmail.com>
Committed: Tue Oct 23 23:25:20 2018 +0000

----------------------------------------------------------------------
 src/kudu/consensus/pending_rounds.cc            |  8 +++---
 .../alter_table-randomized-test.cc              | 28 +++++++++++++-------
 2 files changed, 23 insertions(+), 13 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/d1f8c23b/src/kudu/consensus/pending_rounds.cc
----------------------------------------------------------------------
diff --git a/src/kudu/consensus/pending_rounds.cc b/src/kudu/consensus/pending_rounds.cc
index f334be8..940eb05 100644
--- a/src/kudu/consensus/pending_rounds.cc
+++ b/src/kudu/consensus/pending_rounds.cc
@@ -17,6 +17,7 @@
 
 #include "kudu/consensus/pending_rounds.h"
 
+#include <iterator>
 #include <ostream>
 #include <utility>
 
@@ -62,11 +63,12 @@ Status PendingRounds::CancelPendingTransactions() {
 
   LOG_WITH_PREFIX(INFO) << "Trying to abort " << pending_txns_.size()
                         << " pending transactions.";
-  for (const auto& txn : pending_txns_) {
-    const scoped_refptr<ConsensusRound>& round = txn.second;
+  // Abort transactions in reverse index order. See KUDU-1678.
+  for (auto txn = pending_txns_.crbegin(); txn != pending_txns_.crend(); ++txn) {
+    const scoped_refptr<ConsensusRound>& round = txn->second;
     // We cancel only transactions whose applies have not yet been triggered.
     LOG_WITH_PREFIX(INFO) << "Aborting transaction as it isn't in flight: "
-                                   << SecureShortDebugString(*txn.second->replicate_msg());
+                          << SecureShortDebugString(*round->replicate_msg());
     round->NotifyReplicationFinished(Status::Aborted("Transaction aborted"));
   }
   return Status::OK();

http://git-wip-us.apache.org/repos/asf/kudu/blob/d1f8c23b/src/kudu/integration-tests/alter_table-randomized-test.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/alter_table-randomized-test.cc b/src/kudu/integration-tests/alter_table-randomized-test.cc
index e1739e8..7b6cd78 100644
--- a/src/kudu/integration-tests/alter_table-randomized-test.cc
+++ b/src/kudu/integration-tests/alter_table-randomized-test.cc
@@ -690,17 +690,25 @@ struct MirrorTable {
   TableState ts_;
 };
 
-// Stress test for various alter table scenarios. This performs a random sequence of:
-//   - insert a row (using the latest schema)
-//   - delete a random row
-//   - update a row (all columns with the latest schema)
-//   - add a new column
-//   - drop a column
-//   - restart the tablet server
+// Stress test for various alter table scenarios. This performs a random
+// sequence of the following operations:
+//   - Restart the master
+//   - Restart a random tablet server
+//   - Alter the table
+//     - Add a column
+//     - Drop a column
+//     - Add a range partition
+//     - Drop a range partition
+//     - Rename a column (including primary key columns)
+//     - Change a column's default value
+//     - Change the encoding or compression of a column
+//   - Insert a row (using the latest schema)
+//   - Delete a random row
+//   - Update a row (all columns with the latest schema)
 //
-// During the sequence of operations, a "mirror" of the table in memory is kept up to
-// date. We periodically scan the actual table, and ensure that the data in Kudu
-// matches our in-memory "mirror".
+// During the sequence of operations, a "mirror" of the table in memory is kept
+// up to date. We periodically scan the actual table, and ensure that the data
+// in Kudu matches our in-memory "mirror".
 TEST_P(AlterTableRandomized, TestRandomSequence) {
   MirrorTable t(client_);
   ASSERT_OK(t.Create());


Mime
View raw message