kudu-commits mailing list archives

From a...@apache.org
Subject kudu git commit: raft_consensus-itest: don't assume SIGSTOP is synchronous
Date Tue, 01 Aug 2017 23:05:07 GMT
Repository: kudu
Updated Branches:
  refs/heads/master 7e0a56b2a -> 443a3512e


raft_consensus-itest: don't assume SIGSTOP is synchronous

While looping raft_consensus-itest in slow mode 1000 times, I saw the
following failure once:

  I0801 03:05:02.165644 24426 raft_consensus-itest.cc:1682] Pausing 2 tablet servers in config of size 3
  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:1702: Failure
  Value of: s.IsTimedOut()
    Actual: false
  Expected: true
  OK
  /data/1/adar/kudu/src/kudu/integration-tests/raft_consensus-itest.cc:1814: Failure
  Expected: AssertMajorityRequiredForElectionsAndWrites(active_tablet_servers, leader_uuid) doesn't generate new fatal failures in the current thread.
    Actual: it does.

One explanation is: since the SIGSTOP sent by Pause() is delivered
asynchronously, it's possible for the write issued by the test (after
Pause()) to be handled before a majority of replicas are actually paused.
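
The retry idea behind the fix can be sketched in isolation. The helper
below is a hypothetical stand-in, not Kudu's ASSERT_EVENTUALLY macro: the
name RetryUntilTrue, its parameters, and its return convention are all
assumptions made for illustration. It re-runs a check until the check
passes or a deadline expires, which is the behavior the test needs while
the SIGSTOP has not yet taken effect.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (an assumption for illustration, not Kudu's API):
// re-runs `check` until it returns true or `timeout` elapses, sleeping
// `interval` between attempts. Returns true if the check passed in time.
bool RetryUntilTrue(const std::function<bool()>& check,
                    std::chrono::milliseconds timeout,
                    std::chrono::milliseconds interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    // The check always runs at least once, even with a zero timeout.
    if (check()) return true;
    if (std::chrono::steady_clock::now() >= deadline) return false;
    std::this_thread::sleep_for(interval);
  }
}
```

In the test's terms, `check` would issue the write and report whether it
timed out. An early attempt may still succeed because the replicas are not
yet stopped, and a later attempt observes the expected timeout.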

Change-Id: Id3202378a0e03a1bb29f32993498399e67b584d5
Reviewed-on: http://gerrit.cloudera.org:8080/7557
Reviewed-by: Todd Lipcon <todd@apache.org>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/443a3512
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/443a3512
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/443a3512

Branch: refs/heads/master
Commit: 443a3512eead1632b728e35bae1784888f6a5f6d
Parents: 7e0a56b
Author: Adar Dembo <adar@cloudera.com>
Authored: Tue Aug 1 15:14:32 2017 -0700
Committer: Adar Dembo <adar@cloudera.com>
Committed: Tue Aug 1 23:04:35 2017 +0000

----------------------------------------------------------------------
 src/kudu/integration-tests/raft_consensus-itest.cc | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/443a3512/src/kudu/integration-tests/raft_consensus-itest.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/raft_consensus-itest.cc b/src/kudu/integration-tests/raft_consensus-itest.cc
index 204f44e..65684df 100644
--- a/src/kudu/integration-tests/raft_consensus-itest.cc
+++ b/src/kudu/integration-tests/raft_consensus-itest.cc
@@ -1696,10 +1696,15 @@ void RaftConsensusITest::AssertMajorityRequiredForElectionsAndWrites(
     }
 
     // Ensure writes timeout while only a minority is alive.
-    Status s = WriteSimpleTestRow(initial_leader, tablet_id_, RowOperationsPB::UPDATE,
-                                  kTestRowKey, kTestRowIntVal, "foo",
-                                  MonoDelta::FromMilliseconds(100));
-    ASSERT_TRUE(s.IsTimedOut()) << s.ToString();
+    //
+    // The SIGSTOP issued by Pause() is delivered asynchronously, so we may
+    // need to retry this a few times to see the timeout.
+    ASSERT_EVENTUALLY([&]{
+      Status s = WriteSimpleTestRow(initial_leader, tablet_id_, RowOperationsPB::UPDATE,
+                                    kTestRowKey, kTestRowIntVal, "foo",
+                                    MonoDelta::FromMilliseconds(100));
+      ASSERT_TRUE(s.IsTimedOut()) << s.ToString();
+    });
 
     // Step down.
     ASSERT_OK(LeaderStepDown(initial_leader, tablet_id_, MonoDelta::FromSeconds(10)));
@@ -1707,7 +1712,7 @@ void RaftConsensusITest::AssertMajorityRequiredForElectionsAndWrites(
     // Assert that elections time out without a live majority.
     // We specify a very short timeout here to keep the tests fast.
     ASSERT_OK(StartElection(initial_leader, tablet_id_, MonoDelta::FromSeconds(10)));
-    s = WaitUntilLeader(initial_leader, tablet_id_, MonoDelta::FromMilliseconds(100));
+    Status s = WaitUntilLeader(initial_leader, tablet_id_, MonoDelta::FromMilliseconds(100));
     ASSERT_TRUE(s.IsTimedOut()) << s.ToString();
     LOG(INFO) << "Expected timeout encountered on election with weakened config: " << s.ToString();
 

