kudu-commits mailing list archives

From mpe...@apache.org
Subject kudu git commit: tablet_copy-itest: Reduce flakiness
Date Sat, 29 Apr 2017 19:38:22 GMT
Repository: kudu
Updated Branches:
  refs/heads/master 2a1aeb642 -> 948770425


tablet_copy-itest: Reduce flakiness

This patch reduces the flakiness of
TabletCopyITest.TestDisableTabletCopy_NoTightLoopWhenTabletDeleted under
heavy CPU load from 37/100 failures [1] to 0/100 failures [2] on dist-test
by raising the number of log messages per second that the test tolerates
from 30 to 60.
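
The test's check is a rate comparison. Along with the new limit, the diff
in this commit changes num_logs_per_second from int64_t to double, so the
fractional part of the rate is no longer dropped before the comparison. A
minimal sketch of that arithmetic follows; the helper names are
hypothetical and are not part of the Kudu test:

```cpp
#include <cstdint>

// Hypothetical helper: log messages per second over a measurement window.
double LogsPerSecond(int64_t logs_initial, int64_t logs_after,
                     double elapsed_seconds) {
  return static_cast<double>(logs_after - logs_initial) / elapsed_seconds;
}

// Hypothetical helper: the same rate stored in an integer, which drops
// the fractional part (as an int64_t variable would).
int64_t TruncatedLogsPerSecond(int64_t logs_initial, int64_t logs_after,
                               double elapsed_seconds) {
  return static_cast<int64_t>(
      LogsPerSecond(logs_initial, logs_after, elapsed_seconds));
}

// Hypothetical helper: true when the observed rate is under the limit.
bool WithinLogBudget(double rate, double limit) {
  return rate < limit;
}
```

Under these assumptions, 450 messages in 10 seconds (45 per second) would
exceed a limit of 30 but stay under a limit of 60.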

This patch also fixes a minor bug where we attempt to send a tablet copy
request with invalid parameters even if the PrepareTabletCopyRequest()
method returns an error, indicating that it wasn't able to fill in the
request. This resulted in warning messages in the logs that looked like
the following:

[libprotobuf ERROR /home/mpercy/src/kudu/thirdparty/src/protobuf-2.6.1/src/google/protobuf/message_lite.cc:123]
Can't parse message of type "kudu.consensus.StartTabletCopyRequestPB" because it is missing
required fields: tablet_id, copy_peer_uuid, copy_peer_addr
W0428 19:46:21.691408 10560 service_if.cc:62] invalid parameter for call kudu.consensus.ConsensusService.StartTabletCopy:
missing fields: tablet_id, copy_peer_uuid, copy_peer_addr
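
The control-flow change behind this fix can be illustrated with a small
stand-alone sketch. FakePeer and FakeRequest are hypothetical stand-ins,
not Kudu's Peer class or StartTabletCopyRequestPB; the sketch only
contrasts "log, then send anyway" with "send only when preparation
succeeded":

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the request message.
struct FakeRequest {
  std::string tablet_id;
};

// Hypothetical stand-in for a consensus peer.
class FakePeer {
 public:
  explicit FakePeer(bool prepare_succeeds)
      : prepare_succeeds_(prepare_succeeds) {}

  // Old flow: warns when preparation fails, then sends the request anyway.
  void SendNextRequestOld() {
    FakeRequest req;
    if (!Prepare(&req)) {
      warnings_.push_back("Unable to generate Tablet Copy request");
    }
    Send(req);
  }

  // New flow: sends only when preparation succeeded, otherwise just warns.
  void SendNextRequestNew() {
    FakeRequest req;
    if (Prepare(&req)) {
      Send(req);
    } else {
      warnings_.push_back("Unable to generate Tablet Copy request");
    }
  }

  int sends() const { return sends_; }
  int invalid_sends() const { return invalid_sends_; }
  int num_warnings() const { return static_cast<int>(warnings_.size()); }

 private:
  // Fills in the request; returns false and leaves it empty on failure.
  bool Prepare(FakeRequest* req) {
    if (!prepare_succeeds_) return false;
    req->tablet_id = "tablet-1";
    return true;
  }

  // Counts every send, and separately counts sends of unfilled requests.
  void Send(const FakeRequest& req) {
    ++sends_;
    if (req.tablet_id.empty()) ++invalid_sends_;
  }

  bool prepare_succeeds_;
  int sends_ = 0;
  int invalid_sends_ = 0;
  std::vector<std::string> warnings_;
};
```

In this sketch, a peer whose preparation fails sends one unfilled request
under the old flow and none under the new flow.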

[1] http://dist-test.cloudera.org/job?job_id=mpercy.1493408511.5800
[2] http://dist-test.cloudera.org/job?job_id=mpercy.1493408241.3403

Change-Id: Idd10f2fefb67634031f5c08e2adddc695193afb7
Reviewed-on: http://gerrit.cloudera.org:8080/6764
Reviewed-by: Adar Dembo <adar@cloudera.com>
Tested-by: Kudu Jenkins


Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/94877042
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/94877042
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/94877042

Branch: refs/heads/master
Commit: 948770425bc66a0a86d34b129c7d5fe1fdf86eda
Parents: 2a1aeb6
Author: Mike Percy <mpercy@apache.org>
Authored: Fri Apr 28 10:23:47 2017 -0700
Committer: Mike Percy <mpercy@apache.org>
Committed: Sat Apr 29 19:38:04 2017 +0000

----------------------------------------------------------------------
 src/kudu/consensus/consensus_peers.cc           | 22 ++++++++++----------
 src/kudu/consensus/consensus_queue.cc           |  5 +++--
 src/kudu/integration-tests/tablet_copy-itest.cc |  4 ++--
 3 files changed, 16 insertions(+), 15 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/kudu/blob/94877042/src/kudu/consensus/consensus_peers.cc
----------------------------------------------------------------------
diff --git a/src/kudu/consensus/consensus_peers.cc b/src/kudu/consensus/consensus_peers.cc
index ce69ee6..26bbe06 100644
--- a/src/kudu/consensus/consensus_peers.cc
+++ b/src/kudu/consensus/consensus_peers.cc
@@ -189,20 +189,20 @@ void Peer::SendNextRequest(bool even_if_queue_empty) {
 
   if (PREDICT_FALSE(needs_tablet_copy)) {
     Status s = PrepareTabletCopyRequest();
-    if (!s.ok()) {
+    if (s.ok()) {
+      controller_.Reset();
+      request_pending_ = true;
+      l.unlock();
+      // Capture a shared_ptr reference into the RPC callback so that we're guaranteed
+      // that this object outlives the RPC.
+      proxy_->StartTabletCopy(&tc_request_, &tc_response_, &controller_,
+                              [s_this = shared_from_this()]() {
+                                s_this->ProcessTabletCopyResponse();
+                              });
+    } else {
       LOG_WITH_PREFIX_UNLOCKED(WARNING) << "Unable to generate Tablet Copy request for peer: "
                                         << s.ToString();
     }
-
-    controller_.Reset();
-    request_pending_ = true;
-    l.unlock();
-    // Capture a shared_ptr reference into the RPC callback so that we're guaranteed
-    // that this object outlives the RPC.
-    proxy_->StartTabletCopy(&tc_request_, &tc_response_, &controller_,
-                            [s_this = shared_from_this()]() {
-                              s_this->ProcessTabletCopyResponse();
-                            });
     return;
   }
 

http://git-wip-us.apache.org/repos/asf/kudu/blob/94877042/src/kudu/consensus/consensus_queue.cc
----------------------------------------------------------------------
diff --git a/src/kudu/consensus/consensus_queue.cc b/src/kudu/consensus/consensus_queue.cc
index 5f54b88..e752eb2 100644
--- a/src/kudu/consensus/consensus_queue.cc
+++ b/src/kudu/consensus/consensus_queue.cc
@@ -489,7 +489,7 @@ Status PeerMessageQueue::RequestForPeer(const string& uuid,
 }
 
 Status PeerMessageQueue::GetTabletCopyRequestForPeer(const string& uuid,
-                                                          StartTabletCopyRequestPB* req) {
+                                                     StartTabletCopyRequestPB* req) {
   TrackedPeer* peer = nullptr;
   int64_t current_term;
   {
@@ -645,7 +645,8 @@ void PeerMessageQueue::ResponseFromPeer(const std::string& peer_uuid,
 
       peer->needs_tablet_copy = true;
       VLOG_WITH_PREFIX_UNLOCKED(1) << "Marked peer as needing tablet copy: "
-                                     << peer->ToString();
+                                   << peer->ToString();
+
       *more_pending = true;
       return;
     }

http://git-wip-us.apache.org/repos/asf/kudu/blob/94877042/src/kudu/integration-tests/tablet_copy-itest.cc
----------------------------------------------------------------------
diff --git a/src/kudu/integration-tests/tablet_copy-itest.cc b/src/kudu/integration-tests/tablet_copy-itest.cc
index 41980db..9a6f51d 100644
--- a/src/kudu/integration-tests/tablet_copy-itest.cc
+++ b/src/kudu/integration-tests/tablet_copy-itest.cc
@@ -725,9 +725,9 @@ TEST_F(TabletCopyITest, TestDisableTabletCopy_NoTightLoopWhenTabletDeleted) {
   int64_t update_rpcs_per_second =
       (num_update_rpcs_after_sleep - num_update_rpcs_initial) / elapsed.ToSeconds();
   EXPECT_LT(update_rpcs_per_second, 20);
-  int64_t num_logs_per_second =
+  double num_logs_per_second =
       (num_logs_after_sleep - num_logs_initial) / elapsed.ToSeconds();
-  EXPECT_LT(num_logs_per_second, 30); // We might occasionally get unrelated log messages.
+  EXPECT_LT(num_logs_per_second, 60); // We might occasionally get unrelated log messages.
 }
 
 // Test that if a Tablet Copy is taking a long time but the client peer is still responsive,

