From: todd@apache.org
To: commits@kudu.apache.org
Reply-To: dev@kudu.apache.org
Date: Tue, 11 Oct 2016 16:00:00 -0000
Message-Id: <95066a3967544452b6c9a56af0677399@git.apache.org>
Subject: [2/3] kudu git commit: [flaky tests] Address SIGSEGV on ResultTracker while running alter_table_randomized-test

[flaky tests] Address SIGSEGV on ResultTracker while running alter_table_randomized-test

This addresses a flakiness in the
alter_table_randomized-test whereby calling ResultTracker::FailAndRespond()
on a transaction that was never tracked (a legal and possible thing for
follower transactions) would cause a SIGSEGV.

The SIGSEGV is caused because we try to dereference the CompletionRecord* to
track its memory, even though we already know it doesn't exist.

This is hard to reproduce in either of the exactly once tests, but the change
is pretty obvious. I also added some comments on how that can happen; it took
me a while to find the path that could cause this (again).

Change-Id: I961af334fa2dd7faff0e95c7a49f2f16b2096fe0
Reviewed-on: http://gerrit.cloudera.org:8080/4629
Reviewed-by: Dan Burkert
Tested-by: David Ribeiro Alves
Reviewed-by: Todd Lipcon

Project: http://git-wip-us.apache.org/repos/asf/kudu/repo
Commit: http://git-wip-us.apache.org/repos/asf/kudu/commit/8f853d7a
Tree: http://git-wip-us.apache.org/repos/asf/kudu/tree/8f853d7a
Diff: http://git-wip-us.apache.org/repos/asf/kudu/diff/8f853d7a

Branch: refs/heads/master
Commit: 8f853d7a3ac56ac884c67b37744d0eeb237c69ef
Parents: 5962de1
Author: David Alves
Authored: Wed Oct 5 00:00:50 2016 -0700
Committer: Todd Lipcon
Committed: Tue Oct 11 05:16:55 2016 +0000

----------------------------------------------------------------------
 src/kudu/rpc/result_tracker.cc | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/kudu/blob/8f853d7a/src/kudu/rpc/result_tracker.cc
----------------------------------------------------------------------
diff --git a/src/kudu/rpc/result_tracker.cc b/src/kudu/rpc/result_tracker.cc
index 259d12a..2638492 100644
--- a/src/kudu/rpc/result_tracker.cc
+++ b/src/kudu/rpc/result_tracker.cc
@@ -347,12 +347,16 @@ void ResultTracker::FailAndRespondInternal(const RequestIdPB& request_id,
   }
   CompletionRecord* completion_record = state_and_record.second;
-  ScopedMemTrackerUpdater cr_updater(mem_tracker_.get(), completion_record);
+  // It is possible for this method to be called for an RPC that was never actually tracked (though
+  // RecordCompletionAndRespond() can't). One such case is when a follower transaction fails
+  // on the TransactionManager, for some reason, before it was tracked. The CompletionCallback still
+  // calls this method. In this case, do nothing.
   if (completion_record == nullptr) {
     return;
   }
+  ScopedMemTrackerUpdater cr_updater(mem_tracker_.get(), completion_record);
   completion_record->last_updated = MonoTime::Now();
   int64_t seq_no = request_id.seq_no();
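
The ordering issue fixed by this patch can be illustrated outside of Kudu with a minimal, standalone sketch. Everything below is hypothetical and only mirrors the shape of the problem: `Record`, `ScopedFootprintUpdater` and `Tracker` are illustrative names, not Kudu's `CompletionRecord`, `ScopedMemTrackerUpdater` or `ResultTracker`, and the real classes may behave differently. The point it tries to show is that an RAII guard which reads through a pointer in its constructor must only be created after the null check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a tracked record whose memory use can change.
struct Record {
  std::string response;
  std::size_t footprint() const { return sizeof(Record) + response.capacity(); }
};

// Hypothetical RAII guard: samples the record's footprint on construction and
// applies the difference to a byte counter on destruction. It dereferences
// the record at both points, so the record must not be null.
class ScopedFootprintUpdater {
 public:
  ScopedFootprintUpdater(std::int64_t* counter, const Record* record)
      : counter_(counter), record_(record), before_(0) {
    assert(record != nullptr);  // Precondition: callers check for null first.
    before_ = record_->footprint();
  }
  ~ScopedFootprintUpdater() {
    *counter_ += static_cast<std::int64_t>(record_->footprint()) -
                 static_cast<std::int64_t>(before_);
  }
  ScopedFootprintUpdater(const ScopedFootprintUpdater&) = delete;
  ScopedFootprintUpdater& operator=(const ScopedFootprintUpdater&) = delete;

 private:
  std::int64_t* counter_;
  const Record* record_;
  std::size_t before_;
};

// Hypothetical tracker keyed by sequence number.
class Tracker {
 public:
  void Track(std::int64_t seq_no) {
    auto result = records_.emplace(seq_no, Record{});
    if (result.second) {
      bytes_ += static_cast<std::int64_t>(result.first->second.footprint());
    }
  }

  // Returns false and does nothing if the request was never tracked.
  bool Fail(std::int64_t seq_no, const std::string& error) {
    auto it = records_.find(seq_no);
    Record* record = (it == records_.end()) ? nullptr : &it->second;
    // Check before building the guard: constructing it on a null record
    // would dereference a null pointer.
    if (record == nullptr) {
      return false;
    }
    ScopedFootprintUpdater updater(&bytes_, record);
    record->response = error;
    return true;
  }

  std::int64_t bytes() const { return bytes_; }

 private:
  std::unordered_map<std::int64_t, Record> records_;
  std::int64_t bytes_ = 0;
};
```

In this sketch, calling `Fail()` for an untracked sequence number is a harmless no-op, while moving the `ScopedFootprintUpdater` line above the null check would reintroduce the same kind of crash the commit message describes.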