Subject: Re: Review Request 34821: Occasional database deadlock detected when provisioning cluster via blueprint api
From: "John Speidel"
To: "Sumit Mohanty", "Sid Wagle", "Robert Nettleton", "Tom Beerbower"
Cc: "John Speidel", "Ambari"
Date: Fri, 29 May 2015 20:06:27 -0000
Message-ID: <20150529200627.8664.79231@reviews.apache.org>
In-Reply-To: <20150529190048.8664.37577@reviews.apache.org>

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34821/
-----------------------------------------------------------

(Updated May 29, 2015, 8:06 p.m.)


Review request for Ambari, Robert Nettleton, Sumit Mohanty, Sid Wagle, and Tom Beerbower.


Bugs: AMBARI-11542
    https://issues.apache.org/jira/browse/AMBARI-11542


Repository: ambari


Description
-------

When provisioning a cluster via the blueprint api, a database deadlock is occasionally detected. There is retry logic around this code, so the deadlock doesn't affect the creation of the cluster, and a user wouldn't notice it unless they looked at the logs. That being said, this issue involves incorrect transaction demarcation and synchronization and is potentially serious, depending on how it manifests.

The fix involves changing the scope of the database transaction as well as the synchronization. There are currently many transaction/synchronization issues in the state layer that need to be addressed; this patch deals only with this exact use case.

Also, this patch deals strictly with correctness, and I didn't make an effort to optimize this path. If it results in a performance regression, there are several approaches that we could take.


Diffs
-----

  ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 792b6fe

Diff: https://reviews.apache.org/r/34821/diff/


Testing (updated)
-------

Provisioned clusters many times, looking for a reported database deadlock. Without this patch, I was able to reproduce the deadlock fairly consistently; with the patch, no deadlock occurred across many installs.

Unit Tests:
- tx/synchronization change only, so no new unit test
- currently running full unit test suite and will update review with result summary when completed

Results :

Tests run: 3020, Failures: 0, Errors: 0, Skipped: 21

...

Total run:744
Total errors:0
Total failures:0


Thanks,

John Speidel
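
Illustration (not part of the diff)
-------

The description above speaks of changing the scope of the database transaction together with the synchronization. One common way to make the two agree is to take the in-memory lock before the transaction begins and to release it only after commit or rollback, so that the lock fully encloses the transaction. The sketch below shows only that ordering. It is an assumption about the general pattern, not the code in AmbariManagementControllerImpl; TxScopeSketch, STATE_LOCK, beginTx, commitTx, rollbackTx and runSerialized are hypothetical names, and the transaction calls are stand-ins that merely record events.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

class TxScopeSketch {
    // Hypothetical in-memory lock guarding shared state (not an Ambari class).
    private static final ReentrantLock STATE_LOCK = new ReentrantLock();

    // Records the order of events so the lock/transaction nesting can be inspected.
    public static final List<String> EVENTS =
            Collections.synchronizedList(new ArrayList<>());

    // Hypothetical stand-ins for a real transaction API; they only log.
    static void beginTx()    { EVENTS.add("begin"); }
    static void commitTx()   { EVENTS.add("commit"); }
    static void rollbackTx() { EVENTS.add("rollback"); }

    /** Runs the work with the lock held for the whole lifetime of the transaction. */
    public static void runSerialized(Runnable work) {
        STATE_LOCK.lock();                 // 1. take the in-memory lock first
        try {
            EVENTS.add("lock");
            beginTx();                     // 2. open the transaction under the lock
            try {
                work.run();
                commitTx();                // 3. commit while still holding the lock
            } catch (RuntimeException e) {
                rollbackTx();              //    or roll back, also under the lock
                throw e;
            }
        } finally {
            EVENTS.add("unlock");
            STATE_LOCK.unlock();           // 4. release only after commit/rollback
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Two concurrent callers; their transactions cannot interleave.
        Thread a = new Thread(() -> runSerialized(() -> EVENTS.add("work")));
        Thread b = new Thread(() -> runSerialized(() -> EVENTS.add("work")));
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(EVENTS);
    }
}
```

Because every caller takes the lock before touching the database, two callers of runSerialized can never hold database locks at the same time while waiting on each other. The cost is that the work is fully serialized, which matches the remark above that this path was not optimized.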