Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 69DDE200C8E for ; Thu, 8 Jun 2017 19:23:14 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 686DA160BD5; Thu, 8 Jun 2017 17:23:14 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id ACE76160BC3 for ; Thu, 8 Jun 2017 19:23:13 +0200 (CEST) Received: (qmail 6163 invoked by uid 500); 8 Jun 2017 17:23:12 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 6147 invoked by uid 99); 8 Jun 2017 17:23:12 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 08 Jun 2017 17:23:12 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 29706180688 for ; Thu, 8 Jun 2017 17:23:12 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.363 X-Spam-Level: X-Spam-Status: No, score=0.363 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id qrrZgYEI9bp7 for ; Thu, 8 Jun 2017 17:23:11 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 4AD995FB71 for ; Thu, 8 Jun 2017 17:23:10 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v58HN9FC011979; Thu, 8 Jun 2017 17:23:09 GMT Date: Thu, 8 Jun 2017 17:23:09 +0000 From: "Sailesh Mukil (Code Review)" To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org Message-ID: Reply-To: sailesh@cloudera.com X-Gerrit-MessageType: newchange Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-5221=3A_Fix_TSaslTransport_negotiation_order=0A?= X-Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80 X-Gerrit-ChangeURL: X-Gerrit-Commit: 250e8af46e46df3422d9ec73c7d46abba54b52b1 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Thu, 08 Jun 2017 17:23:14 -0000 Sailesh Mukil has uploaded a new change for review. http://gerrit.cloudera.org:8080/7116 Change subject: IMPALA-5221: Fix TSaslTransport negotiation order ...................................................................... IMPALA-5221: Fix TSaslTransport negotiation order The TSaslTransport is written as a thrift extension that is a wrapper around the Cyrus-SASL APIs. This transport is then used by Impala's RPC layer. On RHEL7 systems that use newer versions of the Cyrus-SASL library, we noticed that we sometimes crash inside the Cyrus-SASL thirdparty while trying to lock an internal mutex. During my investigation, I found that we needed to fix the order of negotiation that happens in an edge case. The steps to use the Cyrus-SASL APIs for SASL negitiation are the following (Replace '_client_' with '_server_' for server calls): sasl_client_new() sasl_client_start() sasl_client_step() sasl_client_dispose() < --- When we're done with the connection. sasl_client_new() was being called in the constructor TSaslClient() which is invoked from SaslAuthProvider::WrapClientTransport(). sasl_client_start() and sasl_client_step() were being called under TSaslTransport::open(). If for some reason we hit an error during SASL negotiation, the TSaslTransport::open() call would fail. When we fail, the ThriftClientImpl::OpenWithRetry() does a retry, which directly retries the negotiation from sasl_client_start(). This caused the use of already freed resources from the first negotiation failure, hence causing the crash. To fix this, we make sure that on a negotiation failure, we dispose of all the resources properly by calling sasl_dispose() and retry the negotiation from the start by calling sasl_client_new() first, and then the remaining steps. This is done by moving the sasl_client_new() and sasl_server_new() calls out of the TSaslClient/TSaslServer constructors and into a new call called TSasl*::setupNegotiation(), which is called under TSaslTransport::open(). The patch is fairly large for the above mentioned change, however, most of it is just plumbing. Testing: Tested on systems with older SASL versions to make sure we don't regress. Also tested on systems with newer SASL versions where we previously saw the crash and verified that we don't see them anymore. Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80 --- M be/src/transport/TSasl.cpp M be/src/transport/TSasl.h M be/src/transport/TSaslClientTransport.cpp M be/src/transport/TSaslClientTransport.h M be/src/transport/TSaslServerTransport.cpp M be/src/transport/TSaslServerTransport.h M be/src/transport/TSaslTransport.cpp M be/src/transport/TSaslTransport.h 8 files changed, 222 insertions(+), 78 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/7116/1 -- To view, visit http://gerrit.cloudera.org:8080/7116 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I7d509e14dfe46fa28d0626cf84daf6de82955d80 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Sailesh Mukil