Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 6A0B9200CAD for ; Tue, 6 Jun 2017 08:12:25 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 6939D160BD4; Tue, 6 Jun 2017 06:12:25 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id B24FE160BE4 for ; Tue, 6 Jun 2017 08:12:24 +0200 (CEST) Received: (qmail 61772 invoked by uid 500); 6 Jun 2017 06:12:23 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 61532 invoked by uid 99); 6 Jun 2017 06:12:23 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 06 Jun 2017 06:12:23 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 1C26CC0112 for ; Tue, 6 Jun 2017 06:12:23 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.363 X-Spam-Level: X-Spam-Status: No, score=0.363 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id FnuiWUOkxtdm for ; Tue, 6 Jun 2017 06:12:20 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 401965F5B3 for ; Tue, 6 Jun 2017 06:12:20 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id v566CIpg012318; Tue, 6 Jun 2017 06:12:18 GMT Message-Id: <201706060612.v566CIpg012318@ip-10-146-233-104.ec2.internal> Date: Tue, 6 Jun 2017 06:12:18 +0000 From: "Michael Ho (Code Review)" To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Henry Robinson , Sailesh Mukil , Juan Yu , Alan Choi , Tim Armstrong , Dan Hecht Reply-To: kwho@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-5388=3A_Only_retry_RPC_on_lost_connection_in_send_call=0A?= X-Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b X-Gerrit-ChangeURL: X-Gerrit-Commit: f68d9ef87df381bbe0ca61c05774213bb3b0bc2a In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.7 archived-at: Tue, 06 Jun 2017 06:12:25 -0000 Michael Ho has posted comments on this change. Change subject: IMPALA-5388: Only retry RPC on lost connection in send call ...................................................................... Patch Set 6: (11 comments) http://gerrit.cloudera.org:8080/#/c/7063/4/be/src/rpc/thrift-util.cc File be/src/rpc/thrift-util.cc: PS4, Line 183: strstr(e.what(), "EAGA > should we also retry for "send time out"? I mean the one throw by write(). See comments below. Line 184: } > if we upgrade thrift, won't we need to recheck these? how will we remember Added a DCHECK for the thrift version string. Line 188: "to match the current version of TSocket.cpp"; > That would also be accurate. I think both descriptions are correct. In that sense, we should also retry on the exception below in write() as Juan mentioned above since we know we haven't completely finished the write in this case: while (sent < len) { uint32_t b = write_partial(buf + sent, len - sent); if (b == 0) { // This should only happen if the timeout set with SO_SNDTIMEO expired. // Raise an exception. throw TTransportException(TTransportException::TIMED_OUT, "send timeout expired"); } sent += b; } http://gerrit.cloudera.org:8080/#/c/7063/4/be/src/runtime/client-cache.h File be/src/runtime/client-cache.h: PS4, Line 298: return strings::Substitute("Client $0 hits unexpected exception: $1 type= $2", : client_, e.what(), typeid(e).name()); : } : > nit: prefer Substitute() over stringstream Done http://gerrit.cloudera.org:8080/#/c/7063/4/be/src/testutil/fault-injection-util.cc File be/src/testutil/fault-injection-util.cc: PS4, Line 51: if > remove std:: Actually needed. PS4, Line 67: throw TTransportException(TTransportExcept > maybe DCHECK? Otherwise could be confusing if this silently fails to work a Done Line 77: "Called read on non-open socket"); > Isn't this branch supposed to be one level higher - when !is_send? Oops...Fixed. http://gerrit.cloudera.org:8080/#/c/7063/4/be/src/testutil/fault-injection-util.h File be/src/testutil/fault-injection-util.h: PS4, Line 48: inject > grammar ("injects delays" maybe?) Done PS4, Line 55: inject > injects an exception in a specified Done http://gerrit.cloudera.org:8080/#/c/7063/4/tests/custom_cluster/test_rpc_exception.py File tests/custom_cluster/test_rpc_exception.py: PS4, Line 27: \ > not needed It's actually needed in python for line continuation. Did I misunderstand your comment ? PS4, Line 33: e > indent python two spaces, not four Done -- To view, visit http://gerrit.cloudera.org:8080/7063 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I176975f2aa521d5be8a40de51067b1497923d09b Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho Gerrit-Reviewer: Alan Choi Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes