Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 1CB25200B91 for ; Thu, 29 Sep 2016 17:33:25 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 1B61C160AE3; Thu, 29 Sep 2016 15:33:25 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 5B407160AC1 for ; Thu, 29 Sep 2016 17:33:24 +0200 (CEST) Received: (qmail 19571 invoked by uid 500); 29 Sep 2016 15:33:23 -0000 Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list reviews@impala.incubator.apache.org Received: (qmail 19555 invoked by uid 99); 29 Sep 2016 15:33:22 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 29 Sep 2016 15:33:22 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 5F3341A0ABF for ; Thu, 29 Sep 2016 15:33:22 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.362 X-Spam-Level: X-Spam-Status: No, score=0.362 tagged_above=-999 required=6.31 tests=[RDNS_DYNAMIC=0.363, SPF_PASS=-0.001] autolearn=disabled Received: from mx2-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id Dv8l7xj-YJOk for ; Thu, 29 Sep 2016 15:33:20 +0000 (UTC) Received: from ip-10-146-233-104.ec2.internal (ec2-75-101-130-251.compute-1.amazonaws.com [75.101.130.251]) by mx2-lw-eu.apache.org (ASF Mail Server at mx2-lw-eu.apache.org) with ESMTPS id AED655FADE for ; Thu, 29 Sep 2016 15:33:19 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by ip-10-146-233-104.ec2.internal (8.14.4/8.14.4) with ESMTP id u8TFXI3L002508; Thu, 29 Sep 2016 15:33:18 GMT Message-Id: <201609291533.u8TFXI3L002508@ip-10-146-233-104.ec2.internal> Date: Thu, 29 Sep 2016 15:33:18 +0000 From: "Henry Robinson (Code Review)" To: Thomas Tauber-Marshall , impala-cr@cloudera.com, reviews@impala.incubator.apache.org CC: Matthew Jacobs , Juan Yu , Sailesh Mukil Reply-To: henry@cloudera.com X-Gerrit-MessageType: comment Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4135=3A_Thrift_threaded_server_times-out_connections_during_high_load=0A?= X-Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d X-Gerrit-ChangeURL: X-Gerrit-Commit: 83f1c25fe1a508b64f02ea7ffd06bdb9a0171c6f In-Reply-To: References: MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Content-Disposition: inline User-Agent: Gerrit/2.12.2 archived-at: Thu, 29 Sep 2016 15:33:25 -0000 Henry Robinson has posted comments on this change. Change subject: IMPALA-4135: Thrift threaded server times-out connections during high load ...................................................................... Patch Set 3: (8 comments) Some comments before I head off to Strata. I would run clang-format over the new thrift files. That way they'll be more readable to Impala developers, while keeping the structure and idioms of the thrift code. http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/TAcceptQueueServer.cpp File be/src/rpc/TAcceptQueueServer.cpp: PS3, Line 204: 1 this is an important enough parameter that I'd make a constant for it, and put your comments about only using thread on that. We don't want someone changing the number of threads without understanding the thread-safety implications. PS3, Line 216: acceptPool.Offer returned false. If this returns false it's because the queue is shut down. Better to say that clearly in the error message so users have an idea what this might mean. Also you can say that it's unexpected - we never shut down our servers so the queue should never be shut down. http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/TAcceptQueueServer.h File be/src/rpc/TAcceptQueueServer.h: Line 46: * connections will time out while in the OS accept queue. add a reference to IMPALA-4135 here. PS3, Line 105: std::mutex big_lock_; Is this used anywhere? http://gerrit.cloudera.org:8080/#/c/4519/3/be/src/rpc/thrift-server-test.cc File be/src/rpc/thrift-server-test.cc: Line 174: ThreadPool pool("group", "test", 256, 10000, [](int tid, const int64_t& item) { Add a comment about what you're testing and the associated JIRA. PS3, Line 178: ASSERT_OK(status); Does this work in a threaded context - does the test fail if the thread hits an ASSERT? PS3, Line 180: for (int i = 0; i < 1024 * 16; ++i) pool.Offer(i); I'm a bit concerned about failing with ulimit errors, particularly because both client and server should be in the same process. Have you ever seen this repro with fewer concurrent connections - say 8K? PS3, Line 173: TEST(ConcurrencyTest, ConnectTimeout) { : ThreadPool pool("group", "test", 256, 10000, [](int tid, const int64_t& item) { : using Client = ThriftClient; : Client* client = new Client("127.0.0.1", 22000, "", NULL, false); : Status status = client->Open(); : ASSERT_OK(status); : }); : for (int i = 0; i < 1024 * 16; ++i) pool.Offer(i); : pool.DrainAndShutdown(); : } This relies on a running impala internal service - you probably had an Impala process running when you ran this? That won't be true in our builds. Use the same server code that all the other tests use to start a server in this process. -- To view, visit http://gerrit.cloudera.org:8080/4519 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ie50e728974ef31a9d49132a0b3f7cde2a4f3356d Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Juan Yu Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Thomas Tauber-Marshall Gerrit-HasComments: Yes