Subject: Re: Cluster session replication performance
To: users@tomcat.apache.org
From: Mitch Claborn
Date: Mon, 10 Sep 2018 11:33:26 -0500
Message-ID: <36decbfb-9b5f-b783-c1d2-f79c6016bc19@claborn.net>
In-Reply-To: <45c45065-4a5a-ed1f-7d6b-d42ee9a55a5d@claborn.net>

Further information and questions.

I created my own interceptor based on ThroughputInterceptor so that I
could log the timing of specific sessions and correlate them with the
failures in my health check program. I was surprised to find that, in
the instances where the health check reported a failure, the interceptor
reported that the session send completed in < 5 ms, while the health
check app waits a full 1000 ms between calls to the different Tomcat
instances.

So now I'm more confused than ever. Does anyone have any ideas?

In a ChannelInterceptor, when getNext().sendMessage(destination, msg,
payload) returns, does that mean that the message has been sent AND
received by the recipient member, or does it only indicate a send?

Mitch

On 09/06/2018 01:53 PM, Mitch Claborn wrote:
> I'm using a cluster with the DeltaManager between two servers on Tomcat
> 9.0.11. I've set channelSendOptions="8" (asynchronous session replication).
>
> I have a "health check" app that I run periodically, one of the
> functions being to check that sessions are being replicated properly.
> That app:
> 1) Does a GET to tomcat A, calling a Struts action that creates a
> session and stores a known value in it
> 2) Waits 2 seconds
> 3) Uses the session ID cookie from step 1 and makes a call to tomcat B,
> to an action that retrieves that value from the session
> 4) Compares the two values from the session to make sure that they are
> the same.
>
> Most of the time this check works fine, but occasionally the call to the
> second server will find that the session does not exist on that server,
> presumably because it has not replicated there yet. 2 seconds seems
> a long time for a session to replicate, especially one as small as this
> one is. If I decrease the amount of wait time at step 2, the failure
> rate increases.
>
> I turned on the ThroughputInterceptor and have the following observations.
> - Server A has a transmit throughput around 10 MB/sec while B has only
> around 3 MB/sec. This might be accounted for by the fact that B was the
> last server to start, so A would have (I think) transmitted all of the
> sessions at once when B started up, so it might get good throughput from
> the big send?
>
> Questions:
> 1. Is 2 seconds a long time to replicate a session?
> 2. Other than actual network slowness, are there internal issues that
> could cause the replication to be slow?
> 3. If so, is there any way to diagnose those?
> 4. I'm thinking about writing my own version of ThroughputInterceptor
> that will give more information on specific messages and timings. Has
> anyone tried that? In that interceptor can I access the session ID? That
> would help me correlate timings between my failure reports and the
> interceptor.
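
The health check quoted above waits a fixed interval at step 2 and then makes a single call to tomcat B. One possible variation, offered only as a sketch, is to poll B until the session shows up and record how long that took, so that each run yields a latency figure to compare against the interceptor's timings. The names ReplicationProbe and awaitVisible are hypothetical, and the BooleanSupplier stands in for the real step 3 request (the call to B with the session cookie from step 1), which is not shown here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Tomcat or of the health check described above.
final class ReplicationProbe {

    /**
     * Polls the supplied check until it passes or the timeout expires.
     * Returns the elapsed milliseconds at the first success, or -1 if the
     * check never passed within timeoutMs (or the thread was interrupted).
     */
    static long awaitVisible(BooleanSupplier sessionVisibleOnB, long timeoutMs, long intervalMs) {
        final long start = System.nanoTime();
        final long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            // In a real check this would be the GET to tomcat B with the session cookie.
            if (sessionVisibleOnB.getAsBoolean()) {
                return (System.nanoTime() - start) / 1_000_000L;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return -1;
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return -1;
            }
        }
    }

    public static void main(String[] args) {
        // Simulated replication: the "session" becomes visible about 30 ms after start.
        final long visibleAt = System.nanoTime() + 30_000_000L;
        long elapsed = awaitVisible(() -> System.nanoTime() >= visibleAt, 2000, 5);
        System.out.println(elapsed < 0
                ? "session never became visible"
                : "session visible after about " + elapsed + " ms");
    }
}
```

Logging the returned value per session ID would show whether the failures are slow replications that eventually arrive or sessions that never reach B at all.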