From: therealneworld@gmail.com
Date: Fri, 24 Jun 2016 19:51:46 -0400
Subject: Re: Bug that spans tomcat and tomcat-native
To: Tomcat Developers List

On Fri, Jun 24, 2016 at 5:31 PM, Mark Thomas wrote:
> On 24/06/2016 21:52, therealneworld@gmail.com wrote:
>
>>> I'm wondering if this is a problem that builds up over time. If I
>>> understood your previous posts correctly, running the big tests
>>> immediately gave ~700MB/s, whereas running the small tests and then the
>>> big tests resulted in ~350MB/s during the big tests. Are you able to
>>> experiment with this a little bit? For example, if you do big tests, 1M
>>> (~20%) small tests, big tests, 1M small tests, big tests etc., what is
>>> the data rate for the big tests after 0, 1M, 2M, 3M, 4M and 5M little
>>> tests?
>>
>> Sure, I can try that. For the in-between tests, do you want me to run
>> those for a set amount of time or a set number of files? E.g. each
>> smaller batch for 20 min, then 10 min of large, then the next smaller
>> size?
>
> I was thinking a set number of files.
>
> It would also be useful to know how many threads the executor has created
> at each point. (JMX should tell you that. You might need to adjust the
> executor so it doesn't stop idle threads.)

I saw your message about not stopping idle threads after I had already
started things.
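For anyone wanting to reproduce the pool-size readings below without going
through JMX: a plain java.util.concurrent ThreadPoolExecutor exposes the
same two counters directly, and allowCoreThreadTimeOut() is the knob for
keeping idle threads alive between test runs. This is a minimal sketch with
made-up pool sizes, not Tomcat's actual Executor MBean (whose object name
and attribute names differ):

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolStats {
    public static void main(String[] args) throws Exception {
        // Hypothetical sizing: 25 core threads, up to 500, 60s idle timeout.
        ThreadPoolExecutor executor = new ThreadPoolExecutor(
                25, 500, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());

        // Keep idle core threads alive so poolSize doesn't shrink
        // between one test run and the next.
        executor.allowCoreThreadTimeOut(false);

        for (int i = 0; i < 100; i++) {
            executor.submit(() -> { /* simulate one request */ });
        }

        // The two values quoted in the test results below.
        System.out.println("largestPoolSize: " + executor.getLargestPoolSize());
        System.out.println("poolSize: " + executor.getPoolSize());

        executor.shutdown();
        executor.awaitTermination(10, TimeUnit.SECONDS);
    }
}
```

With a core size of 25 and an unbounded queue, the pool grows to exactly
25 threads and stays there, which matches how the first 100M run below
reports 25/25 before the 4k tests force the pool to grow.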
1st 100M test: 851348MB/s   Executor: largestPoolSize: 25,  poolSize: 25
1st 4k test:                Executor: largestPoolSize: 401, poolSize: 401
2nd 100M test: 460147MB/s   Executor: largestPoolSize: 401, poolSize: 79
2nd 4k test:                Executor: largestPoolSize: 414, poolSize: 414
3rd 100M test: 429127MB/s   Executor: largestPoolSize: 414, poolSize: 80
3rd 4k test:                Executor: largestPoolSize: 414, poolSize: 401
4th 100M test: 437918MB/s   Executor: largestPoolSize: 414, poolSize: 86
4th 4k test:                Executor: largestPoolSize: 414, poolSize: 401
5th 100M test: 464837MB/s   Executor: largestPoolSize: 414, poolSize: 87

It looks like the problem occurs right after the first set of 4k puts and
doesn't get any worse, so whatever causes the issue happens early. This is
getting stranger, and I really cannot explain why calling
ERR_remove_thread_state reliably improves performance.

> Going back to your original description, you said you saw blocking
> during the call to ERR_clear_err(). Did you mean ERR_clear_error()?
> Either way, could you provide the full stack trace of an example blocked
> thread? And, ideally, the stack trace of the thread currently holding
> the lock? I'm still trying to understand what is going on here since,
> based on my understanding of the code so far, the HashMap is bounded (to
> the number of threads) and should reach that limit fairly quickly.

Sorry, yes, I did mean ERR_clear_error().
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f0f61ab989d in __GI___pthread_mutex_lock (mutex=0x2b49c58) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f0f3205f183 in int_thread_get (create=0) at err.c:446
#3  0x00007f0f3205f68d in int_thread_get_item (d=0x7f0ca89c7ce0) at err.c:491
#4  0x00007f0f32060094 in ERR_get_state () at err.c:1014
#5  0x00007f0f320602cf in ERR_clear_error () at err.c:747
#6  0x00007f0f325f3579 in ssl_socket_recv (sock=0x7f0dcc391980, buf=0x7f0eec067820 "lock->199808-Source_filename->rhino_perf_https_lt_100g_a-Loop->1-Count->11089487-11089488-11089489-11089490-11089491-11089492-11089493-11089494-11089495-11089496-11089497-11089498-11089499-11089500-11"..., len=0x7f0ca89c7ff0) at src/sslnetwork.c:401
#7  0x00007f0f325ece99 in Java_org_apache_tomcat_jni_Socket_recvbb (e=, o=, sock=, offset=, len=) at src/network.c:957

I tried getting more data, but the JVM tends to dump core when gdb is
attached, or runs too slowly to actually cause the lock contention. I can
reliably see a thread waiting on this lock if I attach to a single thread,
randomly interrupt it, and look at the backtrace. When I look at the mutex,
it has a different owner each time, so different threads are getting the
lock. I will play with this a bit more on Monday.

-nate

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org