Date: Wed, 11 May 2016 10:22:38 -0400
Subject: Re: Test failure on master?
From: Will Stevens
To: "dev@cloudstack.apache.org", Simon Weller

I can't pin this on the PR with certainty, but I have noticed an increase in random failures in my CI runs since this code went in. I have not tracked the failures down to this change, but resources are tight in those environments, so if the CPU is maxing out, that could account for the higher failure rates in my CI environments...

*Will STEVENS*
Lead Developer

*CloudOps* *| *Cloud Solutions Experts
420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
w cloudops.com *|* tw @CloudOps_

On Wed, May 11, 2016 at 10:17 AM, Will Stevens wrote:

> Rohit, I have seen quite a few issues with this feature so far. The
> change you made in #1538 does not change the actual code at all; it just
> reduces the number of tests, so you are less likely to run into the
> problem, but the problem still exists.
>
> I am CCing in Simon Weller as well. I was talking to him this morning and
> he had this to say (unprompted).
>
>> Will, We're still seeing odd issues with that NIO SSL concurrency patch
>> (1493), even after pulling in the additional PR 1534. The latest problem
>> we've seen is 100% cpu on the agents for no apparent reason. I reverted
>> both patches from our QA lab this morning and the problem has gone away.
>>
>> I pulled it into a second lab where we have haproxy setup to load balance
>> and the same behaviour occurs
>>
>> top - 08:18:15 up 1 day, 17:08, 5 users, load average: 1.92, 2.22, 2.09
>> Tasks: 223 total, 1 running, 222 sleeping, 0 stopped, 0 zombie
>> %Cpu(s): 22.2 us, 11.9 sy, 0.0 ni, 65.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
>> KiB Mem : 32673608 total, 28312176 free, 3512104 used, 849328 buff/cache
>> KiB Swap: 4194300 total, 4194300 free, 0 used. 28757568 avail Mem
>>
>>   PID USER PR NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
>> 17985 root 20  0 6937720 162816 22196 S 100.3  0.5  3:24.84 /usr/lib/jvm/jre/bin/java -Xms256m -Xmx2048m -cp /usr/share/cloudstack-agent/lib/activatio+
>> 15587 root 20  0 1733288 375976 12164 S 100.0  1.2 10:42.36 /usr/libexec/qemu-kvm -name v-46-VM -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 1+
>>  4480 root 20  0  909604 305292 12264 S   0.7  0.9  1:10.21 /usr/libexec/qemu-kvm -name r-44-VM -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 2+
>>  5188 root 20  0  957548 323420 12216 S   0.7  1.0  1:07.35 /usr/libexec/qemu-kvm -name r-45-VM -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 2+
>> 18336 root 20  0  157840   2392  1556 R   0.7  0.0  0:00.14 top
>> 19023 root 20  0 1002156 449720 12372 S   0.7  1.4 10:57.69 /usr/libexec/qemu-kvm -name r-32-VM -S -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -m 2+
>
> I am considering reverting this feature (both PRs) until we can
> understand what is causing this and we can stabilize this code so it does
> not cause us problems. With this type of behavior, I am not confident with
> this code in production right now...
>
> *Will STEVENS*
> Lead Developer
>
> *CloudOps* *| *Cloud Solutions Experts
> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
> w cloudops.com *|* tw @CloudOps_
>
> On Wed, May 11, 2016 at 5:36 AM, Rohit Yadav wrote:
>
>> Please follow up on PR #1538 and comment if that fixes the issue on OSX.
>>
>> Regards.
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> rohit.yadav@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK
>> @shapeblue
>>
>> -----Original Message-----
>> From: Rohit Yadav [mailto:rohit.yadav@shapeblue.com]
>> Sent: Wednesday, May 11, 2016 2:49 PM
>> To: dev@cloudstack.apache.org
>> Subject: RE: Test failure on master?
>>
>> I don't have OSX, but it seems to be working on Travis and Linux env in
>> general.
>> I'll send a PR that relaxes malicious client attacks, and ask you to
>> review in your env -- Koushik and Mike.
>>
>> Regards,
>>
>> Rohit Yadav
>>
>> rohit.yadav@shapeblue.com
>> www.shapeblue.com
>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>>
>> -----Original Message-----
>> From: Koushik Das [mailto:koushik.das@accelerite.com]
>> Sent: Wednesday, May 11, 2016 12:22 PM
>> To: dev@cloudstack.apache.org
>> Subject: Re: Test failure on master?
>>
>> I am also seeing the same failure happening randomly. OS X El Capitan
>> 10.11.4.
>>
>> Results :
>>
>> Tests in error:
>> NioTest.testConnection:152 > TestTimedOut test timed out after 60000 milliseco...
>>
>> Tests run: 200, Failures: 0, Errors: 1, Skipped: 13
>>
>>
>> ________________________________________
>> From: Tutkowski, Mike
>> Sent: Tuesday, May 10, 2016 6:31:23 PM
>> To: dev@cloudstack.apache.org
>> Subject: Re: Test failure on master?
>>
>> Oh, and it's the OS of my MacBook Pro.
>>
>> > On May 10, 2016, at 6:59 AM, Tutkowski, Mike wrote:
>> >
>> > Hi,
>> >
>> > The environment is Mac OS X El Capitan 10.11.4.
>> >
>> > Thanks!
>> > Mike
>> >
>> >> On May 10, 2016, at 5:51 AM, Will Stevens wrote:
>> >>
>> >> I think I can verify that this is still happening on master for him
>> >> because you changed the timeout (and the number of tests run, etc)
>> >> when you pushed the fix in #1534. So by looking at the timeout of
>> >> 60000, we can verify that it is the latest code from master being run.
>> >>
>> >> I do think we need to revisit this to make sure we don't have
>> >> intermittent issues with this test.
>> >>
>> >> Thx guys...
>> >>
>> >> *Will STEVENS*
>> >> Lead Developer
>> >>
>> >> *CloudOps* *| *Cloud Solutions Experts
>> >> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>> >> w cloudops.com *|* tw @CloudOps_
>> >>
>> >> On Tue, May 10, 2016 at 7:41 AM, Rohit Yadav wrote:
>> >>
>> >>> Mike,
>> >>>
>> >>> Can you comment if you're using latest master. Can you also share
>> >>> the environment where you're running this (in a VM, automated by
>> >>> Jenkins, Java version etc)?
>> >>>
>> >>> Will - I think the issue should be fixed on latest master, but if
>> >>> Mike and others are getting failures I can further relax the test.
>> >>> In virtualized environments, there may be threading/scheduling issues.
>> >>>
>> >>> Regards,
>> >>> Rohit Yadav
>> >>>
>> >>>
>> >>> Regards,
>> >>>
>> >>> Rohit Yadav
>> >>>
>> >>> rohit.yadav@shapeblue.com
>> >>> www.shapeblue.com
>> >>> 53 Chandos Place, Covent Garden, London WC2N 4HSUK @shapeblue
>> >>>
>> >>> On May 10 2016, at 3:20 am, Will Stevens wrote:
>> >>>
>> >>> Rohit, can you look into this.
>> >>>
>> >>> It was first introduced in:
>> >>> https://github.com/apache/cloudstack/pull/1493
>> >>>
>> >>> I thought the problem was fixed with this:
>> >>> https://github.com/apache/cloudstack/pull/1534
>> >>>
>> >>> Apparently we still have a problem. This is intermittently emitting
>> >>> false negatives from what I can tell...
>> >>>
>> >>> *Will STEVENS*
>> >>> Lead Developer
>> >>>
>> >>> *CloudOps* *| *Cloud Solutions Experts
>> >>> 420 rue Guy *|* Montreal *|* Quebec *|* H3J 1S6
>> >>> w cloudops.com *|* tw @CloudOps_
>> >>>
>> >>> On Mon, May 9, 2016 at 5:34 PM, Tutkowski, Mike wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> I've seen this a couple times today.
>> >>>>
>> >>>> Is this a known issue?
>> >>>>
>> >>>> Results :
>> >>>>
>> >>>> Tests in error:
>> >>>>
>> >>>> NioTest.testConnection:152 > TestTimedOut test timed out after 60000 milliseco...
>> >>>>
>> >>>> Tests run: 200, Failures: 0, Errors: 1, Skipped: 13
>> >>>>
>> >>>> [INFO] ------------------------------------------------------------------------
>> >>>> [INFO] Reactor Summary:
>> >>>> [INFO]
>> >>>> [INFO] Apache CloudStack Developer Tools - Checkstyle Configuration SUCCESS [ 1.259 s]
>> >>>> [INFO] Apache CloudStack .................................. SUCCESS [ 1.858 s]
>> >>>> [INFO] Apache CloudStack Maven Conventions Parent ......... SUCCESS [ 1.528 s]
>> >>>> [INFO] Apache CloudStack Framework - Managed Context ...... SUCCESS [ 4.882 s]
>> >>>> [INFO] Apache CloudStack Utils ............................ FAILURE [01:20 min]
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Mike
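
One possible way to narrow down the unexplained 100% CPU on the agent JVM reported in this thread is to check which thread is consuming the time. The following is a minimal sketch and not CloudStack code: the class name `HotThreads` and the method `busiestThreads` are hypothetical, and it only inspects the JVM it runs in, so it would have to be run inside the agent process or adapted to query it over JMX. It relies only on the standard `java.lang.management.ThreadMXBean` API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of CloudStack): ranks the threads of the
// current JVM by accumulated CPU time.
public class HotThreads {

    /** Returns up to `limit` lines of the form "threadName: N ms", busiest first. */
    public static List<String> busiestThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<long[]> usage = new ArrayList<>(); // each entry is {threadId, cpuNanos}

        // Per-thread CPU time is an optional JVM feature, so check before use.
        if (mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled()) {
            for (long id : mx.getAllThreadIds()) {
                long nanos = mx.getThreadCpuTime(id); // -1 if the thread has died
                if (nanos >= 0) {
                    usage.add(new long[] {id, nanos});
                }
            }
        }

        // Sort by CPU time, descending.
        usage.sort((a, b) -> Long.compare(b[1], a[1]));

        List<String> out = new ArrayList<>();
        for (long[] u : usage) {
            if (out.size() >= limit) {
                break;
            }
            ThreadInfo info = mx.getThreadInfo(u[0]); // null if the thread exited meanwhile
            if (info != null) {
                out.add(info.getThreadName() + ": " + (u[1] / 1_000_000) + " ms");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        busiestThreads(5).forEach(System.out::println);
    }
}
```

If one thread dominates the list (for example a selector or SSL handshake thread), a thread dump of that thread would be the next thing to look at; that is an assumption about where to look, not a confirmed diagnosis of the issue discussed above.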