From: Rob Godfrey
Date: Sun, 16 Oct 2016 23:30:16 +0100
Subject: Re: Qpid broker 6.0.4 performance issues
To: users@qpid.apache.org
Cc: Helen Kwong

OK - so having pondered / hacked around a bit this weekend, I think that to get decent performance from the IO model in 6.0 for your use case we're going to have to change things around a bit.

Basically 6.0 is an intermediate step on our IO / threading model journey. In earlier versions we used 2 threads per connection for IO (one read, one write), and then extra threads from a pool to "push" messages from queues to connections. In 6.0 we moved to using a pool for the IO threads, and also stopped queues from "pushing" to connections while the IO threads were acting on the connection. It's this latter change that is hurting performance for your use case: on each network read we tell each consumer to stop accepting pushes from the queue until the IO interaction has completed. This causes lots of loops over your 3000 consumers on each session, which eats up a lot of CPU on every network interaction.

In the final version of our IO refactoring we want to remove the "pushing" from the queue and instead have the consumers "pull", so that the only threads that operate on the queues (outside of housekeeping tasks like expiry) will be the IO threads.

So, what we could do (and I have a patch sitting on my laptop for this) is to look at using the "multi-queue consumers" work I did for you guys before, but augment it so that the consumers work using a "pull" model rather than the push model. This will guarantee strict fairness between the queues associated with the consumer (which was the issue you had with this functionality before, I believe). Using this model you'd only need a small number (one?) of consumers per session.
The patch I have adds this "pull" mode for these consumers (essentially this is a preview of how all consumers will work in the future). Does this seem like something you would be interested in pursuing?

Cheers,
Rob

On 15 October 2016 at 17:30, Ramayan Tiwari wrote:

> Thanks Rob. Apologies for sending this over the weekend :(
>
> Are there any docs on the new threading model? I found this on Confluence:
>
> https://cwiki.apache.org/confluence/display/qpid/IO+Transport+Refactoring
>
> We are also interested in understanding the threading model a little
> better, to help us figure out its impact for our usage patterns. It would
> be very helpful if there are more docs/JIRAs/email threads with some
> details.
>
> Thanks
>
> On Sat, Oct 15, 2016 at 9:21 AM, Rob Godfrey wrote:
>
> > So I *think* this is an issue because of the extremely large number of
> > consumers. The threading model in v6 means that whenever a network read
> > occurs for a connection, it iterates over the consumers on that
> > connection - obviously where there are a large number of consumers this
> > is burdensome. I fear addressing this may not be a trivial change... I
> > shall spend the rest of my afternoon pondering this...
> >
> > - Rob
> >
> > On 15 October 2016 at 17:14, Ramayan Tiwari wrote:
> >
> > > Hi Rob,
> > >
> > > Thanks so much for your response. We use transacted sessions with
> > > non-persistent delivery. Prefetch size is 1 and every message is the
> > > same size (200 bytes).
> > >
> > > Thanks
> > > Ramayan
> > >
> > > On Sat, Oct 15, 2016 at 2:59 AM, Rob Godfrey wrote:
> > >
> > > > Hi Ramayan,
> > > >
> > > > this is interesting... in our testing (which admittedly didn't cover
> > > > the case of this many queues / listeners) we saw the 6.0.x broker
> > > > using less CPU on average than the 0.32 broker. I'll have a look
> > > > this weekend as to why creating the listeners is slower.
> > > > On the dequeuing, can you give a little more information on the
> > > > usage pattern - are you using transactions, auto-ack or client ack?
> > > > What prefetch size are you using? How large are your messages?
> > > >
> > > > Thanks,
> > > > Rob
> > > >
> > > > On 14 October 2016 at 23:46, Ramayan Tiwari <ramayan.tiwari@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > We have been validating the new Qpid broker (version 6.0.4), have
> > > > > compared it against broker version 0.32, and are seeing major
> > > > > regressions. Following is a summary of our test setup and results:
> > > > >
> > > > > *1. Test Setup*
> > > > > *a).* The Qpid broker runs on a dedicated host (12 cores, 32 GB
> > > > > RAM).
> > > > > *b).* For 0.32, we allocated 16 GB heap. For the 6.0.4 broker, we
> > > > > use 8 GB heap and 8 GB direct memory.
> > > > > *c).* For 6.0.4, flow to disk has been configured at 60%.
> > > > > *d).* Both brokers use the BDB host type.
> > > > > *e).* The brokers have around 6000 queues, and we create 16
> > > > > listener sessions/threads spread over 3 connections, where each
> > > > > session listens to 3000 queues. However, messages are only
> > > > > enqueued and processed from 10 queues.
> > > > > *f).* We enqueue 1 million messages across 10 different queues
> > > > > (evenly divided) at the start of the test. Dequeue only starts
> > > > > once all the messages have been enqueued. We run the test for 2
> > > > > hours and process as many messages as we can. Each message takes
> > > > > around 200 milliseconds to process.
> > > > > *g).* We have used both the 0.16 and 6.0.4 clients for these tests
> > > > > (the 6.0.4 client only with the 6.0.4 broker).
> > > > >
> > > > > *2. Test Results*
> > > > > *a).* The System Load Average (read the notes below on how we
> > > > > compute it) for the 6.0.4 broker is 5x that of the 0.32 broker.
> > > > > At the start of the test (when we are not doing any dequeue), the
> > > > > load average is normal (0.05 for the 0.32 broker and 0.1 for the
> > > > > new broker); however, while we are dequeuing messages, the load
> > > > > average is very high (around 0.5 consistently).
> > > > >
> > > > > *b).* Time to create listeners in the new broker has gone up by
> > > > > 220% compared to the 0.32 broker (when using the 0.16 client). For
> > > > > the old broker, creating 16 sessions each listening to 3000 queues
> > > > > takes 142 seconds; in the new broker it took 456 seconds. With the
> > > > > 6.0.4 client it took even longer: a 524% increase (887 seconds).
> > > > > *I).* The time to create consumers increases as we create more
> > > > > listeners on the same connections. We have 20 sessions (but end up
> > > > > using around 5 of them) on each connection, and we create about
> > > > > 3000 consumers and attach a MessageListener to each. Each
> > > > > successive session takes longer (approximately linearly) to set up
> > > > > the same number of consumers and listeners.
> > > > >
> > > > > *3). How we compute System Load Average*
> > > > > We query the SystemLoadAverage attribute and divide it by the
> > > > > AvailableProcessors attribute; both are available on the
> > > > > java.lang:type=OperatingSystem MBean.
> > > > >
> > > > > I am not sure what is causing these regressions and would like
> > > > > your help in understanding them. We are aware of the changes to
> > > > > the threading model in the new broker; are there any design docs
> > > > > we can refer to in order to understand these changes at a high
> > > > > level? Can we tune some parameters to address these issues?
> > > > >
> > > > > Thanks
> > > > > Ramayan
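The normalised load-average metric described in the quoted message can be computed in-process with the standard platform MXBean. A minimal sketch (assuming local access rather than a remote JMX query of java.lang:type=OperatingSystem, and not the poster's actual monitoring code):

```java
// Sketch of the metric described above: the OperatingSystem MXBean's
// SystemLoadAverage attribute divided by its AvailableProcessors attribute.
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

public class NormalisedLoad {
    public static double normalisedLoad() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // 1-minute average; -1.0 if the platform cannot report it
        int cpus = os.getAvailableProcessors();
        // e.g. 0.5 means roughly half a core's worth of runnable work per available core
        return load < 0 ? -1.0 : load / cpus;
    }

    public static void main(String[] args) {
        System.out.printf("normalised load average: %.2f%n", normalisedLoad());
    }
}
```

The same two attributes can also be read remotely over a JMX connection to the broker's java.lang:type=OperatingSystem MBean, which matches the measurement approach described in the message.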