Message-ID: <4DF0C7E3.1010104@redhat.com>
Date: Thu, 09 Jun 2011 14:17:23 +0100
From: Gordon Sim
To: users@qpid.apache.org
Subject: Re: C++ Broker Availability Problem
References: <4DEF6637.4010603@raytheon.com>
In-Reply-To: <4DEF6637.4010603@raytheon.com>

On 06/08/2011 01:08 PM, Richard Peter wrote:
> Hi,
>
> The issue I'm having is when a client producer sends a message based on
> user interaction. The message causes a screen to pop up on another
> workstation. Usually the pop up is instantaneous, sometimes though it
> takes up to 2 minutes for the message to get to the other workstation.

Have you noticed latencies that large for any other messages in the
system? What's the max queue depth on the queue that message travels
through? Is it usually empty?

> The message is a JMS text message containing 9 characters, so a fairly
> small message. We have tried tuning the worker-threads thinking it was
> an availability issue. This single message is more important than all
> the other traffic our qpid is handling. Is there a way to give priority
> to one queue over another? There is a large amount of traffic being
> handled by the broker,

What is your estimated peak total throughput?

> but not sure how the design is set up to handle
> when there are many more sessions/queues than worker-threads. Does a
> thread send all messages to a consumer before moving on to the next
> queue? Or is the only way to ensure availability to further increase
> worker-threads? I've had the threads as high as 100, but the load on the
> system made the problem worse. Our setup is below.
>
> We are using version 0.8 of the C++ broker and java client. The broker has
> roughly 100 queues.
> Each queue has at least two consumers, 1 each from
> separate servers in a cluster. We then also have 20 clients listening to 4
> topics and 5 clients listening to 1 queue (the important one mentioned
> above). So in general our broker has roughly 300 sessions open at any
> given time.

Is each session on its own connection? Or are connections shared? If
shared, how many connections are there?

> Almost all of the queues are durable. The topics are not
> durable, nor are subscribers durable. All but one of the clients in the
> scenario are java clients, with 1 C++ client. The servers also use the
> java client. The following is the connection url used by most of the clients
> (it's embedded in spring xml, thus the escaped &).
>
> amqp://guest:guest@/program?brokerlist='tcp://${broker.addr}?retries='0'&tcp_nodelay='true'&connecttimeout='5000''&maxprefetch='0'&sync_publish='all'&failover='nofailover'
>
>
> I only recently turned on tcp_nodelay and sync_publish, thinking that
> perhaps the message was occasionally getting stuck. These are the
> settings from our conf file for the broker:
>
> auth=no
> worker-threads=50
> data-dir=/somepath/qpid/data
> store-dir=/somepath/qpid/messageStore
> pid-dir=/somepath/qpid/var/lock
> num-jfiles=16
> jfile-size-pgs=24
> tcp-nodelay=true
>
> Many of the queues are sized larger than the default through a queue
> creator script. The sizes range up to a max file count of 32 and file
> size of 48. The server running qpid is an 8 cpu system with 2g of memory,
> some of the offices have a 16 cpu system with 8g of memory. The server
> size does not make a difference in the errors.
>
> Part of the theory for availability being the issue was that the clients
> kept timing out on heartbeat. So we disabled the heartbeat.
> We also occasionally see
>
> INFO 2011-06-06 17:47:42,501 [IoReceiver - somemachine/someip:5672] JmsPooledSession: EDEX: DEFAULT - Failed to close session
> org.apache.qpid.transport.SessionException: timed out waiting for sync: complete = 30115, point = 30116
>     at org.apache.qpid.transport.Session.sync(Session.java:744)
>     at org.apache.qpid.transport.Session.sync(Session.java:713)
>     at org.apache.qpid.client.AMQSession_0_10.sendClose(AMQSession_0_10.java:427)
>     at org.apache.qpid.client.AMQSession.close(AMQSession.java:700)
>     at org.apache.qpid.client.AMQSession.close(AMQSession.java:666)
>     at org.apache.qpid.client.AMQSession.close(AMQSession.java:525)
>     at somepackage.jms.JmsPooledSession.closeInternal(JmsPooledSession.java:164)
>     at somepackage.jms.JmsPooledConnection.disconnect(JmsPooledConnection.java:152)
>     at somepackage.jms.JmsPooledConnection.onException(JmsPooledConnection.java:127)
>     at org.apache.qpid.client.AMQConnectionDelegate_0_10.closed(AMQConnectionDelegate_0_10.java:270)
>     at org.apache.qpid.transport.Connection.closed(Connection.java:529)
>     at org.apache.qpid.transport.network.Assembler.closed(Assembler.java:113)
>     at org.apache.qpid.transport.network.InputHandler.closed(InputHandler.java:202)
>     at org.apache.qpid.transport.network.io.IoReceiver.run(IoReceiver.java:150)
>     at java.lang.Thread.run(Thread.java:619)
>
> The gap between complete and point used to be much larger before adding
> the sync_publish setting. There are no errors in the qpid broker log.

That looks like it might be
https://issues.apache.org/jira/browse/QPID-3259, though I would expect
some error in the broker log as well.

> The only thing in the log is along the lines of the following 2 messages:
>
> qpidd[19149]: 2011-06-08 11:50:03 warning ManagementAgent::periodicProcessing task overran 1 times by 6ms (taking 5098421ns) on average.
> qpidd[19149]: 2011-06-08 11:50:16 warning task overran 3 times by 2ms (taking 27955ns) on average.
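
If it helps to line those warnings up against the times the slow pop-ups were seen, the numbers can be pulled out of the broker log with something like the rough sketch below. It assumes each warning sits on a single line in the log file and follows the wording of the two messages quoted above; the names `OVERRUN` and `parse_overruns` are made up for illustration and are not part of qpid.

```python
import re

# Matches qpidd "task overran" warnings worded like the two quoted above.
# The task name (e.g. ManagementAgent::periodicProcessing) is optional.
OVERRUN = re.compile(
    r"warning (?:(?P<task>\S+) )?task overran (?P<count>\d+) times "
    r"by (?P<late_ms>\d+)ms \(taking (?P<took_ns>\d+)ns\) on average"
)

def parse_overruns(lines):
    """Yield one dict per 'task overran' warning found in lines."""
    for line in lines:
        m = OVERRUN.search(line)
        if m:
            yield {
                "task": m.group("task"),  # None when no task name is logged
                "count": int(m.group("count")),
                "late_ms": int(m.group("late_ms")),
                "took_ms": int(m.group("took_ns")) / 1e6,
            }

# The two warnings from the message, one per line (assumed log layout).
sample = [
    "qpidd[19149]: 2011-06-08 11:50:03 warning "
    "ManagementAgent::periodicProcessing task overran 1 times by 6ms "
    "(taking 5098421ns) on average.",
    "qpidd[19149]: 2011-06-08 11:50:16 warning task overran 3 times by 2ms "
    "(taking 27955ns) on average.",
]
overruns = list(parse_overruns(sample))
for entry in overruns:
    print(entry)
```

Overruns of a few milliseconds like these would not by themselves account for a two minute delay, so the main use is to check whether much larger overruns show up around the slow deliveries.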
>
> Thanks,
> Richard Peter
>
>

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org
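
On the queue depth question, the journal settings quoted earlier (num-jfiles and jfile-size-pgs) put an upper bound on how much unconsumed durable data each queue can hold. A back-of-the-envelope sketch follows. It assumes one journal page is 64 KiB, which is not stated anywhere in this thread, and `journal_mib` is a hypothetical helper rather than anything provided by qpid.

```python
# Assumption (not from the thread): one journal page is 64 KiB.
PAGE_KIB = 64

def journal_mib(num_jfiles, jfile_size_pgs, page_kib=PAGE_KIB):
    """Journal capacity for one durable queue, in MiB."""
    return num_jfiles * jfile_size_pgs * page_kib / 1024

default_mib = journal_mib(16, 24)  # num-jfiles=16, jfile-size-pgs=24
largest_mib = journal_mib(32, 48)  # largest queues made by the creator script

print("conf-file default:", default_mib, "MiB per queue")
print("largest queues:   ", largest_mib, "MiB per queue")
# Rough total if all ~100 queues used the conf-file defaults.
print("100 default queues:", 100 * default_mib / 1024, "GiB")
```

Under that page-size assumption the conf-file defaults come to 24 MiB per queue and the largest queues to 96 MiB, which gives a sense of scale when comparing against the observed max queue depth.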