qpid-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Charles Woerner <cwoer...@demandbase.com>
Subject Re: c++ broker startup problem with large journal containing lots of data
Date Thu, 11 Mar 2010 21:13:27 GMT
Thanks Kim,

1) I'm using qpid from RedHat MRG 0.5.752581-34 for i386 and the store  
module is RHM 0.5.3206-27, also for i386.

2) Yes, my messages are persistent.  I am aware that transient  
messages are discarded on recovery.  Your comment that durable  
messages "have their message content discarded when the policy limit  
is reached" is surprising.  I have an 80 GB store and a 1 GB queue  
with flow-to-disk.  My expectation was that I would be able to  
continue to enqueue durable messages past the 1GB mark (up to 80% of  
80 GB) and only the first (or latest, not sure) 1GB of queue data  
would be held in memory.  I expected the dequeues would sort of  
"backfill" the 1GB memory buffer until the "flowed-to-disk" messages  
were consumed from the store.  I recognize that that this is sort of  
dual-purposing the backing store to do both journaling and spillover,  
but that doesn't seem too far fetched.  I'm beginning to think this is  
incorrect.  Is there some kind of identity relating available memory  
to store geometry to queue size (ie. is the following a good rule of  
thumb "available memory" = "store size" = "sum(max-queue-size of all  
queues)")?

Also, in an attempt to (slowly) recover the queue I allocated 2 GB  
swap.  This staved off the "Cannot read from child process" error with  
a c++ bad_alloc exception which I could see as a debug statement in  
the log.  However, 2 hours later the system was still in the same  
unresponsive state (seemingly trying to complete recovery).  It had  
not yet begun listening on 5672.  I captured the lsof for qpidd user  
in case that might also be helpful.  Yes, the sfbroker.log file is big  
and no, truncating it doesn't alleviate the problem :)

~]# lsof -u qpidd
COMMAND   PID  USER   FD   TYPE     DEVICE       SIZE      NODE NAME
qpidd   12365 qpidd  cwd    DIR        8,1       4096    573441 /root
qpidd   12365 qpidd  rtd    DIR        8,1       4096         2 /
qpidd   12365 qpidd  txt    REG        8,1      76165   1299146 /usr/ 
sbin/qpidd
qpidd   12365 qpidd  mem    REG        8,1   18031759   1215641 /usr/ 
lib/libqpidcommon.so.0.1.0
qpidd   12365 qpidd  mem    REG        8,1      14644    786449 /lib/ 
libdl-2.5.so
qpidd   12365 qpidd  mem    REG        8,1      97512   1214114 /usr/ 
lib/libsasl2.so.2.0.22
qpidd   12365 qpidd  mem    REG        8,1     206396    786499 /lib/ 
i686/nosegneg/libm-2.5.so
qpidd   12365 qpidd  mem    REG        8,1      44992    786703 /lib/ 
libgcc_s-4.1.2-20080825.so.1
qpidd   12365 qpidd  mem    REG        8,1    1620400    789604 /lib/ 
i686/nosegneg/libc-2.5.so
qpidd   12365 qpidd  mem    REG        8,1     127720    786500 /lib/ 
i686/nosegneg/libpthread-2.5.so
qpidd   12365 qpidd  mem    REG        8,1      74616    786483 /lib/ 
libresolv-2.5.so
qpidd   12365 qpidd  mem    REG        8,1      43544    786447 /lib/ 
libcrypt-2.5.so
qpidd   12365 qpidd  mem    REG        8,1       2632   1213919 /usr/ 
lib/libaio.so.1.0.1
qpidd   12365 qpidd  mem    REG        8,1      15584   1247077 /usr/ 
lib/openais/libcpg.so.2.0.0
qpidd   12365 qpidd  mem    REG        8,1      14144    786738 /lib/ 
libuuid.so.1.2
qpidd   12365 qpidd  mem    REG        8,1      16068   1215646 /usr/ 
lib/libcman.so.2.0.115
qpidd   12365 qpidd  mem    REG        8,1      42068    786501 /lib/ 
i686/nosegneg/librt-2.5.so
qpidd   12365 qpidd  mem    REG        8,1      62328   1215614 /usr/ 
lib/libboost_filesystem.so.1.33.1
qpidd   12365 qpidd  mem    REG        8,1     124432    786503 /lib/ 
ld-2.5.so
qpidd   12365 qpidd  mem    REG        8,1     625546   1266311 /usr/ 
lib/qpid/daemon/replicating_listener.so
qpidd   12365 qpidd  mem    REG        8,1     213760   1215620 /usr/ 
lib/libboost_program_options.so.1.33.1
qpidd   12365 qpidd  mem    REG        8,1    1127841   1266315 /usr/ 
lib/qpid/daemon/msgstore.so
qpidd   12365 qpidd  mem    REG        8,1     946263   1215643 /usr/ 
lib/libsslcommon.so.0.1.0
qpidd   12365 qpidd  mem    REG        8,1      96924   1214052 /usr/ 
lib/libnssutil3.so
qpidd   12365 qpidd  mem    REG        8,1     925520   1214126 /usr/ 
lib/libstdc++.so.6.0.8
qpidd   12365 qpidd  mem    REG        8,1    1109436   1213951 /usr/ 
lib/libdb_cxx-4.3.so
qpidd   12365 qpidd  mem    REG        8,1     203340   1214124 /usr/ 
lib/libssl3.so
qpidd   12365 qpidd  mem    REG        8,1     839781   1266317 /usr/ 
lib/qpid/daemon/xml.so
qpidd   12365 qpidd  mem    REG        8,1   16636786   1215692 /usr/ 
lib/libqpidbroker.so.0.1.0
qpidd   12365 qpidd  mem    REG        8,1     226512   1214049 /usr/ 
lib/libnspr4.so
qpidd   12365 qpidd  mem    REG        8,1    6270148   1266314 /usr/ 
lib/qpid/daemon/cluster.so
qpidd   12365 qpidd  mem    REG        8,1       9912   1214097 /usr/ 
lib/libplds4.so
qpidd   12365 qpidd  mem    REG        8,1    6782769   1215639 /usr/ 
lib/libqpidclient.so.0.1.0
qpidd   12365 qpidd  mem    REG        8,1     587779   1261584 /usr/ 
lib/qpid/client/sslconnector.so
qpidd   12365 qpidd  mem    REG        8,1      14008   1214096 /usr/ 
lib/libplc4.so
qpidd   12365 qpidd  mem    REG        8,1     515215   1266312 /usr/ 
lib/qpid/daemon/replication_exchange.so
qpidd   12365 qpidd  mem    REG        8,1     558508   1266316 /usr/ 
lib/qpid/daemon/ssl.so
qpidd   12365 qpidd  mem    REG        8,1    1187124   1214050 /usr/ 
lib/libnss3.so
qpidd   12365 qpidd  mem    REG        8,1    4155068   1215591 /usr/ 
lib/libxerces-c.so.27.0
qpidd   12365 qpidd  mem    REG        8,1    3957632   1215596 /usr/ 
lib/libxqilla.so.3.0.0
qpidd   12365 qpidd    0u   CHR      136,0                    2 /dev/ 
pts/0
qpidd   12365 qpidd    1u   CHR      136,0                    2 /dev/ 
pts/0
qpidd   12365 qpidd    2u   CHR      136,0                    2 /dev/ 
pts/0
qpidd   12365 qpidd    3r  FIFO        0,6               240193 pipe
qpidd   12365 qpidd    4w  FIFO        0,6               240193 pipe
qpidd   12365 qpidd    5w   REG        8,1 3556296254     98581 /var/ 
log/qpid/sfbroker.log
qpidd   12365 qpidd    6r   CHR        1,9                 1065 /dev/ 
urandom
qpidd   12365 qpidd    7r  FIFO        0,6               240195 pipe
qpidd   12366 qpidd  cwd    DIR        8,1       4096    557058 /var/ 
lib/qpidd/.qpidd
qpidd   12366 qpidd  rtd    DIR        8,1       4096         2 /
qpidd   12366 qpidd  txt    REG        8,1      76165   1299146 /usr/ 
sbin/qpidd
qpidd   12366 qpidd  mem    REG        8,1   18031759   1215641 /usr/ 
lib/libqpidcommon.so.0.1.0
qpidd   12366 qpidd  mem    REG        8,1      14644    786449 /lib/ 
libdl-2.5.so
qpidd   12366 qpidd  mem    REG        8,1      97512   1214114 /usr/ 
lib/libsasl2.so.2.0.22
qpidd   12366 qpidd  mem    REG        8,1     206396    786499 /lib/ 
i686/nosegneg/libm-2.5.so
qpidd   12366 qpidd  mem    REG        8,1      44992    786703 /lib/ 
libgcc_s-4.1.2-20080825.so.1
qpidd   12366 qpidd  mem    REG        8,1    1620400    789604 /lib/ 
i686/nosegneg/libc-2.5.so
qpidd   12366 qpidd  mem    REG        8,1     127720    786500 /lib/ 
i686/nosegneg/libpthread-2.5.so
qpidd   12366 qpidd  mem    REG        8,1      74616    786483 /lib/ 
libresolv-2.5.so
qpidd   12366 qpidd  mem    REG        8,1      43544    786447 /lib/ 
libcrypt-2.5.so
qpidd   12366 qpidd  mem    REG        8,1       2632   1213919 /usr/ 
lib/libaio.so.1.0.1
qpidd   12366 qpidd  mem    REG        8,1      15584   1247077 /usr/ 
lib/openais/libcpg.so.2.0.0
qpidd   12366 qpidd  mem    REG        8,1      14144    786738 /lib/ 
libuuid.so.1.2
qpidd   12366 qpidd  mem    REG        8,1      16068   1215646 /usr/ 
lib/libcman.so.2.0.115
qpidd   12366 qpidd  mem    REG        8,1      42068    786501 /lib/ 
i686/nosegneg/librt-2.5.so
qpidd   12366 qpidd  mem    REG        8,1      62328   1215614 /usr/ 
lib/libboost_filesystem.so.1.33.1
qpidd   12366 qpidd  mem    REG        8,1     124432    786503 /lib/ 
ld-2.5.so
qpidd   12366 qpidd  mem    REG        8,1     625546   1266311 /usr/ 
lib/qpid/daemon/replicating_listener.so
qpidd   12366 qpidd  mem    REG        8,1     213760   1215620 /usr/ 
lib/libboost_program_options.so.1.33.1
qpidd   12366 qpidd  mem    REG        8,1    1127841   1266315 /usr/ 
lib/qpid/daemon/msgstore.so
qpidd   12366 qpidd  mem    REG        8,1     946263   1215643 /usr/ 
lib/libsslcommon.so.0.1.0
qpidd   12366 qpidd  mem    REG        8,1      96924   1214052 /usr/ 
lib/libnssutil3.so
qpidd   12366 qpidd  mem    REG        8,1     925520   1214126 /usr/ 
lib/libstdc++.so.6.0.8
qpidd   12366 qpidd  mem    REG        8,1    1109436   1213951 /usr/ 
lib/libdb_cxx-4.3.so
qpidd   12366 qpidd  mem    REG        8,1     203340   1214124 /usr/ 
lib/libssl3.so
qpidd   12366 qpidd  mem    REG        8,1     839781   1266317 /usr/ 
lib/qpid/daemon/xml.so
qpidd   12366 qpidd  mem    REG        8,1   16636786   1215692 /usr/ 
lib/libqpidbroker.so.0.1.0
qpidd   12366 qpidd  mem    REG        8,1     226512   1214049 /usr/ 
lib/libnspr4.so
qpidd   12366 qpidd  mem    REG        8,1    6270148   1266314 /usr/ 
lib/qpid/daemon/cluster.so
qpidd   12366 qpidd  mem    REG        8,1       9912   1214097 /usr/ 
lib/libplds4.so
qpidd   12366 qpidd  mem    REG        8,1    6782769   1215639 /usr/ 
lib/libqpidclient.so.0.1.0
qpidd   12366 qpidd  mem    REG        8,1     587779   1261584 /usr/ 
lib/qpid/client/sslconnector.so
qpidd   12366 qpidd  mem    REG        8,1      14008   1214096 /usr/ 
lib/libplc4.so
qpidd   12366 qpidd  mem    REG        8,1     515215   1266312 /usr/ 
lib/qpid/daemon/replication_exchange.so
qpidd   12366 qpidd  mem    REG        8,1     558508   1266316 /usr/ 
lib/qpid/daemon/ssl.so
qpidd   12366 qpidd  mem    REG        8,1    1187124   1214050 /usr/ 
lib/libnss3.so
qpidd   12366 qpidd  mem    REG        8,1    4155068   1215591 /usr/ 
lib/libxerces-c.so.27.0
qpidd   12366 qpidd  mem    REG        8,1    3957632   1215596 /usr/ 
lib/libxqilla.so.3.0.0
qpidd   12366 qpidd  mem    REG       8,96      16384 402653317 /vol/ 
qpidd/rhm/dat/__db.005
qpidd   12366 qpidd  mem    REG       8,96     450560 402653316 /vol/ 
qpidd/rhm/dat/__db.004
qpidd   12366 qpidd  mem    REG       8,96     294912 402653315 /vol/ 
qpidd/rhm/dat/__db.003
qpidd   12366 qpidd  mem    REG       8,96     278528 402653314 /vol/ 
qpidd/rhm/dat/__db.002
qpidd   12366 qpidd  mem    REG       8,96      24576 402653313 /vol/ 
qpidd/rhm/dat/__db.001
qpidd   12366 qpidd    0u   CHR        1,3                  778 /dev/ 
null
qpidd   12366 qpidd    1u   CHR        1,3                  778 /dev/ 
null
qpidd   12366 qpidd    2u   CHR        1,3                  778 /dev/ 
null
qpidd   12366 qpidd    3r  FIFO        0,6               240193 pipe
qpidd   12366 qpidd    4w  FIFO        0,6               240193 pipe
qpidd   12366 qpidd    5w   REG        8,1 3556296254     98581 /var/ 
log/qpid/sfbroker.log
qpidd   12366 qpidd    6r   CHR        1,9                 1065 /dev/ 
urandom
qpidd   12366 qpidd    7r  0000       0,10          0    240197  
eventpoll
qpidd   12366 qpidd    8w  FIFO        0,6               240195 pipe
qpidd   12366 qpidd    9u  unix 0xed2f5040               240196 socket
qpidd   12366 qpidd   10wW  REG       8,96          0       131 /vol/ 
qpidd/lock
qpidd   12366 qpidd   11r  FIFO        0,6               240198 pipe
qpidd   12366 qpidd   12w  FIFO        0,6               240198 pipe
qpidd   12366 qpidd   13u   REG       8,96       8192 402653319 /vol/ 
qpidd/rhm/dat/queues.db
qpidd   12366 qpidd   14u   REG       8,96       8192 402653320 /vol/ 
qpidd/rhm/dat/config.db
qpidd   12366 qpidd   15u   REG       8,96       8192 402653321 /vol/ 
qpidd/rhm/dat/exchanges.db
qpidd   12366 qpidd   16u   REG       8,96       8192 402653322 /vol/ 
qpidd/rhm/dat/messages.db
qpidd   12366 qpidd   17u   REG       8,96       8192 402653323 /vol/ 
qpidd/rhm/dat/mappings.db
qpidd   12366 qpidd   18u   REG       8,96       8192 402653324 /vol/ 
qpidd/rhm/dat/bindings.db
qpidd   12366 qpidd   19u   REG       8,96       8192 402653325 /vol/ 
qpidd/rhm/dat/general.db
qpidd   12366 qpidd   20u   REG       8,96   10485760 402653318 /vol/ 
qpidd/rhm/dat/log.0000000001
qpidd   12366 qpidd   21r   REG       8,96 1342177792 503316610 /vol/ 
qpidd/rhm/jrnl/000c/hits/JournalData.0001.jdat

On Mar 11, 2010, at 12:15 PM, Kim van der Riet wrote:

> A few questions:
>
> 1. What versions of the store and qpid are you using?
>
> 2. Are your messages persistent? Flow-to-disk will force transient
> messages to disk, but will discard them on recovery; persistent  
> messages
> are written to disk anyway, but have their message content discarded
> when the policy limit is reached. I guess the latter is the case here.
>
> This behavior should be checked out... it looks suspicious.
>
> On Thu, 2010-03-11 at 11:01 -0800, Charles Woerner wrote:
>> A bit more information, at about the time Journal recovery phase 1 is
>> complete I see one of the two qpidd processes begin to acquire vast
>> amounts of memory (I assume this is the phase where it begins reading
>> up to max-queue-size of persistent queue data into memory).  Once the
>> process acquires about 97% of available memory the system usually  
>> dies
>> with the "critical Broker start-up failed: Cannot read from child
>> process.".
>>
>> I'm not surprised that the system dies when it runs out of memory.
>> But I hope someone can help me understand why qpidd is using so much
>> memory and whether I have my max-queue-size and store geometry
>> configured in a reasonable manner.  I would hope that there isn't  
>> some
>> inherent limitation where one can't recover more data from a
>> persistent journal than one has available memory.
>>
>> On Mar 10, 2010, at 10:55 PM, Charles Woerner wrote:
>>
>>> To follow up, it always fails at startup with this large queue,  
>>> but I
>>> don't always see the "Broker startup failed: cannot read from child
>>> process" error.  Sometimes it just dies.
>>>
>>
>> __
>>
>> Charles Woerner  | cwoerner@demandbase.com |   demandbase
>
>
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>

__

Charles Woerner  | cwoerner@demandbase.com |   demandbase   |   
415.683.2669







Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message