qpid-users mailing list archives

From Matt Broadstone <mbroa...@gmail.com>
Subject Re: spurious ABRT in qpidd
Date Mon, 11 Jan 2016 20:47:47 GMT
On Mon, Jan 11, 2016 at 1:18 PM, Matt Broadstone <mbroadst@gmail.com> wrote:

> On Mon, Jan 11, 2016 at 1:15 PM, Gordon Sim <gsim@redhat.com> wrote:
>
>> On 01/11/2016 05:37 PM, Matt Broadstone wrote:
>>
>>> I'm having trouble tracking down the root cause of a SIGABRT in qpidd,
>>> and was hoping for some advice from the list.  Specifically, it seems to
>>> happen when a large burst of traffic hits the broker after a period of
>>> little to no activity, and the only information we're seeing in the logs
>>> is:
>>>
>>> Jan 11 15:58:33 test-box kernel: [  652.903997] init: qpidd main process (2239) killed by ABRT signal
>>> Jan 11 15:58:33 test-box kernel: [  652.911661] init: qpidd main process ended, respawning
>>>
>>> We're running Ubuntu 14.04 (trusty) on this machine, with the packages
>>> from the official qpid PPA. I tried running the services with trace
>>> logging enabled to no avail (there were no strange packets, and no error
>>> messages about failed assertions). Attaching gdb to the process also
>>> yielded no relevant information, so I'm running out of ideas for what to
>>> try next. AFAICT the only `abort()` present in the codebase is in the
>>> assertion code, which would print something about the assertion failure.
>>>
>>> Any thoughts on what I might try to help resolve this issue?
>>>
>>
>> Could it be a memory issue? I.e. the qpidd processes exceeding some
>> memory limit and being killed by the oom killer?
>>
>>
> I thought so at first too, but I believe we would see a kernel message
> related to that if that were the case, not to mention that the server had
> something like 120GB of free RAM at the time as well. I'm quite willing to
> test that theory more thoroughly if you have a recommended means of doing
> so?
>
>
I wish I could tell you that I have a quick, reproducible test case that
uses only qpid code. Unfortunately we don't: it always happens in concert
with a number of other services. Is it possible that this could be triggered
from some proton code and that no message would be output? We're actually
experiencing this problem in a loop on one of our servers currently, and
there's something like 452GB of free memory, so I'm inclined to rule out
memory as the root cause.
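
One way to test the OOM theory more thoroughly is to scan the kernel log for
the messages the OOM killer normally leaves behind. The following is only a
minimal sketch: the function name `find_oom_events`, the log path
`/var/log/kern.log`, and the exact message patterns are assumptions for
illustration (the wording of OOM messages varies between kernel versions), and
none of this comes from qpid itself.

```python
import re

# Message fragments the Linux OOM killer typically logs. These patterns are
# assumptions based on common kernel output and may need adjusting.
OOM_PATTERNS = [
    re.compile(r"invoked oom-killer"),
    re.compile(r"Out of memory: Kill(ed)? process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
    re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)"),
]


def find_oom_events(lines, process_name=None):
    """Return the log lines that look like OOM-killer activity.

    If process_name is given, lines that name a different victim process
    are skipped. Lines that name no victim (for example the
    "invoked oom-killer" line) are always kept.
    """
    events = []
    for line in lines:
        for pattern in OOM_PATTERNS:
            match = pattern.search(line)
            if not match:
                continue
            name = match.groupdict().get("name")
            if process_name is None or name is None or name == process_name:
                events.append(line.rstrip("\n"))
            break  # one match per line is enough
    return events


if __name__ == "__main__":
    # Hypothetical log location; adjust for the distribution in use.
    with open("/var/log/kern.log", errors="replace") as log:
        for event in find_oom_events(log, process_name="qpidd"):
            print(event)
```

If this prints nothing for the time window around the crash, that is consistent
with the ABRT coming from somewhere other than the OOM killer (which sends
SIGKILL rather than SIGABRT in any case).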



> Matt
>
>
>
