activemq-users mailing list archives

From Robert Davies <rajdav...@gmail.com>
Subject Re: ActiveMQ crashes frequently
Date Wed, 29 May 2013 07:16:51 GMT
Ultimately I'm pretty confident this problem is an NFS problem  - and as Johan has already
let the cat out of the bag ;) - let me ask the following:

 Which version of NFS 4 are you using and which environment?
 Have you checked the system logs for NFS errors on all the machines running ActiveMQ brokers?

thanks,

Rob

On 29 May 2013, at 00:46, Christian Posta <christian.posta@gmail.com> wrote:

> I can make two recommendations.
> 
> #1, being the preferred, create a test case that shows this... that will
> give us the best chance of finding out what's going on... take a look at
> the following test cases in the activemq source code to give you an idea
> about how to go about doing it...
> 
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/usecases/
> 
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/bugs/
> 
> http://svn.apache.org/viewvc/activemq/trunk/activemq-unit-tests/src/test/java/org/apache/activemq/test/JmsTopicSendReceiveTest.java?view=markup
> 
> 
> #2, if creating a test case doesn't sound like something you want to get
> into, I guess, give us the exact configs of broker, clients, number of
> consumers, number of topics, message sizes, etc., all details, and if one
> of us gets the urge we can try it out on our boxes. This will not be nearly
> as good as #1, and will present a higher barrier to entry, because we spend
> our spare time doing this and like to spend that time debugging and fixing,
> and not setting up environments and use cases which may not even show a bug
> :)
> 
> 
> 
> 
> On Tue, May 28, 2013 at 4:34 PM, fenbers <Mark.Fenbers@noaa.gov> wrote:
> 
>>
>>    I'm getting the Sync exception on both, local and NFS. Originally,
>>    I was only using a local disk, but there wasn't much disk space for
>>    the ever growing list of 33MB enumerated .log files that weren't
>>    cleaned up. So I reconfigured ActiveMQ to put these db files on an
>>    NFS mount. But the sync exceptions occurred either way.
>> 
>>    I've changed *all* my consumers to AUTO_ACKNOWLEDGE, thinking that
>>    maybe an ACKNOWLEDGEment leak was causing the undeleted files. That
>>    didn't help... The TRACE level logging points to only two of my 5
>>    topics that accumulate these undeleted db files. So I've
>>    concentrated my scrutiny on the consumers of these two topics, but
>>    have not found anything out of the ordinary.
>> 
>>    What is puzzling me still, is that the frequency of the log file
>>    build-up and the frequency of exceptions continues to increase even
>>    though the amount of messages sent per day by the producers remains
>>    nearly constant...
>>    Mark
>> 
>>    On 5/28/2013 6:06 PM, ceposta [via ActiveMQ] wrote:
>> 
>>     Sounds like there's multiple issues...
>>
>>      Your journal files aren't being cleaned up, AND you're getting
>>      the Sync exception?
>>
>>      You get the sync exception on local disk mount? Or just NFS?
>>
>>      If the journals aren't being cleaned up, are your consumers
>>      properly ack'ing messages?
>> 
>> 
>> 
>>      On Tue, May 28, 2013 at 2:42 PM, fenbers <[hidden email]> wrote:
>> 
>> 
>>        > I would LOVE to help you help me! But I have no idea how to go
>>        > about making a test case. If you could drop some hints in this
>>        > regard, I might be able to produce one.
>>        >
>>        > My ActiveMQ issues seem to be related to network slowness, which we
>>        > are diagnosing separately. Or maybe it is the other way around,
>>        > where ActiveMQ problems are causing network sluggishness. Either
>>        > way, there seems to be a correlation, except that when network
>>        > responsiveness improves, ActiveMQ does not.
>>        >
>>        > The problem I'm having with AMQ is progressive, which is even more
>>        > puzzling, because we are not adding to the number of messages that
>>        > AMQ has to handle. Today, we were up to 191 undeleted db-NNN.log
>>        > files in the database directory before I stopped AMQ and deleted
>>        > them. NNN was up to 451, so 260 files had been cleaned up by AMQ's
>>        > automatic processes...
>>        >
>>        > Will log files assist you in helping me? I have TRACE level
>>        > messages turned on, so they are quite large.
>>        >
>>        > Mark
>>        >
>> 
>>        > On 5/28/2013 5:22 PM, rajdavies [via ActiveMQ] wrote:
>>        >
>>        >   Hi Mark,
>>        >
>>        >   could you produce a test case for your problem - it would help us
>>        >   identify the problem a lot quicker
>>        >
>>        >   thanks,
>>        >
>>        >   Rob
>>        >
>>        >   On 30 Apr 2013, at 16:40, fenbers <[hidden email]> wrote:
>> 
>>        >
>>        >     > Zagan wrote
>>        >     >> Can you please check if your .log files in the /data directory are cleaned
>>        >     >> up? On basis of the information I suppose this behaviour is due to a
>>        >     >> misconfiguration of your clients.
>>        >     >> If this is the case often broken log file cleanup is a symptom.
>>        >     >
>>        >     > I get the same error as brought up in this thread (KahaDB failed to store to
>>        >     > Journal). And yes, I also have a problem with the numbered .log files not
>>        >     > all getting cleaned up (most files are removed appropriately). I have
>>        >     > suspected a client configuration problem for a long time, but can't figure
>>        >     > out what's wrong -- even with TRACE logging turned on. In the meantime, I
>>        >     > have to cope with ActiveMQ crashing (i.e., shutting itself down) about every
>>        >     > two days. The logs point to a disk storage problem, but I have plenty of
>>        >     > space, so that's not the issue! I've tried a couple of different Linux
>>        >     > boxes and both local and NFS mounts, and this issue occurs on both of them.
>>        >     >
>>        >     > I'm at a loss!! I'm running 5.8.0...
>>        >     >
>>        >     > Mark
>>        >     >
>>        >     > --
>>        >     > View this message in context:
>>        >     > http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4666469.html
>>        >     > Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>> 
>> 
>>        > mark_fenbers.vcf (360 bytes)
>>        > <http://activemq.2283324.n4.nabble.com/attachment/4667574/0/mark_fenbers.vcf>
>>        >
>>        > --
>>        > View this message in context:
>>        > http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4667574.html
>>        > Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>> 
>> 
>> 
>>      --
>>      *Christian Posta*
>> 
>>      http://www.christianposta.com/blog
>>      twitter: @christianposta
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> mark_fenbers.vcf (360 bytes) <http://activemq.2283324.n4.nabble.com/attachment/4667583/0/mark_fenbers.vcf>
>> 
>> 
>> 
>> 
>> --
>> View this message in context:
>> http://activemq.2283324.n4.nabble.com/ActiveMQ-crashes-frequently-tp4305407p4667583.html
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>> 
> 
> 
> 
> -- 
> *Christian Posta*
> http://www.christianposta.com/blog
> twitter: @christianposta
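
For anyone collecting the details requested in this thread, the build-up of numbered db-NNN.log files that Mark describes can be tracked with a few lines of plain Java. This is only a rough sketch and not part of ActiveMQ: the class name JournalFileCount and the default directory data/kahadb are assumptions made for illustration, and the "already removed" figure assumes the numbering started at 1, which is the same arithmetic Mark used (451 - 191 = 260).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an ActiveMQ class.
public class JournalFileCount {
    // Journal files named db-<number>.log, as described in the thread.
    private static final Pattern JOURNAL = Pattern.compile("db-(\\d+)\\.log");

    /** Returns the sorted sequence numbers of the db-N.log files directly inside dir. */
    static List<Integer> journalNumbers(Path dir) throws IOException {
        List<Integer> numbers = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "db-*.log")) {
            for (Path p : stream) {
                Matcher m = JOURNAL.matcher(p.getFileName().toString());
                if (m.matches()) { // skips names such as db-old.log
                    numbers.add(Integer.parseInt(m.group(1)));
                }
            }
        }
        Collections.sort(numbers);
        return numbers;
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass the real journal directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : "data/kahadb");
        List<Integer> numbers = journalNumbers(dir);
        if (numbers.isEmpty()) {
            System.out.println("No db-N.log files found in " + dir);
            return;
        }
        int highest = numbers.get(numbers.size() - 1);
        System.out.println("Journal files present: " + numbers.size());
        System.out.println("Lowest / highest number: " + numbers.get(0) + " / " + highest);
        // Estimate only: assumes the numbering started at 1.
        System.out.println("Already removed (estimate): " + (highest - numbers.size()));
    }
}
```

Running it at regular intervals (for example once an hour) and keeping the output would show whether the number of files left behind really grows faster over time while the message rate stays constant.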

