activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: Both instances of ActiveMQ connected to kahadb after network outage
Date Thu, 05 Apr 2018 03:58:53 GMT
On Wed, Apr 4, 2018 at 9:18 AM, gbrown <gbrown@mediaocean.com> wrote:

> We had a short outage on the network and once this came back both
> instances in our master / slave setup were up and connectable. Once this
> was discovered when messages on queues were not browsable or able to be
> consumed the instances were restarted after renaming the db.data file as
> other methods to start (persistenceAdapter options) would not work.
>
> Once started the messages on the queues were gone so probably lost.
>
> We use an nfs4 mount point.
>
> ActiveMQ Version is 5.11.1
>
> so can anyone help with
>
> 1. How is it possible that both master and slave connected to the KahaDB?
>


It sure sounds like your NFS setup isn't actually enforcing the exclusive
lock on the shared store, even though it's an NFSv4 mount.
http://activemq.2283324.n4.nabble.com/Unreliable-NFS-exclusive-locks-on-unreliable-networks-td4737992.html
has some discussion of the NFS mount options that some other users are
using, but nobody has built a consensus around "these settings work and
these other ones don't", so all you have to go on at the moment are those
reports from other users. Whatever settings you try, plan on thorough
testing, given that you've just demonstrated that your current settings
appeared to work but actually didn't. If you're able to tell us which
settings end up fixing the problem, maybe we can establish enough of a
consensus among the community to consider documenting recommended values
on the wiki.
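
For that testing, something like the following might be a starting point.
It is only a rough sketch and not ActiveMQ's own locker: it assumes the
broker's lock behaves like an ordinary POSIX advisory lock on a file in the
shared directory, and the function name try_exclusive_lock and whatever
path you pass in are my own illustrative choices.

```python
import errno
import fcntl


def try_exclusive_lock(path):
    """Try to take a non-blocking exclusive POSIX lock on `path`.

    Returns the open file object (keep it open to keep holding the lock),
    or None if some other process already holds a conflicting lock.
    """
    fh = open(path, "a+b")  # create the file if missing, never truncate it
    try:
        fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        fh.close()
        # EACCES/EAGAIN mean "someone else has it"; anything else is a
        # real error (for example the server refusing locks) and is raised.
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None
        raise
    return fh
```

Call it on one host against a scratch file on the NFS mount and keep the
returned handle open, then call it from the second host: the second call
should return None. The interesting case is repeating that while you
interrupt the network between the first host and the NFS server, since
that is the situation that failed for you.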



> 2. Is there any way I could have recovered that would have kept the
> messages on the queues?
>


db.data is the index, and is simply cached information derived from the
actual journal files. It can be safely deleted without data loss, because
it will simply be rebuilt from the journal files. If all you deleted was
that one file (which is what it sounds like) and you ended up not having
messages upon restart, it means they had already been deleted from the
journal files, and there wasn't anything you could have done to avoid
losing the messages. If on the other hand you deleted *.log files in
addition to db.data, then you could have avoided losing your messages by
not deleting those journal files (*.log). I think from what you wrote that
the message loss was unavoidable, unless your description of which files
you deleted was incomplete.
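
If you ever need to force an index rebuild again, a cautious way to do it
is to move db.data aside rather than delete it, and to leave every *.log
file where it is. A minimal sketch, assuming the broker is stopped and
that kahadb_dir points at your KahaDB directory; the function name
set_aside_index and the backup naming are made up for illustration:

```python
import glob
import os
import time


def set_aside_index(kahadb_dir):
    """Rename db.data to a timestamped backup and leave the journals alone.

    Returns (backup_path_or_None, list_of_journal_files_found).
    """
    index = os.path.join(kahadb_dir, "db.data")
    # The journal files (*.log) hold the actual messages; only list them.
    journals = sorted(glob.glob(os.path.join(kahadb_dir, "*.log")))
    backup = None
    if os.path.exists(index):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        backup = index + "." + stamp + ".bak"
        os.rename(index, backup)  # move aside, nothing is deleted
    return backup, journals
```

If the returned journal list is empty, stop and investigate before
starting the broker, because there is nothing left to rebuild the index
from.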

Tim
