activemq-users mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject massive data loss after 5.10 + leveldb restart?
Date Thu, 18 Dec 2014 20:45:04 GMT
I’m trying to track down a bunch of concerning production bugs in ActiveMQ
that might be related.

The biggest problem I’m seeing is that on restart, it seems that we’re
losing a LARGE percentage of our messages.

We had about 7k messages in a queue, and on restart, it went down to almost zero.

I’m getting LevelDB messages like:

2014-12-18 14:29:36,024 | WARN  | No reader available for position:
eac7b1cad, log_infos:
{81371951320=LogInfo(/var/lib/apache-activemq/leveldb/00000012f22570d8.log,81371951320,104858168),
81686526928=LogInfo(/var/lib/apache- … } |
org.apache.activemq.leveldb.RecordLog | ActiveMQ BrokerService[
util0041.wdc.sl.spinn3r.com] Task-3
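
One way to read this warning is to compare the requested position with the log range that is visible in the message. The sketch below is only an illustration and rests on two assumptions that the message does not confirm: that the position is printed in hexadecimal, and that the three LogInfo fields are file, start position, and length. The log_infos map is truncated above, so only its first entry is checked. The variable and function names are made up for this example.

```python
# Values copied from the WARN line above. The log_infos map is truncated
# there, so only its first entry can be checked here.
requested_hex = "eac7b1cad"          # ASSUMPTION: printed in hexadecimal
first_log_name = "00000012f22570d8"  # file name without the ".log" suffix
first_log_start = 81371951320        # ASSUMPTION: second LogInfo field = start
first_log_length = 104858168         # ASSUMPTION: third LogInfo field = length


def covers(start: int, length: int, position: int) -> bool:
    """Return True if position falls inside [start, start + length)."""
    return start <= position < start + length


requested = int(requested_hex, 16)

# Sanity check for the hex assumption: the file name, read as hex,
# should equal the decimal start position logged for the same file.
print("file name matches start:", int(first_log_name, 16) == first_log_start)

print("requested position (decimal):", requested)
print("covered by first listed log:",
      covers(first_log_start, first_log_length, requested))
print("bytes below first listed log:", first_log_start - requested)
```

Under these assumptions the file name does decode to the logged start position, and the requested position (63023291565) lies roughly 18.3 GB below the start of the first listed log file. If the map is printed in ascending order, that would mean the index is pointing at a record in a log file that is no longer present, but that is an inference from this one line, not something the log states.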

The other main problem I’m having is that JMX is telling me that I have
a LARGE number of messages in some of our queues, but they’re not being
processed.

It also appears that AMQ is reporting corrupt JMX values, because some of my
queues have *negative* sizes, which obviously makes no sense.

For example, right now it’s saying there are 15k messages in our dead
letter queue.  However, when I try to browse it, nothing is returned.

I’ve had this problem before, and the only resolution has been to
completely scrap our entire LevelDB database by stopping AMQ, removing the
directory, then starting it again.

Then I have to re-enqueue all of our messages.

This obviously isn’t scalable, and I need to track down why AMQ keeps
corrupting itself.

Kevin


—
Founder/CEO Spinn3r.com
Location: *San Francisco, CA*
blog: http://burtonator.wordpress.com
… or check out my Google+ profile
<https://plus.google.com/102718274791889610666/posts>
<http://spinn3r.com>
