activemq-users mailing list archives

From Kevin Burton <bur...@spinn3r.com>
Subject Re: massive data loss after 5.10 + leveldb restart?
Date Thu, 18 Dec 2014 20:57:00 GMT
I’m worried this is a severe LevelDB bug that’s been around for more than a
year without any fix…

https://issues.apache.org/jira/browse/AMQ-5300

seems to be related to log rolling and a restart. LevelDB ends up
corrupted if you restart it after the logs have rolled over.  At least that
would explain what I’m seeing.
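
To make the log-rolling theory easier to check, here is a rough Python sketch that
decodes the numbers in the "No reader available" warning quoted below. It rests on
assumptions I have not confirmed against the ActiveMQ source: that the position in
the warning is hexadecimal, and that LogInfo's fields are (file, start position,
length). The find_log helper is hypothetical, not ActiveMQ code. The one thing the
log line itself supports is that the file name 00000012f22570d8 is the start
position 81371951320 written in hex. Only the one log entry that is fully visible
in the warning is used.

```python
from typing import List, NamedTuple, Optional


class LogInfo(NamedTuple):
    path: str
    start: int   # assumed: byte offset at which this log file begins
    length: int  # assumed: number of bytes covered by this log file


def find_log(position: int, logs: List[LogInfo]) -> Optional[LogInfo]:
    """Hypothetical helper: return the log whose byte range contains position."""
    for log in logs:
        if log.start <= position < log.start + log.length:
            return log
    return None


# The only entry fully visible in the warning; the others are truncated there.
logs = [
    LogInfo(
        "/var/lib/apache-activemq/leveldb/00000012f22570d8.log",
        81371951320,
        104858168,
    )
]

# Assumption: the position in the warning is printed in hexadecimal.
position = int("eac7b1cad", 16)

print("requested position:", position)
print("file name as decimal:", int("00000012f22570d8", 16))
print("matching log:", find_log(position, logs))
```

If those assumptions hold, the requested position falls before the start of that
log file, so the record it points to would have lived in an older log file. That
is at least consistent with records being lost once the logs roll over, but it
does not prove it.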

On Thu, Dec 18, 2014 at 12:45 PM, Kevin Burton <burton@spinn3r.com> wrote:
>
> I’m trying to track down a bunch of concerning production bugs in ActiveMQ
> that might be related.
>
> The biggest problem I’m seeing is that on restart, it seems that we’re
> losing a LARGE percentage of our messages.
>
> We had about 7k in a queue, and on restart, it went down to almost zero.
>
> I’m getting LevelDB messages like:
>
> 2014-12-18 14:29:36,024 | WARN  | No reader available for position:
> eac7b1cad, log_infos:
> {81371951320=LogInfo(/var/lib/apache-activemq/leveldb/00000012f22570d8.log,81371951320,104858168),
> 81686526928=LogInfo(/var/lib/apache- … } |
> org.apache.activemq.leveldb.RecordLog | ActiveMQ BrokerService[
> util0041.wdc.sl.spinn3r.com] Task-3
>
> .. the other main problem I’m having is that JMX is telling me that I have
> a LARGE number of messages in some of our queues, but they’re not being
> processed.
>
> It also appears that AMQ is reporting corrupt JMX values, because some of my
> queues have *negative* sizes, which obviously makes no sense.
>
> For example, right now it’s saying there are 15k messages in our dead
> letter queue.  However, when I try to browse it, nothing is returned.
>
> I’ve had this problem before, and the only resolution has been to
> completely scrap our full LevelDB database by stopping AMQ, removing the
> directory, then starting it again.
>
> Then I have to re-enqueue all of our messages.
>
> This obviously isn’t scalable, and I need to track down why AMQ keeps
> corrupting itself.
>
> Kevin
>
>
> —
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
>
>

