activemq-dev mailing list archives

From "Pablo Lozano (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AMQ-5300) Inifinite loop when attempting to replay levelDB logs to rebuild index
Date Fri, 30 Jan 2015 04:54:35 GMT

    [ https://issues.apache.org/jira/browse/AMQ-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298185#comment-14298185 ]

Pablo Lozano edited comment on AMQ-5300 at 1/30/15 4:53 AM:
------------------------------------------------------------

Hi, good day,

It seems this issue is not fixed, as I have been able to reproduce it with Replicated LevelDB
on the latest 5.11 snapshot. The LevelDB store gets corrupted to the point that every time I kill
the current master, the next slave to take over as master goes into the infinite loop. The only
way to fix this is to delete the LevelDB data from all ActiveMQ instances, which obviously
leaves me without messages. I find this issue quite critical, as it occurs even on graceful
shutdowns of ActiveMQ.

I have attached a copy of the logs and my LevelDB directory. (If all messages on the queue
look the same, it is because they are: for testing purposes I send the same message over and over.)
https://drive.google.com/file/d/0B6ANh1aTzRg3S2Q2SVRUY0ZhT0k/view?usp=sharing


My settings are:
Ubuntu 14.04 64bit
leveldb jni linux-x64

            <replicatedLevelDB
                    directory="${mailSystem.activeMQ.rebDB}"
                    replicas="3"
                    sync="local_mem"
                    logSize="25413000"
                    indexCompression="none"
                    zkAddress="lstkmy90430:2181,lstkmy36606:2181,lstkmy52108:2181"
                    zkPath="/activemq/leveldb-stores"
                    />




was (Author: altaflux):
Hi, good day,

It seems this issue is not fixed, as I have been able to reproduce it with Replicated LevelDB
on the latest 5.11 snapshot. The LevelDB store gets corrupted to the point that every time I kill
the current master, the next slave to take over as master goes into the infinite loop. The only
way to fix this is to delete the LevelDB data from all ActiveMQ instances, which obviously
leaves me without messages. I find this issue quite critical, as it occurs even on graceful
shutdowns of ActiveMQ.

I have attached a copy of the logs and my LevelDB directory. (If all messages on the queue
look the same, it is because they are: for testing purposes I send the same message over and over.)
https://drive.google.com/file/d/0B6ANh1aTzRg3S2Q2SVRUY0ZhT0k/view?usp=sharing


My settings are:


            <replicatedLevelDB
                    directory="${mailSystem.activeMQ.rebDB}"
                    replicas="3"
                    sync="local_mem"
                    logSize="25413000"
                    indexCompression="none"
                    zkAddress="lstkmy90430:2181,lstkmy36606:2181,lstkmy52108:2181"
                    zkPath="/activemq/leveldb-stores"
                    />



> Inifinite loop when attempting to replay levelDB logs to rebuild index
> ----------------------------------------------------------------------
>
>                 Key: AMQ-5300
>                 URL: https://issues.apache.org/jira/browse/AMQ-5300
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: activemq-leveldb-store
>    Affects Versions: 5.9.1, 5.10.0
>         Environment: Linux
>            Reporter: Vu Le
>            Assignee: Gary Tully
>             Fix For: 5.11.0
>
>
> While searching for a workaround for issue AMQ-5284, I came across this issue.
> To work around the serialization issue (AMQ-5284), I deleted the index snapshots from
> the LevelDB datastore. This will replay the logs to regenerate the index. However, if a log
> rotation has already occurred, you will get an infinite loop upon restart.
> Here are the steps to reproduce what I am seeing:
> Configure ActiveMQ 5.10.0 to use a LevelDB data store with the log size of about 1MB.
> {code}
> <persistenceAdapter>
>     <levelDB directory="${activemq.data}/leveldb" logSize="1000000" />
> </persistenceAdapter>
> {code}
> Then I started up the broker and published 10,000 persistent messages to a queue, causing
> the log files to rotate (twice in my case). I see the following files in the data store folder:
> {code}
> -rw-rw-r--. 1 user users 1000071 Jul 30 11:15 0000000000000000.log
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:16 00000000001e84d0.index
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:17 00000000001e84d0.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> -rw-rw-r--. 1 user users       0 Jul 30 11:11 lock
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
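
An aside on the listing above: the log file names look like the 16-digit hexadecimal start position of each log within the overall record stream, since each name equals the previous name plus the previous file's size. This is an inference from the listing and from the keys in the `log_infos` warning further down, not something confirmed against the LevelDB store source. A small Python check, with the data copied from the listing and a helper name made up for illustration:

```python
def start_position(file_name: str) -> int:
    """Interpret the hex stem of a name such as '00000000000f4287.log' as an integer."""
    stem = file_name.split(".", 1)[0]
    return int(stem, 16)

# (file name, size in bytes) copied from the directory listing above
listing = [
    ("0000000000000000.log", 1000071),
    ("00000000000f4287.log", 1000009),
    ("00000000001e84d0.log", 1000000),
]

# Assumption being checked: next start == this start + this size
for (name, size), (next_name, _) in zip(listing, listing[1:]):
    assert start_position(name) + size == start_position(next_name)

positions = [start_position(name) for name, _ in listing]
print(positions)  # [0, 1000071, 2000080]
```

Under that reading, deleting `0000000000000000.log` removes the only file that covers positions 0 through 1000070.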
> I then consume 5,000 messages, which causes the first log to be deleted since it is no
> longer being referenced. I see the following log statements:
> {code}
> 2014-07-30 11:29:14,960 | DEBUG | Log no longer referenced: 0 | org.apache.activemq.leveldb.LevelDBClient | Thread-2
> 2014-07-30 11:29:14,967 | DEBUG | Deleting log at 0 | org.apache.activemq.leveldb.LevelDBClient | Thread-2
> {code}
> And I see the remaining files in the data store folder (notice the 0000000000000000.log
> is gone):
> {code}
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> -rw-rw-r--. 1 user users 1000011 Jul 30 11:29 00000000001e84d0.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:29 00000000002dc71b.index
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:29 00000000002dc71b.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> -rw-rw-r--. 1 user users       0 Jul 30 11:11 lock
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
> At this point, I shut down the broker and here is the listing of what's left in the data
> store:
> {code}
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> -rw-rw-r--. 1 user users 1000011 Jul 30 11:29 00000000001e84d0.log
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:29 00000000002dc71b.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:36 0000000000301737.index
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
> I then delete the index folder within the data store (in my case "0000000000301737.index").
> I am doing this to force a replay of the logs to regenerate the index (due to the serialization
> issue I ran into).
> And finally, this is the message I am getting once I start the broker back up (infinite
> loop of this same message, and I have to shut down the broker):
> {code}
> 2014-07-30 11:40:27,415 | WARN  | No reader available for position: 0, log_infos: {1000071=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000000f4287.log,1000071,1000009), 2000080=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000001e84d0.log,2000080,1000011), 3000091=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000002dc71b.log,3000091,0)} | org.apache.activemq.leveldb.RecordLog | main
> {code}
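
One way to read the warning above: if the reader for a position is found by picking the log with the greatest start position at or below the requested one, then position 0 has no match once the log starting at 0 has been deleted, because the smallest remaining start is 1000071. The Python sketch below illustrates only that lookup. The helper name and the floor-lookup strategy are assumptions for illustration and are not taken from the `RecordLog` source; the start positions are copied from `log_infos` in the warning.

```python
from bisect import bisect_right
from typing import List, Optional

# Start positions of the remaining logs, copied from log_infos in the warning
log_starts = [1000071, 2000080, 3000091]

def find_log_start(position: int, starts: List[int]) -> Optional[int]:
    """Hypothetical helper: greatest start <= position, or None if no log covers it."""
    i = bisect_right(starts, position)
    return starts[i - 1] if i > 0 else None

print(find_log_start(0, log_starts))        # None
print(find_log_start(1500000, log_starts))  # 1000071
```

A replay that begins at position 0 and retries without advancing when no log is found would repeat the same warning indefinitely. That would match the reported symptom, but whether it is the actual cause is not established here.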



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
