activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rob Davies <rajdav...@gmail.com>
Subject Re: IndexOutOfBoundsException in AMQMessageStore using 20071028.173739
Date Fri, 02 Nov 2007 22:10:23 GMT
I wonder if you could take the latest again?


cheers,

Rob

http://rajdavies.blogspot.com/



On Nov 2, 2007, at 7:26 AM, Andrew Miller wrote:

>
>
> Rob,
>
> Yeah, in general, I've been deleting the entire local store  
> directory in
> between tests, since we're just testing it is not critical that the  
> messages
> are persisted. (with the message queue down of course)   
> Interestingly, I
> restarted the broker to try to fix this problem.  It marked the  
> store as bad
> and regenerated the index from the journal (about an hour+ with  
> ~1.5-2m
> messages).
>
> The consumers started working again at that point, for a while.   
> After it
> reached ~7200 dispatched messages, it really blew up.  I got this  
> in the
> wrapper.log:
>
> INFO   | jvm 1    | 2007/11/01 20:30:09 |  
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.MessageIdMarshaller.writePayload 
> (MessageIdMarshaller.java:38)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.MessageIdMarshaller.writePayload 
> (MessageIdMarshaller.java:29)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashEntry.write 
> (HashEntry.java:93)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashPage.write 
> (HashPage.java:123)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashIndex.writeFullPage 
> (HashIndex.java:350)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashPageInfo.end 
> (HashPageInfo.java:98)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashBin.end(HashBin.java:299)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashBin.put(HashBin.java:153)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.index.hash.HashIndex.store 
> (HashIndex.java:241)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.kaha.impl.container.MapContainerImpl.put 
> (MapContainerImpl.java:303)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.store.kahadaptor.KahaReferenceStore.addMessageRefe 
> rence(KahaReferenceStore.java:113)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.store.amq.AMQMessageStore$4.execute 
> (AMQMessageStore.java:341)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.util.TransactionTemplate.run 
> (TransactionTemplate.java:44)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.store.amq.AMQMessageStore.doAsyncWrite 
> (AMQMessageStore.java:330)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.store.amq.AMQMessageStore.asyncWrite 
> (AMQMessageStore.java:300)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.store.amq.AMQMessageStore$1.iterate 
> (AMQMessageStore.java:92)
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.thread.DedicatedTaskRunner.runTask 
> (DedicatedTaskRunner.java:9
> INFO   | jvm 1    | 2007/11/01 20:30:09 |       at
> org.apache.activemq.thread.DedicatedTaskRunner$1.run 
> (DedicatedTaskRunner.java:36)
>
> After that stack trace, I start getting a lot more  
> NullPointerExceptions in
> the logs, however, they don't have a corresponding stack trace:
>
> INFO   | jvm 1    | 2007/11/01 20:30:45 | WARN  AMQMessageStore
> - Message could not be added to long term store: null
> INFO   | jvm 1    | 2007/11/01 20:30:45 |  
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:30:45 | WARN  AMQMessageStore
> - Message could not be added to long term store: null
> INFO   | jvm 1    | 2007/11/01 20:30:45 |  
> java.lang.NullPointerException
>
> Then, a different kind of exception, with stack trace:
>
> INFO   | jvm 1    | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
> - Failed to fill batch
> INFO   | jvm 1    | 2007/11/01 20:31:20 |  
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:31:20 | ERROR IG
> - Failed to page in more queue messages
> INFO   | jvm 1    | 2007/11/01 20:31:20 | java.lang.RuntimeException:
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.cursors.QueueStorePrefetch.hasNext 
> (QueueStorePrefetch.java:112)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.cursors.StoreQueueCursor.hasNext 
> (StoreQueueCursor.java:129)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.Queue.buildList(Queue.java:1022)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1005)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java: 
> 1068)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.broker.region.Queue.iterate(Queue.java:941)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.thread.DedicatedTaskRunner.runTask 
> (DedicatedTaskRunner.java:98)
> INFO   | jvm 1    | 2007/11/01 20:31:20 |       at
> org.apache.activemq.thread.DedicatedTaskRunner$1.run 
> (DedicatedTaskRunner.java:36)
> INFO   | jvm 1    | 2007/11/01 20:31:20 | Caused by:
> java.lang.NullPointerException
>
> A couple other stack traces that look a lot like that, and then  
> tons of:
>
> INFO   | jvm 1    | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
> - Failed to fill batch
> INFO   | jvm 1    | 2007/11/01 20:31:20 |  
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
> - Failed to fill batch
> INFO   | jvm 1    | 2007/11/01 20:31:20 |  
> java.lang.NullPointerException
> INFO   | jvm 1    | 2007/11/01 20:31:20 | ERROR QueueStorePrefetch
> - Failed to fill batch
>
>
> (Sorry, I haven't searched JIRA/Google for this one yet so this may  
> be a
> known issue)
>
> I reran most of the test with the journaledJDBC connector but it's not
> nearly as fast as the local store so I'm hoping to get this  
> working.  :-)
> I've made the indexBinSize change you've recommended and I'll re- 
> run the
> tests tomorrow.  BTW thanks for putting together that amq-message- 
> store page
> on the wiki. Thanks for the assistance!
>
> -Andy
>
>
>
>
>
>
>
>
>
> rajdavies wrote:
>>
>> Hi andy,
>>
>> did you start with a completely new AMQ Store data directory ? - You
>> could delete the references store only - e.g. the Recovery section in
>> http://activemq.apache.org/amq-message-store.html
>> Also - as you are storing a few million messages - you might want to
>> change the number of indexBinSize - the default number of has bins -
>> parameter to something large - like 16000 (the default is 1024).
>>
>> I'l see if I can reproduce your latest problem with the
>> RecoveryListenerAdaptor.
>>
>> cheers,
>>
>> Rob
>>
>> http://rajdavies.blogspot.com/
>>
>>
>>
>> On Nov 2, 2007, at 2:16 AM, Andrew Miller wrote:
>>
>>>
>>>
>>> Thanks for the fix!  We're running the test now...  Parameters:
>>>
>>> SVN version: 591163
>>> Client jar: Updated to 5.0-20071028.173739
>>>
>>> We haven't seen any of the original IndexOutOfBounds exception
>>> messages,
>>> even after about 1.2 million Enqueues.   However, the consumers
>>> appear to be
>>> stuck and unable to consume.     We had a RecoveryListenerAdapter
>>> message
>>> show up in the log, though I'm not sure this correlates with the
>>> inability
>>> of the consumers to consume these messages.
>>>
>>> 2007-11-01 18:10:42,645 [68.194.21:40095] ERROR
>>> RecoveryListenerAdapter
>>> - Message id ID:rlab-0007.xxxx.com-48174-119396578373
>>> 6-0:2:1:1:11 could not be recovered from the data store!
>>>
>>> I even wrote a little perl client that does nothing but subscribe
>>> to this
>>> queue, consumes one message, and exits.  The script just blocks and
>>> then
>>> times out, even though there are many messages in the queue:
>>>
>>> rlab-0007:/opt/activemq/bin# ./query -QQueue=IG
>>> ACTIVEMQ_HOME: /opt/activemq
>>> ACTIVEMQ_BASE: /opt/activemq
>>> Type = Queue
>>> DispatchCount = 213
>>> Destination = IG
>>> MaxEnqueueTime = 1371
>>> QueueSize = 926441
>>> Name = IG
>>> DequeueCount = 213
>>> MemoryPercentageUsed = 0
>>> ConsumerCount = 19
>>> MemoryLimit = 30198988
>>> EnqueueCount = 1190424
>>> MinEnqueueTime = 1
>>> AverageEnqueueTime = 196.80751173708921
>>>
>>> Here's the one-off perl script and the output: (I assume there's no
>>> reason I
>>> would be unable to consume a message sent via JMS in stomp/perl?)
>>> (redacting script since my previous mail on this subject got
>>> rejected for
>>> being too spammy - suffice to say it simply subscribes to the  
>>> queue in
>>> question and does a stomp->receive_frame())
>>>
>>> amiller@mirror-01:/opt/amiller$ ./c1.pl
>>> connected
>>> could not receive message after 20 seconds at ./c1.pl line 17,
>>> <GEN0> line
>>> 4.
>>>
>>> We also had this problem with the 20071031 snapshot, but at that
>>> time I
>>> thought it was because we were using the 4.1.1 client jars.  BTW
>>> even though
>>> we didn't see this issue in 20071002, we needed to upgrade to solve
>>> AMQ-1479.
>>>
>>> Thanks again!
>>> -Andy
>>>
>>>
>>>
>>>
>>> rajdavies wrote:
>>>>
>>>> There's a potential fix for this I've just added to trunk -  
>>>> would it
>>>> be possible to build the latest and test ?
>>>>
>>>>
>>>> cheers,
>>>>
>>>> Rob
>>>>
>>>> http://rajdavies.blogspot.com/
>>>>
>>>>
>>>>
>>>> On Nov 1, 2007, at 9:47 PM, Andrew Miller wrote:
>>>>
>>>>>
>>>>> We've been using ActiveMQ in an infrastructure refresh project,  
>>>>> and
>>>>> have run
>>>>> into a nasty looking bug, it looks like we are dropping
>>>>> messages.    We have
>>>>> a few fast producers that will produce approximately 4-5 million
>>>>> persistent
>>>>> messages over the course of about 2.5 hours, while many slower
>>>>> consumers
>>>>> will consume these messages over the course of a day or so.
>>>>>
>>>>> We are using Jencks/Spring for the producers and consumers, using
>>>>> the
>>>>> openwire protocol.  For this test, we were using the AMQ 4.1.1
>>>>> client jars,
>>>>> if that matters.  (We will try updating to the new AMQ5 client  
>>>>> jars
>>>>> today)
>>>>>
>>>>> About halfway through the production run, we started getting
>>>>> IndexOutOfBounds exceptions in the log pretty frequently.   
>>>>> Later on
>>>>> the
>>>>> consumer tries to get these messages but appears to fail.
>>>>
>>>> (exception messages trimmed)
>>>>
>>>>
>>>
>>> -- 
>>> View this message in context: http://www.nabble.com/
>>> IndexOutOfBoundsException-in-AMQMessageStore-using-20071028.173739-
>>> tf4734264s2354.html#a13541167
>>> Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
>>>
>>
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/ 
> IndexOutOfBoundsException-in-AMQMessageStore-using-20071028.173739- 
> tf4734264s2354.html#a13543207
> Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
>


Mime
View raw message