activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jk@penguinsfan.com" ...@penguinsfan.com>
Subject Re: Statements.java, etc. Performance
Date Sun, 19 Nov 2006 13:50:41 GMT

                   
rajdavies wrote:
>   so the discussion is  probably better on the dev list. 

Apologies, I accidentally posted to the wrong one.

                       
rajdavies wrote:
>  Now this is  functionality that is due for official release in 4.2 

Okay, so now I understand why I couldn't find this stuff in the
documentation....

                          
rajdavies wrote:
>  You could try the Kaha store 

Kaha store has no transaction log to protect against power failure, or did I
miss something when I looked at its code?

I was also experiencing some exceptions with Kaha (but I wasn't using the
latest in trunk so maybe I'll try again).

             
rajdavies wrote:
>  Messages exist in the table  - after they have been deleted by a 
> subscriber - there is one message table shared by many subscribers. 

Does this really matter?  It looks like the only thing that the queries in
question do is make sure that the subscriber row exists in the ack table. 
They don't look at the ack sequeincing ID or anything like that.  So, what
would be the harm if the join were omitted?

BTW, any ideas on how to make things work with Derby?  I think I'd like to
use it instead of say MySQL.

Jason


rajdavies wrote:
> 
> You only hit this if you  are using durable cursors for durable  
> subscribers - with the default JDBC database. Now this is  
> functionality that is due for official release in 4.2 (though it does  
> exist in 4.0.2) - and not documented any where - so the discussion is  
> probably better on the dev list. Other answers inline:
> 
> 
> On 19 Nov 2006, at 02:53, jk@penguinsfan.com wrote:
> 
>>
>> I am trying to use persistent storage with a large number of  
>> messages and I
>> think I'm running into trouble with the JDBC statements.
>>
>> In particular, I'm trying to use Derby, but I think there are some  
>> general
>> performance problems with the SQL statements.
>>
>> 1.  Many of the SELECT statements actually return many rows.
>>     2a.  Right before the JDBC call is made, setMaxRows() should be  
>> called
>> on the statement with the maximum number of rows to be captured.
> setMaxRows is a hint - and as you mention later on down the post, has  
> no affect on Derby - but please go ahead and raise a jira
> 
>>     2b.  A "TOP" or "LIMIT" clause should be a part of these SQL  
>> SELECT
>> statements.  Otherwise, the database can attempt to sort a huge  
>> number of
>> rows since these also contain an ORDER BY.  Unfortunately, there is no
>> standard syntax for this -- it varies depending on the RDBMS.  The  
>> current
>> implementation stops after reading x number rows, but it is too  
>> late, the
>> database may have already done a lot of work (demonstrated below).   
>> For some
>> RDBMS, the numeric argument to TOP/LIMIT may be parameterized,  
>> other times
>> it must be put directly in the SQL string.
> Great - you can raise a jira or submit a patch - all are welcome. We  
> don't always have access to try out Oracle or databases. However,  
> there is scope for using specific select statements for specific  
> databases.
>>
>> I think the statements affected are:
>>    getFindDurableSubMessagesStatement
>>    getNextDurableSubscriberMessageStatement
>>    getPrevDurableSubscriberMessageIdStatement
>>    getNextDurableSubscriberMessageIdStatement
> 
> Some of these statements are not actually used anywhere - like  
> getNextDurableSubscriberMessageStatement and  
> getPrevDurableSubscriberMessageIdStatement...
> 
>> -- Note:  The next and prev statements can probably be rewritten in  
>> a way
>> that uses MIN or MAX to avoid needing to use TOP/LIMIT.  The most  
>> problems
>> come with the getFindDurableSubMessagesStatement.
>>
>> 2.  Why do many of these statements bother to join to the ACKS  
>> table?  Some
>> of them do not obtain any information from these tables regarding  
>> which
>> messages are to be retrieved (maybe this is needed to make sure an  
>> entry
>> exists there, but can't this be figured out just by the IDs/messages?)
> Messages exist in the table  - after they have been deleted by a  
> subscriber - there is one message table shared by many subscribers.
>>
>> 3. I believe the index on (MSGID_PROD,MSGID_SEQ) can be declared as  
>> a UNIQUE
>> INDEX.  This helps out some optimizers, not sure how much it would  
>> affect
>> ActiveMQ.
>>
>> 4. The prepared statements probably should be cached/pooled (in  
>> whatever way
>> is thread-safe), but I think this is the least of the problems.
>>
>> Really bad performance due to this was observed with Derby. Happens  
>> even
>> when I submit the commands directly to Derby, bypassing ActiveMQ.  The
>> setMaxRows did not fix it, and Derby does not provide any form of  
>> TOP or
>> LIMIT.  :(  Tried against the latest release of Derby and no luck  
>> there (in
>> fact, I was getting out of memory error manually issuing the SQL  
>> statements
>> to the database).
>>
>> I am wondering as to what kind of persistence people here  
>> recommend?  I may
>> need to persist millions of records.
> 
> You could try the Kaha store -
> 
>>
>> *** Relevant part of the optimizer log (notice that Rows Visited =  
>> 100299,
>> even though only 100 are desired).
>> 		Right result set:
>> 			Table Scan ResultSet for ACTIVEMQ_MSGS at read committed  
>> isolation level
>> using instantaneous share row locking chosen by the optimizer
>> 			Number of opens = 1
>> 			Rows seen = 100287
>> 			Rows filtered = 0
>> 			Fetch Size = 16
>> 				constructor time (milliseconds) = 0
>> 				open time (milliseconds) = 0
>> 				next time (milliseconds) = 0
>> 				close time (milliseconds) = 0
>> 				next time in milliseconds/row = 0
>>
>> 			scan information:
>> 				Bit set of columns fetched={0, 1, 5}
>> 				Number of columns fetched=3
>> 				Number of pages visited=4777
>> 				Number of rows qualified=100287
>> 				Number of rows visited=100299
>> 				Scan type=heap
>> 				start position:
>> null				stop position:
>> null				qualifiers:
>>
>> -- 
>> View this message in context: http://www.nabble.com/Statements.java% 
>> 2C-etc.-Performance-tf2662372.html#a7425760
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>
> 
> 
> 


rajdavies wrote:
> 
> You only hit this if you  are using durable cursors for durable  
> subscribers - with the default JDBC database. Now this is  
> functionality that is due for official release in 4.2 (though it does  
> exist in 4.0.2) - and not documented any where - so the discussion is  
> probably better on the dev list. Other answers inline:
> 
> 
> On 19 Nov 2006, at 02:53, jk@penguinsfan.com wrote:
> 
>>
>> I am trying to use persistent storage with a large number of  
>> messages and I
>> think I'm running into trouble with the JDBC statements.
>>
>> In particular, I'm trying to use Derby, but I think there are some  
>> general
>> performance problems with the SQL statements.
>>
>> 1.  Many of the SELECT statements actually return many rows.
>>     2a.  Right before the JDBC call is made, setMaxRows() should be  
>> called
>> on the statement with the maximum number of rows to be captured.
> setMaxRows is a hint - and as you mention later on down the post, has  
> no affect on Derby - but please go ahead and raise a jira
> 
>>     2b.  A "TOP" or "LIMIT" clause should be a part of these SQL  
>> SELECT
>> statements.  Otherwise, the database can attempt to sort a huge  
>> number of
>> rows since these also contain an ORDER BY.  Unfortunately, there is no
>> standard syntax for this -- it varies depending on the RDBMS.  The  
>> current
>> implementation stops after reading x number rows, but it is too  
>> late, the
>> database may have already done a lot of work (demonstrated below).   
>> For some
>> RDBMS, the numeric argument to TOP/LIMIT may be parameterized,  
>> other times
>> it must be put directly in the SQL string.
> Great - you can raise a jira or submit a patch - all are welcome. We  
> don't always have access to try out Oracle or databases. However,  
> there is scope for using specific select statements for specific  
> databases.
>>
>> I think the statements affected are:
>>    getFindDurableSubMessagesStatement
>>    getNextDurableSubscriberMessageStatement
>>    getPrevDurableSubscriberMessageIdStatement
>>    getNextDurableSubscriberMessageIdStatement
> 
> Some of these statements are not actually used anywhere - like  
> getNextDurableSubscriberMessageStatement and  
> getPrevDurableSubscriberMessageIdStatement...
> 
>> -- Note:  The next and prev statements can probably be rewritten in  
>> a way
>> that uses MIN or MAX to avoid needing to use TOP/LIMIT.  The most  
>> problems
>> come with the getFindDurableSubMessagesStatement.
>>
>> 2.  Why do many of these statements bother to join to the ACKS  
>> table?  Some
>> of them do not obtain any information from these tables regarding  
>> which
>> messages are to be retrieved (maybe this is needed to make sure an  
>> entry
>> exists there, but can't this be figured out just by the IDs/messages?)
> Messages exist in the table  - after they have been deleted by a  
> subscriber - there is one message table shared by many subscribers.
>>
>> 3. I believe the index on (MSGID_PROD,MSGID_SEQ) can be declared as  
>> a UNIQUE
>> INDEX.  This helps out some optimizers, not sure how much it would  
>> affect
>> ActiveMQ.
>>
>> 4. The prepared statements probably should be cached/pooled (in  
>> whatever way
>> is thread-safe), but I think this is the least of the problems.
>>
>> Really bad performance due to this was observed with Derby. Happens  
>> even
>> when I submit the commands directly to Derby, bypassing ActiveMQ.  The
>> setMaxRows did not fix it, and Derby does not provide any form of  
>> TOP or
>> LIMIT.  :(  Tried against the latest release of Derby and no luck  
>> there (in
>> fact, I was getting out of memory error manually issuing the SQL  
>> statements
>> to the database).
>>
>> I am wondering as to what kind of persistence people here  
>> recommend?  I may
>> need to persist millions of records.
> 
> You could try the Kaha store -
> 
>>
>> *** Relevant part of the optimizer log (notice that Rows Visited =  
>> 100299,
>> even though only 100 are desired).
>> 		Right result set:
>> 			Table Scan ResultSet for ACTIVEMQ_MSGS at read committed  
>> isolation level
>> using instantaneous share row locking chosen by the optimizer
>> 			Number of opens = 1
>> 			Rows seen = 100287
>> 			Rows filtered = 0
>> 			Fetch Size = 16
>> 				constructor time (milliseconds) = 0
>> 				open time (milliseconds) = 0
>> 				next time (milliseconds) = 0
>> 				close time (milliseconds) = 0
>> 				next time in milliseconds/row = 0
>>
>> 			scan information:
>> 				Bit set of columns fetched={0, 1, 5}
>> 				Number of columns fetched=3
>> 				Number of pages visited=4777
>> 				Number of rows qualified=100287
>> 				Number of rows visited=100299
>> 				Scan type=heap
>> 				start position:
>> null				stop position:
>> null				qualifiers:
>>
>> -- 
>> View this message in context: http://www.nabble.com/Statements.java% 
>> 2C-etc.-Performance-tf2662372.html#a7425760
>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Re%3A-Statements.java%2C-etc.-Performance-tf2662998.html#a7429191
Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.


Mime
View raw message