activemq-dev mailing list archives

From: Rob Davies <rajdav...@gmail.com>
Subject: Re: Statements.java, etc. Performance
Date: Sun, 19 Nov 2006 17:39:26 GMT
Hi Jason,

On 19 Nov 2006, at 13:50, jk@penguinsfan.com wrote:

>
>
> rajdavies wrote:
>>   so the discussion is  probably better on the dev list.
>
> Apologies, I accidentally posted to the wrong one.
>
>
> rajdavies wrote:
>>  Now this is  functionality that is due for official release in 4.2
>
> Okay, so now I understand why I couldn't find this stuff in the
> documentation....
>
>
> rajdavies wrote:
>>  You could try the Kaha store
>
> Kaha store has no transaction log to protect against power failure,  
> or did I
> miss something when I looked at its code?

There's a combination of using Kaha and the journal - it's currently
still under development.
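For anyone who wants to experiment in the meantime, wiring a Kaha store
into an embedded broker should look roughly like this - note the adapter
class name and constructor below are my assumptions from memory, and since
the journalled combination is still under development the real API may
well differ (check trunk first):

    import java.io.File;
    import org.apache.activemq.broker.BrokerService;
    // package/class name assumed - verify against trunk:
    import org.apache.activemq.store.kahadaptor.KahaPersistenceAdapter;

    BrokerService broker = new BrokerService();
    // constructor signature assumed; the directory holds the Kaha data files
    broker.setPersistenceAdapter(new KahaPersistenceAdapter(new File("kaha-data")));
    broker.addConnector("tcp://localhost:61616");
    broker.start();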
>
> I was also experiencing some exceptions with Kaha (but I wasn't  
> using the
> latest in trunk so maybe I'll try again).
Some gremlins appear to have crept in there recently - not sure what's
changed - but I'm investigating.
>
>
> rajdavies wrote:
>>  Messages still exist in the table after they have been deleted by
>> one subscriber - there is one message table shared by many subscribers.
>
> Does this really matter?  It looks like the only thing that the  
> queries in
> question do is make sure that the subscriber row exists in the ack  
> table.
> They don't look at the ack sequencing ID or anything like that.
> So, what
> would be the harm if the join were omitted?

None - but I think those statements are no longer used - I'll remove
them or fix them if they are still required.
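For illustration, a join-free variant of the find statement might look
something like this - just a sketch, with table/column names assumed from
the default schema rather than taken from the actual Statements.java:

    // hypothetical join-free durable-sub select - no join to the ACKS
    // table, since the ack row's existence isn't used to pick messages:
    String findDurableSubMessages =
        "SELECT M.ID, M.MSG FROM ACTIVEMQ_MSGS M"
      + " WHERE M.CONTAINER = ? AND M.ID > ?"
      + " ORDER BY M.ID";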

>
> BTW, any ideas on how to make things work with Derby?  I think I'd  
> like to
> use it instead of say MySQL.

I'm afraid not - any Derby experts out there?
>
> Jason
>
>
> rajdavies wrote:
>>
>> You only hit this if you  are using durable cursors for durable
>> subscribers - with the default JDBC database. Now this is
>> functionality that is due for official release in 4.2 (though it does
>> exist in 4.0.2) - and not documented anywhere - so the discussion is
>> probably better on the dev list. Other answers inline:
>>
>>
>> On 19 Nov 2006, at 02:53, jk@penguinsfan.com wrote:
>>
>>>
>>> I am trying to use persistent storage with a large number of
>>> messages and I
>>> think I'm running into trouble with the JDBC statements.
>>>
>>> In particular, I'm trying to use Derby, but I think there are some
>>> general
>>> performance problems with the SQL statements.
>>>
>>> 1.  Many of the SELECT statements actually return many rows.
>>>     2a.  Right before the JDBC call is made, setMaxRows() should be
>>> called
>>> on the statement with the maximum number of rows to be captured.
>> setMaxRows is a hint - and as you mention later in the post, it has
>> no effect on Derby - but please go ahead and raise a jira
>>
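To illustrate 2a, the hint would just be applied immediately before
execution - a minimal JDBC sketch (the SQL and bind parameters here are
hypothetical):

    // java.sql.PreparedStatement / ResultSet
    // setMaxRows caps what the ResultSet will hand back, but it is only
    // a hint to the driver - Derby still scans and sorts underneath
    PreparedStatement s = connection.prepareStatement(sql);
    s.setMaxRows(100);            // only the first 100 rows are wanted
    s.setString(1, subscriberId); // hypothetical bind parameter
    ResultSet rs = s.executeQuery();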
>>>     2b.  A "TOP" or "LIMIT" clause should be a part of these SQL
>>> SELECT
>>> statements.  Otherwise, the database can attempt to sort a huge
>>> number of
>>> rows since these also contain an ORDER BY.  Unfortunately, there  
>>> is no
>>> standard syntax for this -- it varies depending on the RDBMS.  The
>>> current
>>> implementation stops after reading x number of rows, but it is too
>>> late, the
>>> database may have already done a lot of work (demonstrated below).
>>> For some
>>> RDBMS, the numeric argument to TOP/LIMIT may be parameterized,
>>> other times
>>> it must be put directly in the SQL string.
>> Great - you can raise a jira or submit a patch - all are welcome. We
>> don't always have access to try out Oracle or other databases. However,
>> there is scope for using specific select statements for specific
>> databases.
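As a rough sketch of what database-specific statements could look like -
the SQL text below is illustrative only, with assumed table/column names,
not the current Statements.java:

    // MySQL/PostgreSQL flavour - row cap in a LIMIT clause:
    String limitStyle = "SELECT M.ID, M.MSG FROM ACTIVEMQ_MSGS M"
                      + " WHERE M.CONTAINER = ? ORDER BY M.ID LIMIT 100";
    // SQL Server/Sybase flavour - TOP n; as noted above, some databases
    // only accept the count as a literal, not as a bind parameter:
    String topStyle   = "SELECT TOP 100 M.ID, M.MSG FROM ACTIVEMQ_MSGS M"
                      + " WHERE M.CONTAINER = ? ORDER BY M.ID";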
>>>
>>> I think the statements affected are:
>>>    getFindDurableSubMessagesStatement
>>>    getNextDurableSubscriberMessageStatement
>>>    getPrevDurableSubscriberMessageIdStatement
>>>    getNextDurableSubscriberMessageIdStatement
>>
>> Some of these statements are not actually used anywhere - like
>> getNextDurableSubscriberMessageStatement and
>> getPrevDurableSubscriberMessageIdStatement...
>>
>>> -- Note:  The next and prev statements can probably be rewritten in
>>> a way that uses MIN or MAX to avoid needing to use TOP/LIMIT.  Most
>>> of the problems come with the getFindDurableSubMessagesStatement.
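For example, a "next message id" lookup can collapse to a single row with
MIN instead of a row limit - a sketch with assumed column names:

    // the aggregate returns exactly one row, so no TOP/LIMIT is needed:
    String nextDurableSubMessageId =
        "SELECT MIN(M.ID) FROM ACTIVEMQ_MSGS M"
      + " WHERE M.CONTAINER = ? AND M.ID > ?";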
>>>
>>> 2.  Why do many of these statements bother to join to the ACKS
>>> table?  Some
>>> of them do not obtain any information from these tables regarding
>>> which
>>> messages are to be retrieved (maybe this is needed to make sure an
>>> entry
>>> exists there, but can't this be figured out just by the
>>> IDs/messages?)
>> Messages still exist in the table after they have been deleted by
>> one subscriber - there is one message table shared by many subscribers.
>>>
>>> 3. I believe the index on (MSGID_PROD,MSGID_SEQ) can be declared as
>>> a UNIQUE
>>> INDEX.  This helps out some optimizers, not sure how much it would
>>> affect
>>> ActiveMQ.
>>>
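Point 3 would be a one-line schema change, something like the following
(the index name here is made up):

    String ddl = "CREATE UNIQUE INDEX ACTIVEMQ_MSGS_MIDX"
               + " ON ACTIVEMQ_MSGS (MSGID_PROD, MSGID_SEQ)";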
>>> 4. The prepared statements probably should be cached/pooled (in
>>> whatever way
>>> is thread-safe), but I think this is the least of the problems.
>>>
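On point 4, any cache would have to be per-connection, since a
PreparedStatement is bound to the Connection that created it - a minimal
sketch (helper name hypothetical, not the ActiveMQ implementation):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.HashMap;
    import java.util.Map;

    // one cache per Connection - never share it across connections; the
    // thread-safety comes from not sharing, not from synchronization
    Map<String, PreparedStatement> cache =
        new HashMap<String, PreparedStatement>();

    PreparedStatement prepare(Connection c, String sql) throws SQLException {
        PreparedStatement s = cache.get(sql);
        if (s == null) {
            s = c.prepareStatement(sql);
            cache.put(sql, s);
        }
        return s;
    }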
>>> Really bad performance due to this was observed with Derby.  It
>>> happens even when I submit the commands directly to Derby, bypassing
>>> ActiveMQ.  The setMaxRows did not fix it, and Derby does not provide
>>> any form of TOP or LIMIT.  :(  Tried against the latest release of
>>> Derby and no luck there (in fact, I was getting an out-of-memory
>>> error manually issuing the SQL statements to the database).
>>>
>>> I am wondering as to what kind of persistence people here
>>> recommend?  I may
>>> need to persist millions of records.
>>
>> You could try the Kaha store -
>>
>>>
>>> *** Relevant part of the optimizer log (notice that Rows Visited =
>>> 100299,
>>> even though only 100 are desired).
>>> 		Right result set:
>>> 			Table Scan ResultSet for ACTIVEMQ_MSGS at read committed
>>> isolation level
>>> using instantaneous share row locking chosen by the optimizer
>>> 			Number of opens = 1
>>> 			Rows seen = 100287
>>> 			Rows filtered = 0
>>> 			Fetch Size = 16
>>> 				constructor time (milliseconds) = 0
>>> 				open time (milliseconds) = 0
>>> 				next time (milliseconds) = 0
>>> 				close time (milliseconds) = 0
>>> 				next time in milliseconds/row = 0
>>>
>>> 			scan information:
>>> 				Bit set of columns fetched={0, 1, 5}
>>> 				Number of columns fetched=3
>>> 				Number of pages visited=4777
>>> 				Number of rows qualified=100287
>>> 				Number of rows visited=100299
>>> 				Scan type=heap
>>> 				start position: null
>>> 				stop position: null
>>> 				qualifiers:
>>>
>>> -- 
>>> View this message in context: http://www.nabble.com/Statements.java%
>>> 2C-etc.-Performance-tf2662372.html#a7425760
>>> Sent from the ActiveMQ - User mailing list archive at Nabble.com.
>>>
>>
>>
>>
>
> -- 
> View this message in context: http://www.nabble.com/Re%3A- 
> Statements.java%2C-etc.-Performance-tf2662998.html#a7429191
> Sent from the ActiveMQ - Dev mailing list archive at Nabble.com.
>

