db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Queries satisfiable from indexes
Date Tue, 20 Mar 2007 23:12:13 GMT
I agree with army, Derby can and often does satisfy queries from
indexes alone (I usually refer to this as covered query).

If you can't provide a reproducible case for us to run, some of the 
following may help us guess what is going on:

1) exact ddl of the tables
2) size estimate of data for any data columns that are variable length 
(ie. most likely vchar fields), from this we can roughly estimate size
of keys and see if 161,000 rows in 500 leaf pages makes sense.  Assuming 
4k index page, then this would be about 12 bytes a key, which seems 
reasonable for a 3 part key + a pointer to base table.
3) anything you can tell us about data distribution.
4) page size of tables/indexes.

My reading of this is that the query for whatever value you put in for ?
returned ~141,000 rows matching the value of mailbox_id that you put in,
and there are only 3 distinct values of mailbox_id, tags in the dataset.
It actually had to look through 161970 rows including the deleted rows
that have not been reclaimed yet.


Dan Karp wrote:
> I've got a few queries that should be satisfiable from an index without any access to
the underlying rows.  Can Derby do this, or does it load the rows regardless?  In this case,
would it help things if I'd used DISTINCT(tags) instead of a GROUP BY clause?
> 
> 
> QUERY:
> 
>   SELECT tags FROM mboxgroup1.mail_item
>   WHERE mailbox_id = ?
>   GROUP BY mailbox_id, tags;
> 
> 
> INDEX:
> 
>   CREATE INDEX ${DATABASE_NAME}.i_mail_item_tags_date
>     ON mail_item(mailbox_id, tags, date);
> 
> 
> PLAN:
> 
> Number of opens = 1
> Rows seen = 3
> Rows filtered = 0
> restriction = false
> projection = true
> 	constructor time (milliseconds) = 0
> 	open time (milliseconds) = 0
> 	next time (milliseconds) = 0
> 	close time (milliseconds) = 0
> 	restriction time (milliseconds) = 0
> 	projection time (milliseconds) = 0
> 	optimizer estimated row count:        14134.80
> 	optimizer estimated cost:         5049.44
> 
> Source result set:
> 	Sort ResultSet:
> 	Number of opens = 1
> 	Rows input = 141512
> 	Rows returned = 3
> 	Eliminate duplicates = true
> 	In sorted order = true
> 		constructor time (milliseconds) = 0
> 		open time (milliseconds) = 0
> 		next time (milliseconds) = 0
> 		close time (milliseconds) = 0
> 		optimizer estimated row count:        14134.80
> 		optimizer estimated cost:         5049.44
> 
> 	Source result set:
> 		Project-Restrict ResultSet (3):
> 		Number of opens = 1
> 		Rows seen = 141512
> 		Rows filtered = 0
> 		restriction = false
> 		projection = true
> 			constructor time (milliseconds) = 0
> 			open time (milliseconds) = 0
> 			next time (milliseconds) = 0
> 			close time (milliseconds) = 0
> 			restriction time (milliseconds) = 0
> 			projection time (milliseconds) = 0
> 			optimizer estimated row count:        14134.80
> 			optimizer estimated cost:         5049.44
> 
> 		Source result set:
> 			Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE at read committed
isolation level using share row locking chosen by the optimizer
> 			Number of opens = 1
> 			Rows seen = 141512
> 			Rows filtered = 0
> 			Fetch Size = 1
> 				constructor time (milliseconds) = 0
> 				open time (milliseconds) = 0
> 				next time (milliseconds) = 0
> 				close time (milliseconds) = 0
> 				next time in milliseconds/row = 0
> 
> 			scan information: 
> 				Bit set of columns fetched={0, 1}
> 				Number of columns fetched=2
> 				Number of deleted rows visited=20458
> 				Number of pages visited=501
> 				Number of rows qualified=141512
> 				Number of rows visited=161970
> 				Scan type=btree
> 				Tree height=3
> 				start position: 
> 	>= on first 1 column(s).
> 	Ordered null semantics on the following columns: 
> 
> 				stop position: 
> 	> on first 1 column(s).
> 	Ordered null semantics on the following columns: 
> 
> 				qualifiers:
> None
> 				optimizer estimated row count:        14134.80
> 				optimizer estimated cost:         5049.44
> 
> 
> 


Mime
View raw message