db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dan Karp <dk...@zimbra.com>
Subject Re: Queries satisfiable from indexes
Date Tue, 20 Mar 2007 22:35:17 GMT
> I agree, though, that there is something odd about the number of pages
> visited and rows "qualified" here.  The top-most result set shows that
> only 3 rows were actually returned (I think...correct me if that's
> wrong), in which case it seems odd that we actually read 141,512
> "qualified" rows from disk.  I don't know much about the data in
> question but it seems like we should have done a limited range 
> scan using the parameter value ("?") as a start and stop key.  If that
> was the case then we should have scanned far fewer pages/rows.

I switched to using DISTINCT instead of GROUP BY and saw better-but-still-not-good performance
-- 0.4sec execution time instead of 1.2sec.  The cardinality of the TAGS column is 3, 4 if
you include NULL.  The cardinality of the MAILBOX_ID column is 1 in this case.  So I'd expect
either a scan of the entire index or some sort of (mailbox_id, tags) prefix scan of the index,
but in both cases I'd expect far fewer pages visited.

(This is with Derby 10.2.2.0.)  


QUERY:

   SELECT DISTINCT(tags) FROM mboxgroup1.mail_item WHERE mailbox_id = ?;


INDEX:

   CREATE INDEX ${DATABASE_NAME}.i_mail_item_tags_date
     ON mail_item(mailbox_id, tags, date DESC);


TABLE DDL:

   CREATE TABLE ${DATABASE_NAME}.mail_item (
      mailbox_id    INTEGER NOT NULL,
      id            INTEGER NOT NULL,
      type          SMALLINT NOT NULL,
      parent_id     INTEGER,
      folder_id     INTEGER,
      index_id      INTEGER,
      imap_id       INTEGER,
      date          INTEGER NOT NULL,
      size          INTEGER NOT NULL,
      volume_id     SMALLINT,
      blob_digest   VARCHAR(28),
      unread        INTEGER,
      flags         INTEGER NOT NULL DEFAULT 0,
      tags          BIGINT NOT NULL DEFAULT 0,
      sender        VARCHAR(128),
      subject       VARCHAR(1024),
      name          VARCHAR(128),
      metadata      CLOB,
      mod_metadata  INTEGER NOT NULL,
      change_date   INTEGER,
      mod_content   INTEGER NOT NULL,
      change_mask   INTEGER,

      PRIMARY KEY (mailbox_id, id),
      CONSTRAINT fk_mail_item_mailbox_id FOREIGN KEY (mailbox_id)
         REFERENCES zimbra.mailbox(id),
      CONSTRAINT fk_mail_item_volume_id FOREIGN KEY (volume_id)
         REFERENCES zimbra.volume(id),
      CONSTRAINT fk_mail_item_parent_id FOREIGN KEY (mailbox_id, parent_id)
         REFERENCES ${DATABASE_NAME}.mail_item(mailbox_id, id),
      CONSTRAINT fk_mail_item_folder_id FOREIGN KEY (mailbox_id, folder_id)
         REFERENCES ${DATABASE_NAME}.mail_item(mailbox_id, id)
   );


PLAN:

Number of opens = 1
Rows input = 141619
Rows returned = 3
Eliminate duplicates = true
In sorted order = true
	constructor time (milliseconds) = 0
	open time (milliseconds) = 0
	next time (milliseconds) = 0
	close time (milliseconds) = 0
	optimizer estimated row count:        14158.00
	optimizer estimated cost:         3842.48

Source result set:
	Project-Restrict ResultSet (3):
	Number of opens = 1
	Rows seen = 141619
	Rows filtered = 0
	restriction = false
	projection = true
		constructor time (milliseconds) = 0
		open time (milliseconds) = 0
		next time (milliseconds) = 0
		close time (milliseconds) = 0
		restriction time (milliseconds) = 0
		projection time (milliseconds) = 0
		optimizer estimated row count:        14158.00
		optimizer estimated cost:         3842.48

	Source result set:
		Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE at read committed isolation
level using instantaneous share row locking chosen by the optimizer
		Number of opens = 1
		Rows seen = 141619
		Rows filtered = 0
		Fetch Size = 16
			constructor time (milliseconds) = 0
			open time (milliseconds) = 0
			next time (milliseconds) = 0
			close time (milliseconds) = 0
			next time in milliseconds/row = 0

		scan information: 
			Bit set of columns fetched={0, 1}
			Number of columns fetched=2
			Number of deleted rows visited=150
			Number of pages visited=326
			Number of rows qualified=141619
			Number of rows visited=141769
			Scan type=btree
			Tree height=2
			start position: 
	>= on first 1 column(s).
	Ordered null semantics on the following columns: 

			stop position: 
	> on first 1 column(s).
	Ordered null semantics on the following columns: 

			qualifiers:
None
			optimizer estimated row count:        14158.00
			optimizer estimated cost:         3842.48

Mime
View raw message