Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 5003 invoked from network); 20 Mar 2007 22:35:46 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 20 Mar 2007 22:35:46 -0000 Received: (qmail 77795 invoked by uid 500); 20 Mar 2007 22:35:51 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 77751 invoked by uid 500); 20 Mar 2007 22:35:51 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 77720 invoked by uid 99); 20 Mar 2007 22:35:51 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Mar 2007 15:35:51 -0700 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests= X-Spam-Check-By: apache.org Received-SPF: pass (herse.apache.org: local policy) Received: from [4.78.240.39] (HELO mta02.zimbra.com) (4.78.240.39) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Mar 2007 15:35:39 -0700 Received: from dogfood.zimbra.com (dogfood.liquidsys.com [66.92.25.198]) by mta02.zimbra.com (Postfix) with ESMTP id DAE4E810C38 for ; Tue, 20 Mar 2007 14:29:07 -0800 (PST) Received: from localhost (localhost.localdomain [127.0.0.1]) by dogfood.zimbra.com (Postfix) with ESMTP id DFB66774C3 for ; Tue, 20 Mar 2007 15:35:18 -0700 (PDT) X-Virus-Scanned: amavisd-new at dogfood.zimbra.com Received: from dogfood.zimbra.com ([127.0.0.1]) by localhost (dogfood.zimbra.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OliMC+mbCRww for ; Tue, 20 Mar 2007 15:35:17 -0700 (PDT) Received: from dogfood.zimbra.com (localhost.localdomain [127.0.0.1]) by dogfood.zimbra.com (Postfix) with ESMTP id AB989774C2 for ; Tue, 20 Mar 2007 15:35:17 -0700 (PDT) Date: Tue, 20 Mar 2007 15:35:17 -0700 (PDT) From: Dan Karp To: Derby Discussion Message-ID: <1364605357.92031174430117536.JavaMail.root@dogfood.liquidsys.com> In-Reply-To: <460050AD.8090104@gmail.com> Subject: Re: Queries satisfiable from indexes MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [75.23.60.148] X-Virus-Checked: Checked by ClamAV on apache.org > I agree, though, that there is something odd about the number of pages > visited and rows "qualified" here. The top-most result set shows that > only 3 rows were actually returned (I think...correct me if that's > wrong), in which case it seems odd that we actually read 141,512 > "qualified" rows from disk. I don't know much about the data in > question but it seems like we should have done a limited range > scan using the parameter value ("?") as a start and stop key. If that > was the case then we should have scanned far fewer pages/rows. I switched to using DISTINCT instead of GROUP BY and saw better-but-still-not-good performance -- 0.4sec execution time instead of 1.2sec. The cardinality of the TAGS column is 3, 4 if you include NULL. The cardinality of the MAILBOX_ID column is 1 in this case. So I'd expect either a scan of the entire index or some sort of (mailbox_id, tags) prefix scan of the index, but in both cases I'd expect far fewer pages visited. (This is with Derby 10.2.2.0.) QUERY: SELECT DISTINCT(tags) FROM mboxgroup1.mail_item WHERE mailbox_id = ?; INDEX: CREATE INDEX ${DATABASE_NAME}.i_mail_item_tags_date ON mail_item(mailbox_id, tags, date DESC); TABLE DDL: CREATE TABLE ${DATABASE_NAME}.mail_item ( mailbox_id INTEGER NOT NULL, id INTEGER NOT NULL, type SMALLINT NOT NULL, parent_id INTEGER, folder_id INTEGER, index_id INTEGER, imap_id INTEGER, date INTEGER NOT NULL, size INTEGER NOT NULL, volume_id SMALLINT, blob_digest VARCHAR(28), unread INTEGER, flags INTEGER NOT NULL DEFAULT 0, tags BIGINT NOT NULL DEFAULT 0, sender VARCHAR(128), subject VARCHAR(1024), name VARCHAR(128), metadata CLOB, mod_metadata INTEGER NOT NULL, change_date INTEGER, mod_content INTEGER NOT NULL, change_mask INTEGER, PRIMARY KEY (mailbox_id, id), CONSTRAINT fk_mail_item_mailbox_id FOREIGN KEY (mailbox_id) REFERENCES zimbra.mailbox(id), CONSTRAINT fk_mail_item_volume_id FOREIGN KEY (volume_id) REFERENCES zimbra.volume(id), CONSTRAINT fk_mail_item_parent_id FOREIGN KEY (mailbox_id, parent_id) REFERENCES ${DATABASE_NAME}.mail_item(mailbox_id, id), CONSTRAINT fk_mail_item_folder_id FOREIGN KEY (mailbox_id, folder_id) REFERENCES ${DATABASE_NAME}.mail_item(mailbox_id, id) ); PLAN: Number of opens = 1 Rows input = 141619 Rows returned = 3 Eliminate duplicates = true In sorted order = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 optimizer estimated row count: 14158.00 optimizer estimated cost: 3842.48 Source result set: Project-Restrict ResultSet (3): Number of opens = 1 Rows seen = 141619 Rows filtered = 0 restriction = false projection = true constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 restriction time (milliseconds) = 0 projection time (milliseconds) = 0 optimizer estimated row count: 14158.00 optimizer estimated cost: 3842.48 Source result set: Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE at read committed isolation level using instantaneous share row locking chosen by the optimizer Number of opens = 1 Rows seen = 141619 Rows filtered = 0 Fetch Size = 16 constructor time (milliseconds) = 0 open time (milliseconds) = 0 next time (milliseconds) = 0 close time (milliseconds) = 0 next time in milliseconds/row = 0 scan information: Bit set of columns fetched={0, 1} Number of columns fetched=2 Number of deleted rows visited=150 Number of pages visited=326 Number of rows qualified=141619 Number of rows visited=141769 Scan type=btree Tree height=2 start position: >= on first 1 column(s). Ordered null semantics on the following columns: stop position: > on first 1 column(s). Ordered null semantics on the following columns: qualifiers: None optimizer estimated row count: 14158.00 optimizer estimated cost: 3842.48