Return-Path: Delivered-To: apmail-db-derby-user-archive@www.apache.org Received: (qmail 16346 invoked from network); 20 Mar 2007 23:14:06 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 20 Mar 2007 23:14:06 -0000 Received: (qmail 46082 invoked by uid 500); 20 Mar 2007 23:14:12 -0000 Delivered-To: apmail-db-derby-user-archive@db.apache.org Received: (qmail 46054 invoked by uid 500); 20 Mar 2007 23:14:12 -0000 Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: List-Post: List-Id: Reply-To: "Derby Discussion" Delivered-To: mailing list derby-user@db.apache.org Received: (qmail 46043 invoked by uid 99); 20 Mar 2007 23:14:12 -0000 Received: from herse.apache.org (HELO herse.apache.org) (140.211.11.133) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Mar 2007 16:14:12 -0700 X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests= X-Spam-Check-By: apache.org Received-SPF: neutral (herse.apache.org: local policy) Received: from [32.97.182.141] (HELO e1.ny.us.ibm.com) (32.97.182.141) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 20 Mar 2007 16:14:02 -0700 Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236]) by e1.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id l2KNDa06013490 for ; Tue, 20 Mar 2007 19:13:36 -0400 Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217]) by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v8.3) with ESMTP id l2KNCXkL282150 for ; Tue, 20 Mar 2007 19:12:33 -0400 Received: from d01av03.pok.ibm.com (loopback [127.0.0.1]) by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id l2KNCXAL009420 for ; Tue, 20 Mar 2007 19:12:33 -0400 Received: from [127.0.0.1] (IBM-IKEJ04B1IMA.usca.ibm.com [9.72.133.187]) by d01av03.pok.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id l2KNCWku009406 for ; Tue, 20 Mar 2007 19:12:32 -0400 Message-ID: <46006A4D.1010506@sbcglobal.net> Date: Tue, 20 Mar 2007 16:12:13 -0700 From: Mike Matrigali Reply-To: mikem_app@sbcglobal.net User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206) X-Accept-Language: en-us, en MIME-Version: 1.0 To: Derby Discussion Subject: Re: Queries satisfiable from indexes References: <1268876681.82631174420332494.JavaMail.root@dogfood.liquidsys.com> In-Reply-To: <1268876681.82631174420332494.JavaMail.root@dogfood.liquidsys.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org I agree with army, Derby can and often does satisfy queries from indexes alone (I usually refer to this as covered query). If you can't provide a reproducible case for us to run, some of the following may help us guess what is going on: 1) exact ddl of the tables 2) size estimate of data for any data columns that are variable length (ie. most likely vchar fields), from this we can roughly estimate size of keys and see if 161,000 rows in 500 leaf pages makes sense. Assuming 4k index page, then this would be about 12 bytes a key, which seems reasonable for a 3 part key + a pointer to base table. 3) anything you can tell us about data distribution. 4) page size of tables/indexes. My reading of this is that the query for whatever value you put in for ? returned ~141,000 rows matching the value of mailbox_id that you put in, and there are only 3 distinct values of mailbox_id, tags in the dataset. It actually had to look through 161970 rows including the deleted rows that have not been reclaimed yet. Dan Karp wrote: > I've got a few queries that should be satisfiable from an index without any access to the underlying rows. Can Derby do this, or does it load the rows regardless? In this case, would it help things if I'd used DISTINCT(tags) instead of a GROUP BY clause? > > > QUERY: > > SELECT tags FROM mboxgroup1.mail_item > WHERE mailbox_id = ? > GROUP BY mailbox_id, tags; > > > INDEX: > > CREATE INDEX ${DATABASE_NAME}.i_mail_item_tags_date > ON mail_item(mailbox_id, tags, date); > > > PLAN: > > Number of opens = 1 > Rows seen = 3 > Rows filtered = 0 > restriction = false > projection = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 14134.80 > optimizer estimated cost: 5049.44 > > Source result set: > Sort ResultSet: > Number of opens = 1 > Rows input = 141512 > Rows returned = 3 > Eliminate duplicates = true > In sorted order = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > optimizer estimated row count: 14134.80 > optimizer estimated cost: 5049.44 > > Source result set: > Project-Restrict ResultSet (3): > Number of opens = 1 > Rows seen = 141512 > Rows filtered = 0 > restriction = false > projection = true > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > restriction time (milliseconds) = 0 > projection time (milliseconds) = 0 > optimizer estimated row count: 14134.80 > optimizer estimated cost: 5049.44 > > Source result set: > Index Scan ResultSet for MAIL_ITEM using index I_MAIL_ITEM_TAGS_DATE at read committed isolation level using share row locking chosen by the optimizer > Number of opens = 1 > Rows seen = 141512 > Rows filtered = 0 > Fetch Size = 1 > constructor time (milliseconds) = 0 > open time (milliseconds) = 0 > next time (milliseconds) = 0 > close time (milliseconds) = 0 > next time in milliseconds/row = 0 > > scan information: > Bit set of columns fetched={0, 1} > Number of columns fetched=2 > Number of deleted rows visited=20458 > Number of pages visited=501 > Number of rows qualified=141512 > Number of rows visited=161970 > Scan type=btree > Tree height=3 > start position: > >= on first 1 column(s). > Ordered null semantics on the following columns: > > stop position: > > on first 1 column(s). > Ordered null semantics on the following columns: > > qualifiers: > None > optimizer estimated row count: 14134.80 > optimizer estimated cost: 5049.44 > > >