From: Mike Matrigali
Reply-To: mikem_app@sbcglobal.net
To: derby-dev@db.apache.org
Subject: Re: Question about FetchDescriptor.materialized_cols
Date: Tue, 12 Dec 2006 09:18:12 -0800
Message-ID: <457EE454.409@sbcglobal.net>

As with most "RESOLVE" and "TODO" comments still in the code, it has
been a long time since anything has happened on them.

FetchDescriptor came about from a profiling/performance pass on the
code. An array was chosen so that no routine and/or math would be
needed to set/clear the info. Ints were chosen as likely to be the
fastest data structure across all OS/JVM combinations. Most of this
work was done to optimize various scans (i.e. scans of all cols with
all rows qualifying, scans of all cols with 1% qualifying, scans of
some cols with all rows qualifying, scans of some cols with 1%
qualifying). At the time this was the fastest data structure, faster
than bits, and especially faster than bits if the number of bits might
be large (i.e. bigger than 8 or 32, so that some array reference was
necessary).

I haven't thought about what it would mean to cache the offset; one
would need some sort of validation of the value after the latch was
released. The case where one needs to find the column again after
qualifying (where this code is) is a subsequent update of a column
after this row is returned to the caller in a scan as a qualifying
row. Note that the "normal" qualifying path for many apps is to find a
row by key in an index and then update a non-key field of the heap; in
that case this code is not used.

The more interesting issue is the general cost of getting to a column;
let me start a separate thread on that.

Dyre.Tjeldvoll@Sun.COM wrote:
> Today this is an int[] but it appears to be used as a boolean.
> A possible explanation for this can be found in a comment in
> StoredPage.java:
>
>   // RESOLVE (mikem) - right now value of entry is useless, it
>   // is an int so that in the future we could cache the offset
>   // to fields to improve performance of getting to a column
>   // after qualifying.
>   materializedCols[col_id] = offset_to_row_data;
>
> Can someone (mikem?) explain a bit more about this optimization and
> what it would take to implement it? What would the potential gains be?
> Is there a JIRA for this?
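
To make the int[]-versus-bits trade-off above concrete, here is a
minimal sketch. It is not Derby's FetchDescriptor or StoredPage code:
the class name MaterializedColsSketch and all of its methods are
hypothetical, and it assumes the stored offset is never zero, so that
"non-zero" can stand for "materialized". It only shows why the int[]
form needs no shifting or masking to set or test an entry, while a
packed-bit form needs both once there are more columns than fit in one
word.

```java
/** Hypothetical illustration only; not taken from the Derby source. */
final class MaterializedColsSketch {

    /** int[] form: one slot per column, 0 means "not materialized". */
    static int[] newIntFlags(int columnCount) {
        return new int[columnCount];
    }

    /** Marking is a single array store, with no arithmetic on colId. */
    static void markInt(int[] flags, int colId, int offsetToRowData) {
        // Assumption: offsetToRowData is non-zero, so the value can double
        // as a boolean today and as a cached offset later.
        flags[colId] = offsetToRowData;
    }

    static boolean isMarkedInt(int[] flags, int colId) {
        return flags[colId] != 0;
    }

    /** Packed-bit form: 64 columns per long word. */
    static long[] newBitFlags(int columnCount) {
        return new long[(columnCount + 63) >>> 6];
    }

    /** Marking needs a word index, a shift, and a read-modify-write. */
    static void markBit(long[] words, int colId) {
        words[colId >>> 6] |= 1L << (colId & 63);
    }

    static boolean isMarkedBit(long[] words, int colId) {
        return (words[colId >>> 6] & (1L << (colId & 63))) != 0;
    }

    public static void main(String[] args) {
        int columnCount = 100; // more than 64, so the bit form needs an array

        int[] intFlags = newIntFlags(columnCount);
        markInt(intFlags, 70, 512); // 512 is an arbitrary non-zero offset

        long[] bitFlags = newBitFlags(columnCount);
        markBit(bitFlags, 70);

        System.out.println("int[] col 70 marked: " + isMarkedInt(intFlags, 70));
        System.out.println("bits  col 70 marked: " + isMarkedBit(bitFlags, 70));
        System.out.println("int[] col 71 marked: " + isMarkedInt(intFlags, 71));
        System.out.println("bits  col 71 marked: " + isMarkedBit(bitFlags, 71));
    }
}
```

The bit form uses far less memory per column, so which one is faster on
a current JVM would have to be measured; the sketch makes no claim
either way.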