db-derby-dev mailing list archives

From "Mike Matrigali (JIRA)" <derby-...@db.apache.org>
Subject [jira] Updated: (DERBY-1508) improve store costing of fetch of single row where not all columns are fetched.
Date Wed, 12 Jul 2006 21:44:30 GMT
     [ http://issues.apache.org/jira/browse/DERBY-1508?page=all ]

Mike Matrigali updated DERBY-1508:
----------------------------------


One idea is to put part of the estimate back on the datatype: have the
store ask each datatype about its size. When the costing work was
originally done, no size information was available from the datatype, but
I believe support has since been added for estimating at least the
in-memory size of the datatypes. Either this or something similar could
now be used to improve the costing. The right question might be average
size, or it might be maximum size. Some calculation from this static
datatype information and the average actual row size could then produce
an average estimate of the overflow lengths of long columns.
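
As a rough illustration of the kind of calculation meant here, the sketch
below subtracts the static size estimates of the non-long columns from the
measured average row length and attributes the remainder to the long
columns. Every name in it (ColumnCostSketch, ColumnInfo,
estimateLongColumnBytes, estimateFetchBytes) is hypothetical and not part
of Derby, and splitting the remainder evenly across long columns is an
assumption made only to keep the example small.

```java
import java.util.List;

/** Hypothetical sketch only; none of these names are Derby APIs. */
final class ColumnCostSketch {

    /** Static size information a datatype might report (assumed shape). */
    static final class ColumnInfo {
        final String name;
        final long estimatedBytes; // average or maximum size, per the datatype
        final boolean isLong;      // true for long (overflowing) columns

        ColumnInfo(String name, long estimatedBytes, boolean isLong) {
            this.name = name;
            this.estimatedBytes = estimatedBytes;
            this.isLong = isLong;
        }
    }

    /**
     * Bytes of the measured average row length that are not explained by
     * the non-long columns' static estimates. This remainder is treated
     * as the average amount held by long columns (never negative).
     */
    static long estimateLongColumnBytes(List<ColumnInfo> columns, long avgRowBytes) {
        long shortBytes = 0;
        for (ColumnInfo c : columns) {
            if (!c.isLong) {
                shortBytes += c.estimatedBytes;
            }
        }
        return Math.max(0, avgRowBytes - shortBytes);
    }

    /**
     * Estimated bytes fetched when only the requested columns are read.
     * Assumption: the long-column remainder is split evenly across all
     * long columns in the row.
     */
    static long estimateFetchBytes(List<ColumnInfo> columns, boolean[] requested,
                                   long avgRowBytes) {
        long longCount = 0;
        for (ColumnInfo c : columns) {
            if (c.isLong) {
                longCount++;
            }
        }
        long perLongColumn = (longCount == 0)
                ? 0
                : estimateLongColumnBytes(columns, avgRowBytes) / longCount;

        long total = 0;
        for (int i = 0; i < columns.size(); i++) {
            if (!requested[i]) {
                continue; // column not in the select list: contributes nothing
            }
            ColumnInfo c = columns.get(i);
            total += c.isLong ? perLongColumn : c.estimatedBytes;
        }
        return total;
    }
}
```

With an average row length of 10034 bytes and two non-long columns
estimated at 4 and 30 bytes, this attributes 10000 bytes to a single long
column. A fetch that skips the long column is then estimated at 34 bytes
rather than the full 10034.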

> improve store costing of fetch of single row where not all columns are fetched.
> -------------------------------------------------------------------------------
>
>          Key: DERBY-1508
>          URL: http://issues.apache.org/jira/browse/DERBY-1508
>      Project: Derby
>         Type: Improvement
>   Components: Store
>     Reporter: Mike Matrigali
>     Priority: Minor
>
> Currently HeapCostController ignores information about which subset of
> columns is being requested. For instance, in
> getFetchFromRowLocationCost() the validColumns argument is unused, and in
> getScanCost() scanColumnList is unused. Mostly this probably does not
> matter, as the cost of getting the row dominates the per-column subset
> cost. The area where it matters is the case of long columns. The cost of
> fetching multiple rows keyed by a row location, with a 2 gigabyte column
> that is not in the select list, is currently way smaller than the cost of
> doing the same query by scanning the table for the same set of rows.
> Currently the heap estimate associates cost with total average row
> length, obtained from the number of rows and the amount of space in the
> container. It does not have a statistic available for the average size of
> each column. At its level all column lengths are variable and could
> possibly be > 2 gig.


