db-derby-dev mailing list archives

From "Dyre Tjeldvoll (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2168) Create new row format for derby to optimize access to columns within a row
Date Mon, 18 Dec 2006 13:14:21 GMT
    [ http://issues.apache.org/jira/browse/DERBY-2168?page=comments#action_12459314 ] 
            
Dyre Tjeldvoll commented on DERBY-2168:
---------------------------------------

A big project indeed. I assume that this will change the on-disk format, and
bring various upgrade issues with it? Perhaps also a major version bump?

> Create new row format for derby to optimize access to columns within a row
> --------------------------------------------------------------------------
>
>                 Key: DERBY-2168
>                 URL: http://issues.apache.org/jira/browse/DERBY-2168
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 10.3.0.0
>            Reporter: Mike Matrigali
>            Priority: Minor
>
> The current (and only) low-level row format for Derby was chosen at the
> beginning of the project to be the most flexible, so it treats every column
> as variable length.  The simple row format is just a sequence of columns,
> with each column having a header indicating how long it is.  So there is no
> way to determine where the N'th column is in the row without first
> traversing the N-1 columns before it.  A number of queries that might
> benefit from a different row format include:
> 1) non-covered queries which don't require all columns of data
> 2) non-index scans which disqualify a number of rows based on a subset of
>    columns that don't happen to be the first N columns of the row.
> A pretty standard row format would have some sort of table at the beginning
> which would allow one to jump to a given offset in the row without going
> through all the other columns.  Building up this table would likely
> increase the insert cost slightly, and would increase the disk space
> required to store rows.
> Another standard kind of row format would optimize the storage of fixed
> length fields.  Currently the store does not know anything about fixed
> length fields, as each datatype controls its own storage.  New interfaces
> could be added, either at create time or maybe in the datatypes themselves,
> to export the knowledge that datatypes are fixed length.
> This is a big project.  Note that a lot of performance work in StoredPage
> has made it "know" about the current record and field formats, as it was a
> big performance hit to make class calls for every field traversal.  This
> means that adding a new record and/or field format is not as isolated as
> one might hope.  Also, we are likely to need to support both the old and
> new formats.  To anyone considering this work, I would suggest a very rough
> prototype with performance measurement first, to make sure you are getting
> the expected performance before doing a lot of work.
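
To make the two layouts in the issue description concrete, here is a minimal Java sketch. It is an illustration only and does not reproduce Derby's actual record or field format: the class name RowFormatSketch, the fixed 4-byte headers, and the use of UTF-8 strings as column values are all assumptions made for the example. The first pair of methods models a row as a sequence of length-prefixed columns; the second pair puts a table of end offsets at the front of the row.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not Derby's on-disk format.
final class RowFormatSketch {

    private static byte[][] toBytes(String[] cols) {
        byte[][] raw = new byte[cols.length][];
        for (int i = 0; i < cols.length; i++) {
            raw[i] = cols[i].getBytes(StandardCharsets.UTF_8);
        }
        return raw;
    }

    // Sequential layout (assumed): [length:int][bytes] repeated per column.
    static byte[] encodeSequential(String[] cols) {
        byte[][] raw = toBytes(cols);
        int size = 0;
        for (byte[] r : raw) {
            size += 4 + r.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] r : raw) {
            buf.putInt(r.length);
            buf.put(r);
        }
        return buf.array();
    }

    // Reaching column n means reading and skipping the n columns before it.
    static String readSequential(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        for (int i = 0; i < n; i++) {
            int len = buf.getInt();
            buf.position(buf.position() + len);
        }
        int len = buf.getInt();
        return new String(row, buf.position(), len, StandardCharsets.UTF_8);
    }

    // Offset-table layout (assumed):
    // [count:int][end offset of each column:int]...[column bytes]
    static byte[] encodeWithOffsets(String[] cols) {
        byte[][] raw = toBytes(cols);
        int dataSize = 0;
        for (byte[] r : raw) {
            dataSize += r.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * raw.length + dataSize);
        buf.putInt(raw.length);
        int end = 0;
        for (byte[] r : raw) {
            end += r.length;
            buf.putInt(end); // end offset relative to the start of the data area
        }
        for (byte[] r : raw) {
            buf.put(r);
        }
        return buf.array();
    }

    // Column n is located with two table lookups, whatever the value of n.
    static String readWithOffsets(byte[] row, int n) {
        ByteBuffer buf = ByteBuffer.wrap(row);
        int count = buf.getInt(0);
        int dataStart = 4 + 4 * count;
        int start = (n == 0) ? 0 : buf.getInt(4 + 4 * (n - 1));
        int end = buf.getInt(4 + 4 * n);
        return new String(row, dataStart + start, end - start, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String[] cols = {"42", "Derby", "", "store"};
        byte[] row = encodeWithOffsets(cols);
        System.out.println(readWithOffsets(row, 3));
    }
}
```

In this sketch the offset table only adds the column-count field, because the sequential layout already spends a fixed 4 bytes per column. A format with more compact per-column headers would likely pay a larger space cost for the table, which is the trade-off the description mentions.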


        
