db-derby-dev mailing list archives

From "Kristian Waagan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-3790) Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.
Date Mon, 21 May 2012 14:40:42 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-3790:
-----------------------------------

    Attachment: derby-3790-1a-skip_stats_scui.diff

Attaching patch 1a, which adds the functionality to skip generating statistics for single-column
unique indexes.

Tests will be attached as a separate patch, as I'm waiting for a commit to avoid conflicts.

 * impl/sql/compile/FromBaseTable and
   iapi/sql/dictionary/TableDescriptor
   Added a new method to TD for counting indexes, in which you can specify the minimum number of ordered columns and whether non-unique indexes are exempt from that restriction.
   Used the new method to avoid scheduling work with the istat daemon for tables that have only one index, where that index is a single-column unique index.

 * impl/sql/execute/CreateIndexConstantAction
   Added helper method addStatistics, which tells whether statistics should be added for an index. The pre-10.9 behavior was to add statistics as long as there was at least one row in the index; the new behavior is to add statistics only if the index has more than one column or is non-unique.
   There's an override available in case this change causes the optimizer to generate bad/worse plans.
   Removed unused method statementExceptionCleanup.

 * impl/services/daemon/IndexStatisticsDaemonImpl
   Added code to skip generating statistics for single-column unique indexes.
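
For illustration only (this is not code from the patch), the rule described in the list above can be sketched in Java. IndexInfo, needsStatistics, countIndexes and shouldScheduleUpdate are hypothetical names, not Derby's actual classes or signatures. The meaning of the "exempt" flag is my reading of the description, and the override mentioned above is not modeled.

```java
import java.util.Arrays;
import java.util.List;

public class SkipStatsSketch {

    /** Hypothetical stand-in for an index descriptor; not a Derby class. */
    static final class IndexInfo {
        final boolean unique;
        final int orderedColumns;

        IndexInfo(boolean unique, int orderedColumns) {
            this.unique = unique;
            this.orderedColumns = orderedColumns;
        }
    }

    /** Rule from the description: keep statistics only for multi-column or non-unique indexes. */
    static boolean needsStatistics(IndexInfo idx) {
        return idx.orderedColumns > 1 || !idx.unique;
    }

    /**
     * Counts indexes with at least minOrderedColumns ordered columns.
     * ASSUMPTION: when nonUniqueExempt is true, a non-unique index is counted
     * regardless of how many columns it has.
     */
    static int countIndexes(List<IndexInfo> indexes, int minOrderedColumns,
                            boolean nonUniqueExempt) {
        int count = 0;
        for (IndexInfo idx : indexes) {
            boolean exempt = nonUniqueExempt && !idx.unique;
            if (exempt || idx.orderedColumns >= minOrderedColumns) {
                count++;
            }
        }
        return count;
    }

    /**
     * Schedule a statistics update only if some index would benefit from it.
     * ASSUMPTION: generalizes the "only one index" case to any number of indexes.
     */
    static boolean shouldScheduleUpdate(List<IndexInfo> indexes) {
        return countIndexes(indexes, 2, true) > 0;
    }

    public static void main(String[] args) {
        List<IndexInfo> onlySingleColumnUnique = Arrays.asList(new IndexInfo(true, 1));
        List<IndexInfo> mixed =
                Arrays.asList(new IndexInfo(true, 1), new IndexInfo(false, 1));
        System.out.println(shouldScheduleUpdate(onlySingleColumnUnique));
        System.out.println(shouldScheduleUpdate(mixed));
    }
}
```

In this sketch a table whose only index is a single-column unique index yields a count of zero, so no work would be scheduled; adding any non-unique or multi-column index makes the count positive.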

Patch ready for review.
Patch 1a will be committed together with the test patch (not yet posted).
                
> Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3790
>                 URL: https://issues.apache.org/jira/browse/DERBY-3790
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions: 10.5.1.1
>            Reporter: Mamta A. Satoor
>         Attachments: derby-3790-1a-skip_stats_scui.diff
>
>
> DERBY-269 provided a manual way to update the statistics. There was some discussion in that jira entry for possibly optimizing the cases where there is no need to update the statistics. I will enter the related comments from that jira entry here for reference.
> **************************
> Knut Anders Hatlen - 18/Jul/08 12:39 AM 
> If I have understood correctly, unique indexes always have up to date cardinality statistics because cardinality == row count. If that's the case, one possible optimization is to skip the unique indexes when SYSCS_UPDATE_STATISTICS is called.
> **************************
> **************************
> Mike Matrigali - 18/Jul/08 09:48 AM 
> is the cardinality of a unique index 1 or is it row count? 
> It is also more complicated than just skipping unique indexes, it depends on the number of columns in the index because in a multi-column index, multiple cardinalities are calculated. So for instance on an index on columns A,B,C there are actually 3 cardinalities calculated:
> A 
> A,B 
> A,B,C 
> I agree that the calculation of cardinality of A,B,C could/should be short circuited for a unique index.
> **************************
> **************************
> Knut Anders Hatlen - 18/Jul/08 03:25 PM 
> Mike, 
> It looks to me as if the cardinality is the number of unique values, so I think the cardinality of a unique index is equal to its row count (for the full key, that is). You're right that we can't short circuit it if we have a multi-column index. I don't know if it's worth the extra complexity to short circuit the A,B,C case, since we'd have to scan the entire index anyway. For a single-column unique index it sounds like a good idea, though.
> **************************
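
To illustrate the cardinalities discussed in the quoted comments, here is a small Java sketch that counts the distinct values of each leading-column prefix (A; A,B; A,B,C) over in-memory key rows. It assumes, as the comments suggest, that cardinality means the number of unique values. The class name, method name and sample data are made up for this example, and it is not how Derby scans an index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PrefixCardinalitySketch {

    /**
     * Returns one cardinality per leading-column prefix:
     * result[0] for column 1, result[1] for columns 1..2, and so on.
     */
    static int[] prefixCardinalities(int[][] rows, int keyColumns) {
        int[] result = new int[keyColumns];
        for (int len = 1; len <= keyColumns; len++) {
            Set<List<Integer>> distinct = new HashSet<>();
            for (int[] row : rows) {
                // Build the prefix key from the first len columns of the row.
                List<Integer> prefix = new ArrayList<>(len);
                for (int col = 0; col < len; col++) {
                    prefix.add(row[col]);
                }
                distinct.add(prefix);
            }
            result[len - 1] = distinct.size();
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample keys for a unique index on (A, B, C); every full key is distinct.
        int[][] rows = {
            {1, 1, 10},
            {1, 1, 20},
            {1, 2, 10},
            {2, 1, 10},
        };
        // Prints [2, 3, 4]: the full-key cardinality equals the row count.
        System.out.println(Arrays.toString(prefixCardinalities(rows, 3)));
    }
}
```

For a unique index the last entry always equals the row count, which is why only the full-key value can be derived without looking at the data. The shorter prefixes still require examining the keys.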


        
