db-derby-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-3790) Investigate if request for update statistics can be skipped for certain kind of indexes, one instance may be unique indexes based on one column.
Date Fri, 28 Jun 2013 18:24:24 GMT

    [ https://issues.apache.org/jira/browse/DERBY-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13695632#comment-13695632 ]

ASF subversion and git services commented on DERBY-3790:

Commit 1497868 from [~mamtas]
[ https://svn.apache.org/r1497868 ]

DERBY-5680( indexStat daemon processing tables over and over even when there are no changes
in the tables )

Backporting the 3 commits that went in for DERBY-5680 to 10.8. The 3 commits were 1340549,
1341622, 1341629. The first two commits were easy to backport using the svn merge command, but
the third commit, 1341629, ran into conflicts. For that one, I made the changes by hand, since
there were not many.

The changes for this issue add a new property, derby.storage.indexStats.debug.keepDisposableStats.
If the property is set to true, orphaned/disposable stats are not deleted. If it is set to
false, the index stats daemon drops them. Currently known causes of orphaned/disposable
stats are:
1) DERBY-5681 (When a foreign key constraint on a table is dropped, the associated statistics
row for the conglomerate is not removed). Fix for this has been backported all the way to
2) DERBY-3790 (Investigate if request for update statistics can be skipped for certain kind
of indexes, one instance may be unique indexes based on one column). Fix for this is in 10.9
and higher.
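
As a minimal sketch of how the property might be toggled, the Java snippet below sets it as a JVM system property before any Derby engine would be booted. The property name comes from the commit message above. The class name, the helper method, and the assumption that the property is read as a system property (rather than from derby.properties) are illustrative only and are not confirmed by this message.

```java
public class KeepDisposableStatsExample {
    // Property name as given in the commit message.
    static final String KEEP_DISPOSABLE_STATS =
            "derby.storage.indexStats.debug.keepDisposableStats";

    // Hypothetical helper: reports whether orphaned/disposable stats would be
    // kept, assuming the property is read as a boolean JVM system property.
    static boolean keepDisposableStats() {
        return Boolean.getBoolean(KEEP_DISPOSABLE_STATS);
    }

    public static void main(String[] args) {
        // true: orphaned stats are kept; false: the index stats daemon drops them.
        System.setProperty(KEEP_DISPOSABLE_STATS, "true");
        System.out.println("keep disposable stats: " + keepDisposableStats());
    }
}
```

The same setting could presumably be passed on the command line as -Dderby.storage.indexStats.debug.keepDisposableStats=true, again assuming system-property lookup.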

A JUnit test for this new property was added as part of DERBY-3790. The test is named
store.KeepDisposableStatsPropertyTest. I had to change this test to backport it to 10.8, but
without the fix for DERBY-3790 and without the drop statistics procedure, the test does not
make much sense for the 10.8 codeline. The test uses the drop statistics procedure and mainly
tests DERBY-3790, checking that orphaned stats are deleted or left behind depending on whether
the property is set to true or false. Since 10.8 has neither the drop statistics procedure nor
the DERBY-3790 fix, we cannot meaningfully run KeepDisposableStatsPropertyTest there. In any
case, I have changed the test so that at least it will not fail in 10.8, although it cannot
truly test the property. Maybe we can test this property through the upgrade suite: create
orphaned stats on older releases via DERBY-5681, then check that after upgrade the orphaned
stats remain when the property is set to true and are deleted when it is set to false.
> Investigate if request for update statistics can be skipped for certain kind of indexes,
one instance may be unique indexes based on one column.
> ------------------------------------------------------------------------------------------------------------------------------------------------
>                 Key: DERBY-3790
>                 URL: https://issues.apache.org/jira/browse/DERBY-3790
>             Project: Derby
>          Issue Type: Improvement
>          Components: Store
>    Affects Versions:
>            Reporter: Mamta A. Satoor
>            Assignee: Kristian Waagan
>             Fix For:
>         Attachments: derby-3790-1a-skip_stats_scui.diff, derby-3790-1b-skip_stats_scui.diff,
derby-3790-1c-skip_stats_scui.diff, derby-3790-2a-minor_test_improvements.diff
> DERBY-269 provided a manual way to update the statistics. There was some discussion in
that jira entry for possibly optimizing the cases where there is no need to update the statistics.
I will enter the related comments from that jira entry here for reference.
> **************************
> Knut Anders Hatlen - 18/Jul/08 12:39 AM 
> If I have understood correctly, unique indexes always have up to date cardinality statistics
because cardinality == row count. If that's the case, one possible optimization is to skip
the unique indexes when SYSCS_UPDATE_STATISTICS is called. 
> **************************
> **************************
> Mike Matrigali - 18/Jul/08 09:48 AM 
> is the cardinality of a unique index 1 or is it row count? 
> It is also more complicated than just skipping unique indexes, it depends on the number
of columns in the index because 
> in a multi-column index, multiple cardinalities are calculated. So for instance on an
index on columns A,B,C there are 
> actually 3 cardinalities calculated: 
> A 
> A,B 
> A,B,C 
> I agree that the calculation of cardinality of A,B,C could/should be short circuited
for a unique index. 
> **************************
> **************************
> Knut Anders Hatlen - 18/Jul/08 03:25 PM 
> Mike, 
> It looks to me as if the cardinality is the number of unique values, so I think the cardinality
of a unique index is equal to its row count (for the full key, that is). You're right that
we can't short circuit it if we have a multi-column index. I don't know if it's worth the
extra complexity to short circuit the A,B,C case, since we'd have to scan the entire index
anyway. For a single-column unique index it sounds like a good idea, though. 
> **************************
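
To illustrate the prefix cardinalities discussed in the quoted comments, here is a small Java sketch that counts distinct values for each leading-column prefix (A; A,B; A,B,C) of an in-memory set of index keys. The class and method names are hypothetical and this is not Derby's implementation. It only shows why, for a unique index, the full-key cardinality equals the row count while the shorter prefixes still have to be counted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PrefixCardinality {
    // Returns the number of distinct key prefixes of length 1..columns.
    static int[] prefixCardinalities(List<List<String>> rows, int columns) {
        int[] result = new int[columns];
        for (int len = 1; len <= columns; len++) {
            Set<List<String>> distinct = new HashSet<>();
            for (List<String> row : rows) {
                // Copy the leading 'len' columns so the set holds independent keys.
                distinct.add(new ArrayList<>(row.subList(0, len)));
            }
            result[len - 1] = distinct.size();
        }
        return result;
    }

    public static void main(String[] args) {
        // Keys of a unique index on (A, B, C): every full key is distinct.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("1", "x", "p"),
                Arrays.asList("1", "x", "q"),
                Arrays.asList("1", "y", "p"),
                Arrays.asList("2", "x", "p"));
        // Prints [2, 3, 4]: the last value equals the row count.
        System.out.println(Arrays.toString(prefixCardinalities(rows, 3)));
    }
}
```

For a single-column unique index only the last value exists, and it is always the row count, which is the case the comments suggest short circuiting.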

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
