db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4771) Continue investigation of automatic creation/update of index statistics
Date Wed, 01 Dec 2010 22:24:12 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965879#action_12965879
] 

Dag H. Wanvik commented on DERBY-4771:
--------------------------------------

Hi Kristian, some comments on the patch beyond the tests. I still need to look at the meat
in IndexStatisticsDaemonImpl, that will follow in a next review installment. To the extent
my (preliminary) questions will be answered when I have read IndexStatisticsDaemonImpl fully,
please ignore them ;-)

Generally the structure of the patch is clear and seems to do the right things. There are
some amount of TODOs left that need weork as you are no doubt aware, and some of the naming
could use some work, more below.

M       java/engine/org/apache/derby/impl/sql/compile/CursorNode.java

                if (fromTable instanceof FromBaseTable) {
                    TableDescriptor td = fromTable.getTableDescriptor();
                    :
                    // TODO: This was done because I didn't find another way to
                    //       access the base tables in the query at the time I
                    //       needed to. There may be a better way to do this!

* The privilege collection phase seems like a good place to do it to me? Obviates need for
an extra pass, and it's always performed I think. It would be nice to format the block in
"if (fromTable.isPrivilegeCollectionRequired())" with braces and correct indentation while
you are at it, too.

                    if (checkIndexStats &&
                            td.getTableType() == TableDescriptor.BASE_TABLE_TYPE) {

* Isn't td's table type always BASE_TABLE_TYPE here? Cf. "if (fromTable instanceof FromBaseTable)"
above.. No, it seems it can also be a view, cf comment in FromBaseTable's class Javadoc. Confusing..
maybe add a comment. 

                        if (statsToUpdate == null) {
                            statsToUpdate = new ArrayList(2);
                        }
                        statsToUpdate.add(td);
                    }
                }

* Btw, why use "2" as initialCapacity to "new ArrayList"? Is it statistically more likely
we have 2 than any other number (1, 3)? If so, comment, if not, I'd suggest use no-arg constructor.

* // Look for missing and stale statistics.

  The loop seems to be doing the opposite: removing those that are up-to-date? I am not a
big fan of include type (here "flag" in the name of methods & variables); I'd prefer a
name that indicated the semantics of the flag instead, e.g. "getAndClearIsUpToDate". That
reversed meaning removes the need for the "!" in the test and makes it easier to read.

* // Assume a low number of base tables.
  baseTables.remove();

I presume this comment pertains to performance only.

M       java/engine/org/apache/derby/impl/sql/compile/FromBaseTable.java

Parameterize 100 properly with a constant or property? Fix two TODOs in estimateCost.


M       java/engine/org/apache/derby/impl/sql/catalog/DataDictionaryImpl.java

Remove TODO in setIndexStatsRefresher (machinery is temporary, code says).

A       java/engine/org/apache/derby/impl/services/daemon/IndexStatisticsDaemonImpl.java
M       java/engine/org/apache/derby/impl/db/BasicDatabase.java
M       java/engine/org/apache/derby/iapi/sql/dictionary/TableDescriptor.java

* markForIndexStatsUpdate: sdl may see nullpointerexception if tableRowCountEstimate >=
0, cf.
  test: "if (sdl.isEmpty() && tableRowCountEstimate < 0)" doesn't hold. If known
invariant, use ASSERT instead. Javadoc should explain the use of negative tableRowCountEstimate
which seems expected.

The diff must be one order of magnitude as far as i can read the code, for an update to be
triggered. Would be nice to comment this logic a bit. Not sure why you need the first comparison
(against 1000 - has a TODO attached though) when you have the log comparison anyway just beneath?
Only saves som math operations; worth it?  The generated reason string

  "indexStatsUpdateReason = "t-est=" + tableRowCountEstimate +
  ", i-est=" + indexRowCountEstimate + " => cmp=" + cmp;

seems to be comparing apples to oranges (number vs log number)?

* markForIndexStatsUpdate: change comment to Javadoc. Remaining TODO: The statement "the more
accurate table row count estimate may be lost" isn't clear to me.


M       java/engine/org/apache/derby/iapi/sql/dictionary/DataDictionary.java

getIndexStatsRefresher: "if disabled" how can it be disabled? A: See disableIndexStatsRefresher.
When is this one used (scenario) ? Can it be reeenabled, if so, how? No, only on fatal error.
Maybe note in Javadoc what it's for..

doSetIndexStatsRefresher, setIndexStatsRefresher: I'd prefer doCreateIndexStatsRefresher,
createIndexStatsRefresher I think. It just seemed vague to me what the verb "set" meant here.
Since a refresher isn't provided in "setIndexStatsRefresher", I mean, as in a normal getter/setter
pattern.

doSetIndexStatsRefresher -> !indexRefreshedExists

A       java/engine/org/apache/derby/iapi/services/daemon/IndexStatisticsDaemon.java

 *  The background mode will try to
 * affect other operations as little as possible, and errors won't be reported
 * unless they are severe. 

How is this safe? (errors not reported?) explain! What happens if scheduled work throws? Under
which condition can work queue be full? Then what happens? I guess I'll know when I read the
Impl class :)

Javadoc typo for schedule: schedulig


> Continue investigation of automatic creation/update of index statistics
> -----------------------------------------------------------------------
>
>                 Key: DERBY-4771
>                 URL: https://issues.apache.org/jira/browse/DERBY-4771
>             Project: Derby
>          Issue Type: Task
>          Components: SQL, Store
>    Affects Versions: 10.8.0.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>         Attachments: autoindexstats.html, derby-4771-1a-prototype_code_dump.diff, derby-4771-1a-prototype_code_dump.stat,
derby-4771-1b-prototype_code_dump.diff, derby-4771-2a-prototype_lcc_code_dump.diff, derby-4771-2b-prototype_lcc_code_dump.diff,
derby-4771-2c-prototype_lcc_code_dump.diff, derby-4771-2d-prototype_lcc_code_dump.diff, DERBY-4771-2e-prototype.rar,
derby-4771-2e-prototype_lcc_code_dump.diff, Derby-4771-2f-AutomaticIndexStatisticsTest_wondows7.rar,
derby-4771-2f-prototype_lcc_code_dump-WORK-IN-PROGRESS.diff, derby.log, error-stacktrace.out,
rjall.out, rjall.out, rjall.out, rjall.rar, rjone.out
>
>
> Work was started to improve Derby's handling of index statistics. This issue tracks further
discussion and work for this task.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message