db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-6011) Derby performs very badly (seems to deadlock and timeout) in very simple multi-threaded tests
Date Mon, 21 Jan 2013 11:22:12 GMT

    [ https://issues.apache.org/jira/browse/DERBY-6011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13558697#comment-13558697
] 

Knut Anders Hatlen commented on DERBY-6011:
-------------------------------------------

Thanks, Mike and Mamta. Mike's description of the problem and Mamta's description of fix agree
with my understanding this code.

The set of queries that will use this particular optimizer tweak after the suggested fix (set
A), are those that access a table that has a unique index, and that have equality predicates
for all the columns in that index. All plans which do not use such an index to access that
table, will get their cost adjusted (scan cost is multiplied by 2, scan row count before selectivity
adjustment is incremented by 2).

The set of queries that will see changes in cost estimates for some plans because of the suggested
fix (set B), is the subset of A where the optimizer believes one or more of the scan alternatives
will have to read more than one row. This includes these two categories:

  i) Queries where the row count statistics actually do indicate that multiple rows will have
to be read. That is, the table contains multiple rows.

  ii) Queries where the fix for beetle 5006 makes an empty or almost empty table appear not
so empty. That is, if the alternatives include index scans with no constant start or stop
keys, or full table scans. (Note: parameters are not considered constant start or stop keys,
so "our" query falls into this category.)

The set of queries that will see that a different plan being picked after the fix (set C),
is the subset of B where all of the following conditions apply:

  i) The unique index is not covering, so that an index-to-base-row result set will be needed.

  ii) One of the alternative scans is expected to return < 1 row (after selectivity has
been applied).

  iii) The estimated cost of reading a row from the base table is so high that the difference
between reading less than 1 row and reading exactly 1 row, is greater than the difference
in cost between the unique index lookup and the alternative scan.

Set C should be a lot smaller than set B, as most queries are against non-empty tables. Those
queries should already be using the unique index because the optimizer understands that the
non-unique alternatives will have to fetch at least as many rows from the base table. The
costs of the plans using non-unique scans are still adjusted, but they're already believed
to be more expensive, so it doesn't change the outcome.
                
> Derby performs very badly (seems to deadlock and timeout) in very simple multi-threaded
tests
> ---------------------------------------------------------------------------------------------
>
>                 Key: DERBY-6011
>                 URL: https://issues.apache.org/jira/browse/DERBY-6011
>             Project: Derby
>          Issue Type: Bug
>    Affects Versions: 10.7.1.1, 10.8.2.2, 10.9.1.0
>         Environment: Lenovo laptop with SSD's, Windows 7, 64-bit, Sun JDK 1.6.xx
>            Reporter: Karl Wright
>         Attachments: derby.log, force-specific-index.diff, manifoldcf.log, prefer-unique-index-v1.diff
>
>
> The Apache ManifoldCF project supports Derby as one of its underlying databases.  Simple
tests, however, demonstrate that Derby is apparently deadlocking and timing out repeatedly
under multi-thread conditions.  This problem is long-standing, and is not exhibited by any
other database ManifoldCF supports, and makes a simple test take between 6x and 12x as long.
> There is a trivial test with demonstrates the problem vs. other databases.  Please do
the following (once you have java 1.6+, svn 1.7+, and ant 1.7+ available):
> (1) Check out https://svn.apache.org/repos/asf/manifoldcf/trunk
> (2) Run the following ant target to download the dependencies: "ant make-core-deps"
> (3) Run the Derby test: "ant run-rss-tests-derby" . Note the time required - at least
180 seconds, can be up to 360 seconds.
> (4) Run the equivalent HSQLDB test: "ant run-rss-tests-HSQLDB".  This test takes about
31 seconds to run.
> The output of the Derby test can be found in the directory "tests/rss/test-derby-output".
 Have a look at manifoldcf.log, where all long-running queries are reported.  Derby.log is
also included, which shows only that during the test's cleanup phase the database is deleted
before it is shutdown, which is not pertinent to the performance issue.
> I am available to assist with ManifoldCF, if that seems to be required.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message