db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-4437) Concurrent inserts into table with identity column perform poorly
Date Thu, 16 Jun 2011 12:09:47 GMT

     [ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-4437:

    Attachment: insertperf2.png

Here's another attempt at a performance test for this improvement. I modified the test to
use a set of five tables, all with an identity column. Each thread inserts one row into each
of the tables and then commits. This is closer to the scenario in which I saw this problem
when I reported the issue. Since each transaction performs multiple inserts, escalating the
locks on the system table from the nested transaction to the parent transaction is more
likely to cause contention than in the previous test, which committed after every
single insert. Also, since all threads work on the same set of tables, there should be more
lock conflicts in the system table.
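
A minimal sketch of the transaction shape described above: five tables with an identity column, and one row inserted into each table per commit. The table names T0..T4, the column names ID and X, and the class and method names are hypothetical and chosen only for illustration; the attached D4437PerfTest2.java is the actual test and may well differ.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

class IdentityInsertSketch {
    // The test described above uses a set of five tables.
    static final int TABLE_COUNT = 5;

    // Hypothetical schema: an identity column plus one ordinary column.
    static String createTableSql(int i) {
        return "CREATE TABLE T" + i
                + " (ID INT GENERATED ALWAYS AS IDENTITY, X INT)";
    }

    // The identity column is omitted so that the database generates its value.
    static String insertSql(int i) {
        return "INSERT INTO T" + i + " (X) VALUES (?)";
    }

    // Creates all tables once, before the insert threads start.
    static void createTables(Connection c) throws SQLException {
        try (Statement s = c.createStatement()) {
            for (int i = 0; i < TABLE_COUNT; i++) {
                s.execute(createTableSql(i));
            }
        }
    }

    // One transaction: insert one row into each table, then commit.
    // Each insert thread would call this in a loop on its own connection.
    static void runTransaction(Connection c, int value) throws SQLException {
        c.setAutoCommit(false);
        for (int i = 0; i < TABLE_COUNT; i++) {
            try (PreparedStatement ps = c.prepareStatement(insertSql(i))) {
                ps.setInt(1, value);
                ps.executeUpdate();
            }
        }
        c.commit();
    }
}
```

Because the commit comes only after the fifth insert, any lock that has been moved to the parent transaction stays held across the remaining inserts, which is what makes this variant more sensitive to the problem than a commit per insert.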

This new graph (insertperf2.png) shows the results from the test. As expected, the difference
between 10.8 and trunk is bigger than it was in the previous test, but not dramatically. With
10.8, Derby essentially only allows one thread to run at a time, so adding more threads doesn't
increase the throughput. With trunk, the throughput reaches its maximum at three threads.
That's a bit disappointing, given that the machine has 32 cores, but it might be hitting some
other bottleneck, most likely disk I/O.

For reference, I included results from running the same test without having an identity column
in the tables, to see how well we could expect the test to scale if generating the identity
values was eliminated completely as a bottleneck. That test maxed out around five threads,
so only scaling up to three threads when we have identity columns doesn't sound unreasonable
for this kind of load after all.

I also experimented with the derby.language.identityGeneratorCacheSize property, but that
didn't seem to have any effect on the results (I tried 10, 50, 100, as well as the default

> Concurrent inserts into table with identity column perform poorly
> -----------------------------------------------------------------
>                 Key: DERBY-4437
>                 URL: https://issues.apache.org/jira/browse/DERBY-4437
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Assignee: Rick Hillegas
>         Attachments: D4437PerfTest.java, D4437PerfTest2.java, derby-4437-01-aj-allTestsPass.diff,
>                      derby-4437-02-ac-alterTable-bulkImport-deferredInsert.diff, derby-4437-03-aa-upgradeTest.diff,
>                      insertperf.png, insertperf2.png
> I have a multi-threaded application which is very insert-intensive. I've noticed that
> it sometimes can come into a state where it slows down considerably and basically becomes
> single-threaded. This is especially harmful on modern multi-core machines since most of the
> available resources are left idle.
> The problematic tables contain identity columns, and here's my understanding of what happens:
> 1) Identity column values are generated from a counter that's stored in a row in SYS.SYSCOLUMNS.
> During normal operation, the counter is maintained in a nested transaction within the transaction
> that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMNS
> separately from the main transaction, and the exclusive lock that it needs to obtain on the
> row holding the counter can be released after a relatively short time. Concurrent transactions
> can therefore insert into the same table at the same time, without needing to wait for the
> others to commit or abort.
> 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately,
> it will give up and retry the operation in the main transaction. This prevents self-deadlocks
> in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately,
> this also increases the time the row is locked, since the exclusive lock cannot be released
> until the main transaction commits. So as soon as there is one lock collision, the waiting
> transaction changes to a locking mode that increases the chances of others having to wait,
> which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main
> transaction. The end result is that only one of the insert threads can execute at any given
> time as long as the application is in this state.
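
The behaviour in points 1) and 2) can be illustrated with a toy model. This is not Derby's implementation: the classes Txn and IdentityCounterModel are hypothetical, and a java.util.concurrent.Semaphore with one permit merely stands in for the exclusive lock on the SYS.SYSCOLUMNS row.

```java
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a user (main) transaction.
class Txn {
    // True once this transaction holds the row lock until commit.
    boolean holdsRowLock;
}

class IdentityCounterModel {
    // One permit models the exclusive lock on the counter row.
    final Semaphore rowLock = new Semaphore(1);
    private long counter;

    long next(Txn main) {
        if (main.holdsRowLock) {
            // The main transaction already owns the lock; no new lock needed.
            return ++counter;
        }
        if (rowLock.tryAcquire()) {
            // "Nested transaction" path: lock without waiting, bump the
            // counter, and release the lock straight away.
            try {
                return ++counter;
            } finally {
                rowLock.release();
            }
        }
        // Fallback path: wait for the lock on behalf of the main transaction
        // and keep it until that transaction commits.
        rowLock.acquireUninterruptibly();
        main.holdsRowLock = true;
        return ++counter;
    }

    void commit(Txn main) {
        if (main.holdsRowLock) {
            main.holdsRowLock = false;
            rowLock.release();
        }
    }
}
```

In this model, a transaction that falls back holds the permit across all of its remaining work, so other transactions calling next() in the meantime also fail tryAcquire() and fall back in turn. That mirrors the cascade described in point 2).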

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

