db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-4437) Concurrent inserts into table with identity column perform poorly
Date Fri, 17 Jun 2011 18:04:47 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051229#comment-13051229
] 

Rick Hillegas commented on DERBY-4437:
--------------------------------------

People may want to configure the size of the preallocated ranges for sequences (see DERBY-5151).
Being able to set the preallocation size to 1 will give people the power to eliminate holes
in sequences that occur when you shut down the database and throw away the unused part of
the preallocated range. That in turn, will give people a workaround if they can't tolerate
the holes introduced by the discarded ranges. It may also be useful to tune the size of the
preallocated range depending on how many processors a machine has.

To let people configure the size of preallocated ranges, I propose that we introduce a new
family of Derby properties:

  derby.sequence.cache.size.$UUID=$number

  where

     $UUID is the uuid of a sequence or the uuid of a table with an identity column

     $number is a non-negative number

If this property is not specified, it defaults to a hardcoded number. Currently that number
is 5, but it could be 160 (see Knut's experiments). Maybe the default can be some function
of the number of processors on the machine (if we can figure that out).

The property will be retrieved by PropertyUtil.getServiceProperty() when the generator is
created. This will give it the following behaviors:

1) It can be set at the system, database, and derby.properties levels.

2) It is semi-static. That is, it won't change on the fly if you update the system or database
properties. However, if you change the property and then do something which throws away the
cache, then the new value of the property will be used when the system recreates the cache.
The cache is thrown away at database shutdown and when DDL is run.


> Concurrent inserts into table with identity column perform poorly
> -----------------------------------------------------------------
>
>                 Key: DERBY-4437
>                 URL: https://issues.apache.org/jira/browse/DERBY-4437
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 10.5.3.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Rick Hillegas
>         Attachments: D4437PerfTest.java, D4437PerfTest2.java, derby-4437-01-aj-allTestsPass.diff,
derby-4437-02-ac-alterTable-bulkImport-deferredInsert.diff, derby-4437-03-aa-upgradeTest.diff,
insertperf.png, insertperf2.png, prealloc.png
>
>
> I have a multi-threaded application which is very insert-intensive. I've noticed that
it sometimes can come into a state where it slows down considerably and basically becomes
single-threaded. This is especially harmful on modern multi-core machines since most of the
available resources are left idle.
> The problematic tables contain identity columns, and here's my understanding of what
happens:
> 1) Identity columns are generated from a counter that's stored in a row in SYS.SYSCOLUMNS.
During normal operation, the counter is maintained in a nested transaction within the transaction
that performs the insert. This allows the nested transaction to commit the changes to SYS.SYSCOLUMN
separately from the main transaction, and the exclusive lock that it needs to obtain on the
row holding the counter, can be releases after a relatively short time. Concurrent transactions
can therefore insert into the same table at the same time, without needing to wait for the
others to commit or abort.
> 2) However, if the nested transaction cannot lock the row in SYS.SYSCOLUMNS immediately,
it will give up and retry the operation in the main transaction. This prevents self-deadlocks
in the case where the main transaction already owns a lock on SYS.SYSCOLUMNS. Unfortunately,
this also increases the time the row is locked, since the exclusive lock cannot be released
until the main transaction commits. So as soon as there is one lock collision, the waiting
transaction changes to a locking mode that increases the chances of others having to wait,
which seems to result in all insert threads having to obtain the SYSCOLUMNS locks in the main
transaction. The end result is that only one of the insert threads can execute at any given
time as long as the application is in this state.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message