db-derby-dev mailing list archives

From "Rick Hillegas (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-5443) reduce number of times sequence updater does it work on user thread rather than nested user thread.
Date Tue, 06 Mar 2012 20:49:59 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223644#comment-13223644 ]

Rick Hillegas commented on DERBY-5443:

Here are some more thoughts about how to tackle this problem.

With the existing code, the problems are probably worse for identity values than for sequence
values. This is because of the different usage patterns for the catalogs which hold the values.
The corresponding catalogs are SYSCOLUMNS and SYSSEQUENCES.

The following operations acquire locks on SYSSEQUENCES:

1) Creating and dropping sequences.

2) Getting the next value out of a sequence.

3) Peeking at the next sequence value to be allocated. Note that because of pre-allocation,
this operation doesn't really work. Users are surprised by what comes back when they query

4) Flushing unused, pre-allocated values at orderly engine shutdown.
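
To illustrate why peeking (item 3) gives surprising answers, here is a minimal sketch of chunked pre-allocation. It is a model only, not Derby's actual code: the class name PreallocationPeek, its methods, and the chunk size of 5 are all assumptions made for the example.

```java
// Hypothetical model of chunked pre-allocation; not Derby's actual classes.
final class PreallocationPeek {
    private final int chunkSize;  // values reserved per catalog update (assumed)
    private long catalogValue;    // what a query of the catalog column would show
    private long nextToIssue;     // next value the in-memory generator hands out

    PreallocationPeek(long start, int chunkSize) {
        this.chunkSize = chunkSize;
        this.catalogValue = start;
        this.nextToIssue = start;
    }

    /** Returns the next value, reserving a new chunk once the old one is used up. */
    synchronized long next() {
        if (nextToIssue == catalogValue) {
            catalogValue += chunkSize; // one catalog write covers chunkSize values
        }
        return nextToIssue++;
    }

    /** The value a user would see by querying the catalog directly. */
    synchronized long catalogValue() {
        return catalogValue;
    }

    public static void main(String[] args) {
        PreallocationPeek seq = new PreallocationPeek(1, 5);
        long issued = seq.next();
        // The catalog already points past the whole chunk, not at the next value.
        System.out.println("issued=" + issued + " catalog=" + seq.catalogValue());
    }
}
```

In this model the catalog shows 6 after only the value 1 has been issued, which is the kind of mismatch users run into.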

The locking situation for SYSCOLUMNS is much more tricky. In addition to the use-cases above,
the DataDictionary reads SYSCOLUMNS via many unexpected paths. I don't have a comprehensive
list of the special cases when information you think should be cached really isn't, resulting
in ad hoc probes of SYSCOLUMNS. I'm sure this part of the DataDictionary could be improved.
On the other hand, SYSCOLUMNS is one of the 4 special core catalogs whose usage probably will
always be tricky.

Here is the outline of a proposal for solving the problem posed by this JIRA:

A) Create a special, invisible, indexed conglomerate like the one we use to store database
properties. By "invisible" I mean that users would not be able to query it directly. It would
be even more invisible than the properties conglomerate because there would be no way that
a user transaction could ever acquire a lock involving this conglomerate. Let's call this
special conglomerate INVISIBLE_CONGLOMERATE. Its tuples would have the following shape:

    ( UUID, Formatable )

B) At boot time, the DataDictionary would create a special transaction for all work on INVISIBLE_CONGLOMERATE.
This transaction would only be used for work on INVISIBLE_CONGLOMERATE. It would not be affected
by the user's isolation level or by user-initiated commits and rollbacks. Let's call this
transaction IC_TRAN.

C) The current values of sequences and identities would be maintained in INVISIBLE_CONGLOMERATE.

D) The SequenceGenerators would do all of their work against INVISIBLE_CONGLOMERATE, using
IC_TRAN. This includes initialization, pre-allocation, and flushing at shutdown. The pre-allocation
method would always commit IC_TRAN before returning.

E) We would introduce a new system function for returning the next value to be issued by a
SequenceGenerator. Note that this would be accurate only if you were the only user of the
SequenceGenerator. The signature of this function would be:

   bigint syscs_util.syscs_current_value( uuid char( 36 ) )

F) For backward compatibility, the SQL interpreter would replace references to
SYSSEQUENCES.CURRENTVALUE and SYSCOLUMNS.AUTOINCREMENTVALUE with references to
syscs_util.syscs_current_value( syssequences.sequenceid ) and
syscs_util.syscs_current_value( syscolumns.referenceid ).

G) ALTER TABLE would change in a way which is not backward compatible. To use ALTER TABLE
to change the current value of an identity column, your transaction would have to be in autocommit
mode. Behind the scenes, Derby would use IC_TRAN to update the value in INVISIBLE_CONGLOMERATE.
This backward compatibility problem seems minor to me but other people may disagree.

H) In the future, it is possible that we may think up other uses for INVISIBLE_CONGLOMERATE
and IC_TRAN, cases where Derby needs to make quick, durable changes not blockable by other
in-flight transactional work.
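
Here is a rough sketch of how (C), (D) and (E) might fit together. Everything in it is an assumption made for illustration: the class InvisibleStoreSketch, its method names, and the use of a Java monitor to stand in for IC_TRAN. Two in-memory maps stand in for INVISIBLE_CONGLOMERATE and the generator state; a real implementation would go through the store and the DataDictionary.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical stand-ins for the proposal's INVISIBLE_CONGLOMERATE and IC_TRAN.
final class InvisibleStoreSketch {
    // "Durable" state: UUID -> first value not yet pre-allocated. It is touched
    // only under this object's monitor, which plays the role of IC_TRAN: user
    // transactions never lock it, so they cannot block it.
    private final Map<UUID, Long> durable = new HashMap<>();
    // In-memory generator state: UUID -> next value to issue.
    private final Map<UUID, Long> nextToIssue = new HashMap<>();
    private final int chunkSize;

    InvisibleStoreSketch(int chunkSize) {
        this.chunkSize = chunkSize;
    }

    synchronized void create(UUID id, long start) {
        durable.put(id, start);
        nextToIssue.put(id, start);
    }

    /** Issues the next value; pre-allocation "commits" when the method returns. */
    synchronized long next(UUID id) {
        long next = nextToIssue.get(id);
        if (next == durable.get(id)) {
            durable.put(id, next + chunkSize);
        }
        nextToIssue.put(id, next + 1);
        return next;
    }

    /** Analogue of the proposed syscs_current_value: the next value to be issued. */
    synchronized long currentValue(String uuid) {
        return nextToIssue.get(UUID.fromString(uuid));
    }

    /** Orderly shutdown: give back the unused, pre-allocated values. */
    synchronized void flush(UUID id) {
        durable.put(id, nextToIssue.get(id));
    }

    /** Exposed only so the sketch can be inspected. */
    synchronized long durableValue(UUID id) {
        return durable.get(id);
    }
}
```

As noted in (E), the value returned by currentValue is only reliable when no other session is drawing from the same generator between the peek and the next call.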

I would be interested in people's reactions to this proposal:

i) Are there holes in it which I am not seeing?

ii) Are we comfortable with the backward compatibility problem described above in (G)?


> reduce number of times sequence updater does it work on user thread rather than nested user thread.
> ---------------------------------------------------------------------------------------------------
>                 Key: DERBY-5443
>                 URL: https://issues.apache.org/jira/browse/DERBY-5443
>             Project: Derby
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions:
>            Reporter: Mike Matrigali
>            Priority: Minor
> Currently the Sequence updater tries to do the system catalog update as part of the user
> thread, but in a nested user transaction.  When this works all is well, as the nested user
> transaction is immediately committed and thus the throughput of all threads depending on
> allocating sequences is optimized.
> In order to be able to commit the nested writable transaction independently, the lock
> manager must treat the parent and nested transactions as two independent transactions, and
> locks held by the parent will thus block the child.  In effect, any lock that is blocked by
> the parent is a deadlock, but the lock manager does not understand this relationship and
> thus will only time out and not recognize the implicit deadlock.
> Only 2 cases come to mind of the parent blocking the child in this manner for sequences:
> 1) ddl like create done in transaction followed by inserts into the table requiring sequence
> 2) users doing jdbc data dictionary lookups in a multistatement transaction, resulting in
>     holding locks on the system catalog rows, and subsequently doing inserts into the table
>     requiring sequence updates.
> The sequence updater currently never waits for a lock in the nested transaction and assumes
> any blocked lock is this parent deadlock case.  It then falls back on doing the update in
> the user transaction, and the system catalog lock then remains until the user transaction
> commits, which could hold hostage all other inserts into the table.  This is ok in the
> above 2 cases as there is not any other choice since the user transaction is already
> holding the system hostage.
> The problem is the case where it was not a deadlock but just another thread trying to do
> the sequence update.  In this case the thread should not be getting locks on the user thread.
> I am not sure of the best way to address this problem, but here are some ideas:
> 1) enhance lock manager to recognize the deadlock and then change the code to somehow do
>     an immediate deadlock check for internal nested transactions, no matter what the system
>     default is.  Then the code should go ahead and use the system wait timeout on this lock
>     and only fall over to using user transaction for deadlock (or maybe even throw a new
>     "self deadlock" error that would only be possible for internal transactions).
> 2) somehow execute the internal system catalog update as part of a whole different
>     transaction in the system.  Would need a separate context.  Sort of like the background
>     daemon threads.  Then no self deadlock is possible and it could just go ahead and wait.
>     The downside is that then the code to "wait" for a new sequence becomes more complicated
>     as it has to wait for an event from another thread.  But seems like it could be designed
>     with locks/synchronization blocks somehow.
> 3) maybe add another lock synchronization that would only involve threads updating the
>     sequences.  So first an updater would request the sequence updater lock (with a key
>     specific to the table and a new type) and it could just wait on it.  It should never be
>     held by parent transaction.  Then it would still need the catalog row lock to do the
>     update.  I think with proper ordering this would ensure that blocking on the catalog row
>     lock would only happen in the self deadlock case.
> Overall this problem is less important as the size of the chunk of sequence is tuned
> properly for the application, and ultimately best if derby autotuned the chunk.  There is
> a separate jira for auto tuning: DERBY-5295
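
The fallback behavior described in the issue can be modeled with a small sketch. This is not Derby's lock manager: NestedUpdateSketch, its lock table, and the transaction ids are all hypothetical. It only shows why a no-wait request from the nested transaction cannot tell a self-deadlock apart from ordinary contention.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the fallback described in the issue; names are illustrative only.
final class NestedUpdateSketch {
    enum Path { NESTED_COMMITTED, FELL_BACK_TO_USER_TXN }

    // Row locks keyed by catalog row name; the value is the owning transaction id.
    private final Map<String, String> rowLocks = new HashMap<>();

    /** No-wait lock request: succeeds only if the row is free or already owned by txn. */
    synchronized boolean tryLock(String row, String txn) {
        String owner = rowLocks.putIfAbsent(row, txn);
        return owner == null || owner.equals(txn);
    }

    /** Releases the row lock if, and only if, txn is the current owner. */
    synchronized void release(String row, String txn) {
        rowLocks.remove(row, txn);
    }

    /**
     * Tries the catalog update in a nested transaction without waiting.
     * Parent and nested transactions are distinct lock owners, so a lock held
     * by the parent blocks the child just like a lock held by a stranger.
     */
    Path updateSequence(String row, String parentTxn) {
        String nestedTxn = parentTxn + "/nested";
        if (tryLock(row, nestedTxn)) {
            release(row, nestedTxn); // nested transaction commits immediately
            return Path.NESTED_COMMITTED;
        }
        // The updater cannot tell a self-deadlock from ordinary contention,
        // so it falls back to the user transaction in both cases.
        return Path.FELL_BACK_TO_USER_TXN;
    }
}
```

In this model the updater takes the fallback path both when the parent transaction holds the row lock and when an unrelated transaction does; the second case is the one this issue is about.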
