db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2991) Index split deadlock
Date Thu, 29 Jan 2009 16:48:59 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12668514#action_12668514 ]

Knut Anders Hatlen commented on DERBY-2991:
-------------------------------------------

I've tried to find out what effect the proposed solution has on
nested loop joins. The earlier assumption was that nested loop joins
would turn off bulk fetching, so that index scans with
derby.language.bulkFetchDefault=1 would give a good impression of the
overhead imposed by the patch.
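
(For reference, setting up such a no-bulk scan looks roughly like the
sketch below. The property name is the one mentioned above, but the
database name, table and query are made up and not the ones used by the
actual tests.)

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class NoBulkFetchScan {
        public static void main(String[] args) throws Exception {
            // Derby picks up derby.* settings from the JVM system properties
            // (they could equally well go into derby.properties).
            System.setProperty("derby.language.bulkFetchDefault", "1");
            Class.forName("org.apache.derby.jdbc.EmbeddedDriver");

            Connection conn = DriverManager.getConnection("jdbc:derby:perfdb");
            Statement stmt = conn.createStatement();
            // With bulkFetchDefault=1 and the patch applied, the scan saves
            // its position for every qualifying row instead of every 16th.
            ResultSet rs = stmt.executeQuery(
                    "SELECT ID FROM T WHERE ID > 0 AND ID < 100000");
            while (rs.next()) {
                rs.getInt(1);
            }
            rs.close();
            stmt.close();
            conn.close();
        }
    }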

This turns out not to be the case. Nested loop joins do in fact use
the default bulk fetch size (16) for accesses to both the outer table
and the inner table. The inner table/index is accessed through many
short scans that frequently return fewer rows than the bulk fetch
size. But even though that causes frequent release of latches, it
doesn't cause copying of the current key, since the scan is complete
by the time the latch is released and the position is therefore no
longer needed. So nested loop joins will actually get much of the
performance benefit observed for single-record primary key selects
when the preview-1c patch was uploaded.

I inserted some optimizer overrides in the index_join test in
org.apache.derbyTesting.perf.clients.Runner to make the optimizer pick
a nested loop join. With the modified test, the throughput appeared to
increase by about 6% when the patch was applied (average of 30 runs
with each configuration).
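
The overrides were along these lines (a sketch only; the table and
column names below are invented and not the ones used by the index_join
test):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    public class NestedLoopOverride {
        // Fix the join order and force a nested loop join with T2 as the
        // inner table through Derby's optimizer overrides.
        static void runJoin(Connection conn) throws SQLException {
            String sql =
                "SELECT T1.ID, T2.VAL\n"
              + "FROM --DERBY-PROPERTIES joinOrder=FIXED \n"
              + "     T1, T2 --DERBY-PROPERTIES joinStrategy=NESTEDLOOP \n"
              + "WHERE T2.T1_ID = T1.ID";
            PreparedStatement ps = conn.prepareStatement(sql);
            ResultSet rs = ps.executeQuery();
            while (rs.next()) {
                // drain the join result so the inner index is scanned
                // once per qualifying outer row
            }
            rs.close();
            ps.close();
        }
    }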

So here's what the performance impact looks like to me:

a) Short index scans get increased performance because they no longer
need to obtain the scan lock, and because they don't need to save the
position since they complete before releasing the latch.

b) Long index scans aren't much affected by the patch (there's extra
cost in saving keys, but that only happens for every 16th row, and
there's a per-page reduction in cost from no longer obtaining the scan
lock)

c) Nested loop joins behave like (b) when scanning the outer table and
like (a) when scanning the inner table, so they should not see any
negative impact

d) Long index scans without bulk fetching get lower performance
because they save the position for every qualifying row in the index

So the only case identified so far that gets lower performance is
(d), long index scans without bulk fetching. This kind of scan may be
used by updatable cursors, but I'm not aware of any other kinds of
queries that would disable bulk fetching (except when users set
derby.language.bulkFetchDefault, but I don't think that's a very
important case). The overhead appears to be in the range 1%-10%,
depending on the key.
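
For illustration, the kind of statement I have in mind for (d) is an
updatable cursor like the one sketched below (the table and column names
are made up). Since the result set is updatable, the scan runs without
bulk fetching and so falls into case (d):

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class UpdatableCursorScan {
        // An updatable cursor of the kind referred to above; bulk fetching
        // is disabled for it, so the scan falls into case (d).
        static void bumpValues(Connection conn) throws SQLException {
            Statement stmt = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE);
            ResultSet rs = stmt.executeQuery("SELECT ID, VAL FROM T FOR UPDATE");
            while (rs.next()) {
                rs.updateInt("VAL", rs.getInt("VAL") + 1);
                rs.updateRow();
            }
            rs.close();
            stmt.close();
        }
    }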

Unless there are other common cases that I haven't thought of, it
sounds like an acceptable overhead to me, given that

a) updatable cursors aren't all that common, and if they are actually
used to update the database, the extra CPU spent by the scan will be
negligible

b) shorter scans see a performance increase of the same order as the
decrease seen by longer no-bulk scans

c) concurrent reads and writes don't deadlock anymore unless there's a
real lock conflict

Comments?

> Index split deadlock
> --------------------
>
>                 Key: DERBY-2991
>                 URL: https://issues.apache.org/jira/browse/DERBY-2991
>             Project: Derby
>          Issue Type: Bug
>          Components: Store
>    Affects Versions: 10.2.2.0, 10.3.1.4
>         Environment: Windows XP, Java 6
>            Reporter: Bogdan Calmac
>            Assignee: Knut Anders Hatlen
>         Attachments: d2991-preview-1a.diff, d2991-preview-1a.stat, d2991-preview-1b.diff,
d2991-preview-1b.stat, d2991-preview-1c.diff, d2991-preview-1c.stat, d2991-preview-1d.diff,
d2991-preview-1d.stat, d2991-preview-1e.diff, derby.log, InsertSelectDeadlock.java, perftest.diff,
Repro2991.java, stacktraces_during_deadlock.txt
>
>
> After doing some research on the mailing list, it appears that the index split deadlock
is a known behaviour, so I will start by describing the theoretical problem first and then
follow with the details of my test case.
> If you have concurrent select and insert transactions on the same table, the observed
locking behaviour is as follows:
>  - the select transaction acquires an S lock on the root block of the index and then
waits for an S lock on some uncommitted row of the insert transaction
>  - the insert transaction acquires X locks on the inserted records and, if it needs to
do an index split, creates a sub-transaction that tries to acquire an X lock on the root block
of the index
> In summary: INDEX LOCK followed by ROW LOCK + ROW LOCK followed by INDEX LOCK = deadlock
> In the case of my project this is an important issue (lack of concurrency after being
forced to use table level locking) and I would like to contribute to the project and fix this
issue (if possible). I was wondering if someone who knows the code could give me a few pointers
on the implications of this issue:
>  - Is this a limitation of the top-down algorithm used?
>  - Would fixing it require using a bottom-up algorithm for better concurrency (which
is certainly non-trivial)?
>  - Trying to break the circular locking above, I would first question why the select
transaction needs to acquire (and hold) a lock on the root block of the index. Would it be
possible to ensure the consistency of the select without locking the index?
> -----
> The attached test (InsertSelectDeadlock.java) tries to simulate a typical data collection
application. It consists of:
>  - an insert thread that inserts records in batch 
>  - a select thread that 'processes' the records inserted by the other thread: 'select
* from table where id > ?' 
> The derby log provides detail about the deadlock trace and stacktraces_during_deadlock.txt
shows that the insert thread is doing an index split.
> The test was run on 10.2.2.0 and 10.3.1.4 with identical behaviour.
> Thanks,
> Bogdan Calmac.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

