db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-2911) Implement a buffer manager using java.util.concurrent classes
Date Tue, 06 Nov 2007 15:49:50 GMT

    [ https://issues.apache.org/jira/browse/DERBY-2911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12540462 ]

Knut Anders Hatlen commented on DERBY-2911:
-------------------------------------------

Now I have tested the buffer manager and background cleaner on a larger
database. I used the test client from DERBY-1961 and turned up the
table size so that the table took ~14GB and the index ~2GB. I ran the
test on a dual-CPU AMD Opteron with 2 GB RAM, two SCSI disks (one for
log and one for data) with the write cache disabled. 

With single-record select operations (random primary key lookups)
there was no difference between the two buffer managers, and there was
no difference between running with or without the background
writer. Both observations were as expected (there should be no
difference between the two buffer managers since the performance was
disk bound and they use the same replacement algorithm, and the
background cleaner shouldn't have any effect on read-only load). The
tests were repeated with page cache sizes of 1000 (4MB), 10000 (40MB)
and 100000 (400MB) pages, with 1 to 100 concurrent connections.

Running with a single-record update load (random primary key lookup +
update of a string column that is not part of the primary key), there
was no observable effect of the background cleaner either. However,
when comparing the two buffer managers against each other, it seemed
that Clock consistently had 3-5% higher throughput than
ConcurrentCache when the number of concurrent connections exceeded
5-10. These results were seen both when the page cache size was 4MB
and when it was 40MB. When it was 400MB, there was no observable
difference.

So for some reason, it seems that the old buffer manager works better
when there's a combination of a high eviction rate and many dirty pages
(and the db working set is so large that I/O operations normally go to
disk rather than to the FS buffer cache). I found that observation a little
strange, since the performance for this kind of load should be almost
exclusively dependent on the page replacement algorithm, which is
supposed to be identical in the two buffer managers. The only
difference I know of is the one I mentioned in an earlier comment:

> 2) When the clock is rotated and the hand sweeps over an unkept, not
> recently used and dirty object, the global synchronization is
> dropped and the object is cleaned. After cleaning the object, the
> global synchronization is re-obtained, but then some other thread
> may have moved the clock hand past the cleaned object while the
> first thread didn't hold the global synchronization lock. In that
> case, the first thread has cleaned an object but is not allowed to
> reuse its entry. Perhaps it's just a theoretical problem, but this
> could mean that some threads have to write multiple dirty pages to
> disk before they are allowed to insert a new one into the cache.
>
> Since the new buffer manager uses synchronization with finer
> granularity, it should avoid this problem by keeping the
> synchronization lock while the object is being cleaned. Then it
> knows that the entry can be reused as soon as the object has been
> cleaned.
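To make the difference between the two designs concrete, here is a minimal, hypothetical sketch of a clock sweep where a per-entry lock is held across the clean, so the cleaned entry can be reused immediately. The class and field names (ClockSketch, Entry, rotateClock) are illustrative only and are not Derby's actual classes; the global-lock/hand-race of the old design is omitted for brevity.

```java
import java.util.concurrent.locks.ReentrantLock;

// Simplified, hypothetical sketch of the clock rotation described above.
class ClockSketch {
    static class Entry {
        boolean recentlyUsed;
        boolean dirty;
        final ReentrantLock lock = new ReentrantLock(); // per-entry lock (new design)
        Entry(boolean recentlyUsed, boolean dirty) {
            this.recentlyUsed = recentlyUsed;
            this.dirty = dirty;
        }
    }

    final Entry[] entries;
    int hand; // clock hand position

    ClockSketch(Entry[] entries) { this.entries = entries; }

    // Sweep the hand until an evictable entry is found: recently used
    // entries get a second chance, dirty entries are cleaned first.
    Entry rotateClock() {
        while (true) {
            Entry e = entries[hand];
            hand = (hand + 1) % entries.length;
            if (e.recentlyUsed) {
                e.recentlyUsed = false; // second chance
                continue;
            }
            e.lock.lock();
            try {
                if (e.dirty) {
                    e.dirty = false; // stands in for writing the page to disk
                }
                // Because the per-entry lock is held across the clean, the
                // thread knows the entry can be reused as soon as the write
                // finishes; the hand cannot have invalidated its claim.
                return e;
            } finally {
                e.lock.unlock();
            }
        }
    }
}
```

In the old design, the global lock is released during the write, so by the time it is re-acquired the hand may already have passed the cleaned entry and the thread must keep searching.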

So it seems we are hitting the behaviour described above. Because the
writes are likely to go to disk (not only to the file system cache), it
is very likely that the clock hand is moved past the page being
written, so that the cleaned page is not evicted, and the thread that
cleaned it has to keep searching for an evictable page.

To verify that this was the cause of the difference in performance, I
changed ConcurrentCache/ClockPolicy so that it would always continue
the search after it had cleaned a page (that is, the user thread would
clean the page and wait until it had been cleaned, but it would not
evict the page). With this change, the performance of ConcurrentCache
matched Clock in all the above-mentioned tests, with no observable
difference. Intuitively, I would have expected the change to make the
user threads perform unnecessary work (cleaning pages that they
wouldn't use, at least not until the clock hand had rotated another
full round). However, it seems that some kind of pipelining effect
makes it more efficient.
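The modified search can be sketched in a few lines. This is a hypothetical simplification (a bare dirty-bit array instead of Derby's cache entries) of the change described above: a dirty page is written out but the sweep moves on, and only a page that was already clean is evicted.

```java
// Hypothetical sketch of the "clean but keep searching" modification.
// The array and index are illustrative, not Derby code.
class ContinueAfterCleanSketch {
    final boolean[] dirty; // dirty bit per cached page
    int hand;              // clock hand position

    ContinueAfterCleanSketch(boolean[] dirty) { this.dirty = dirty; }

    // Returns the index of the page chosen for eviction. Dirty pages
    // encountered along the way are cleaned but not evicted, so a later
    // sweep (by this or another thread) finds them ready for reuse.
    int findEvictable() {
        while (true) {
            int i = hand;
            hand = (hand + 1) % dirty.length;
            if (dirty[i]) {
                dirty[i] = false; // stands in for writing the page to disk
                continue;         // keep searching instead of evicting it
            }
            return i;             // already-clean page: evict this one
        }
    }
}
```

The "unnecessary" writes are not wasted: they leave a trail of clean pages behind the hand for other threads to evict cheaply, which may explain the pipelining effect.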

I found this observation quite interesting. It's kind of like the user
threads are working as background cleaners for each other. I'll do
some more experimenting and see if I can trick the background cleaner
into doing the same work. I was thinking that instead of forcing the
user thread to wait for the page to be cleaned before continuing, it
could delegate the cleaning to the background cleaner, since it won't
use the page anyway and shouldn't have to wait.
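One way this delegation could look, sketched with java.util.concurrent primitives, is a bounded queue between user threads and the background cleaner. All names here (DelegationSketch, delegateClean) are hypothetical, not taken from the attached patches; the real patch would have to handle queue overflow and page latching.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: a user thread hands a dirty page to the
// background cleaner and keeps searching, instead of blocking on I/O.
class DelegationSketch {
    final BlockingQueue<Integer> toClean = new ArrayBlockingQueue<>(16);

    // User thread: offer the page to the cleaner without blocking.
    // A false return (queue full) would mean falling back to cleaning
    // the page synchronously, as before.
    boolean delegateClean(int pageId) {
        return toClean.offer(pageId);
    }

    // Background cleaner: drain the queue and write the pages out.
    Runnable cleaner(List<Integer> written) {
        return () -> {
            Integer pageId;
            while ((pageId = toClean.poll()) != null) {
                written.add(pageId); // stands in for writing the page to disk
            }
        };
    }
}
```

The user thread's sweep then never stalls on a write as long as the queue has room, which matches the goal of not making it wait for a page it won't use.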

> Implement a buffer manager using java.util.concurrent classes
> -------------------------------------------------------------
>
>                 Key: DERBY-2911
>                 URL: https://issues.apache.org/jira/browse/DERBY-2911
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance, Services
>    Affects Versions: 10.4.0.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>            Priority: Minor
>         Attachments: d2911-1.diff, d2911-1.stat, d2911-2.diff, d2911-3.diff, d2911-4.diff,
d2911-5.diff, d2911-6.diff, d2911-6.stat, d2911-7.diff, d2911-7a.diff, d2911-entry-javadoc.diff,
d2911-unused.diff, d2911-unused.stat, d2911perf.java, perftest6.pdf
>
>
> There are indications that the buffer manager is a bottleneck for some types of multi-user
load. For instance, Anders Morken wrote this in a comment on DERBY-1704: "With a separate
table and index for each thread (to remove latch contention and lock waits from the equation)
we (...) found that org.apache.derby.impl.services.cache.Clock.find()/release() caused about
5 times more contention than the synchronization in LockSet.lockObject() and LockSet.unlock().
That might be an indicator of where to apply the next push".
> It would be interesting to see the scalability and performance of a buffer manager which
exploits the concurrency utilities added in Java SE 5.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

