db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: [jira] Updated: (DERBY-733) Starvation in RAFContainer.readPage()
Date Wed, 14 Dec 2005 21:47:51 GMT
I have reviewed and committed the following patch.  I have the following
comments which weren't enough to hold up this patch:

1) Will you at some point be contributing the performance tests which
    you used to verify this fix, and the original test which showed the
    problem?  I know this is a hard problem, just wondering if you are
    working on solving it.
2) minor nit - could you try to keep lines under 80 chars.
3) I was a little surprised that catch blocks did not do anything
    on a released system.  Then thinking about it, it seemed that ok
    since the added code is not actually using the lock mechanism to
    protect anything, but instead just to schedule the enclosed work
    more fairly.  In fact in jdk1.4 systems the code has to work
    without the lock calls at all.  I have not idea what kind of
    exceptions might happen.  The only case that really concerned me
    is if lock worked but unlock failed - you may want to think about
    that case and see if you should raise an exception.

    May be useful to add comments why no exception handling is necessary.

Knut Anders Hatlen (JIRA) wrote:
>      [ http://issues.apache.org/jira/browse/DERBY-733?page=all ]
> 
> Knut Anders Hatlen updated DERBY-733:
> -------------------------------------
> 
>     Attachment: DERBY-733.diff
> 
> Mike, I agree that a pool of open file descriptors is a good idea, but
> you will run into the same problem with highly unstable response times
> when the number of threads accessing the same file exceeds the number
> of file descriptors. I think we should use ReentrantLock for Java 1.5
> and higher, since the introduction of this class has allowed the
> implementers of JVMs to prioritize throughput over fairness in the
> handling of monitors. We should file a separate enhancement request
> for the file descriptor pool.
> 
> I have attached a patch which invokes ReentrantLock.lock() and
> unlock() when reading in a page from disk. I did not build my own
> ReentrantLock replacement, as I said I would. Instead, I have used
> reflection to enable this feature if the JVM supports it. This seemed
> like an easier approach, and I also discovered that the handling of
> threads waiting for monitors had changed between 1.4 and 1.5 and that
> this issue was not so serious on 1.4.
> 
> The maximum response time was drastically reduced in the disk-bound
> case. Derbyall ran successfully on both Sun JVM 1.4.2 and 1.5.0. I
> have also tested the performance, and I could not see any change in
> throughput or CPU usage. (The performance test was run with a very
> small page cache and with a database that was many times bigger than
> the page cache, but smaller than the file system cache. This way,
> Derby called readPage() very often, but it was CPU-bound since the
> requested page always was in the file system cache.)
> 
> Could someone please review this patch?
> 
> % svn stat
> M      java/engine/org/apache/derby/impl/store/raw/data/RAFContainer.java
> 
> 
>>Starvation in RAFContainer.readPage()
>>-------------------------------------
>>
>>         Key: DERBY-733
>>         URL: http://issues.apache.org/jira/browse/DERBY-733
>>     Project: Derby
>>        Type: Improvement
>>  Components: Performance, Store
>>    Versions: 10.1.2.1, 10.2.0.0, 10.1.3.0, 10.1.2.2
>> Environment: Solaris x86 and Linux with Sun JVM 1.5.0. Derby embedded and client/server.
>>    Reporter: Knut Anders Hatlen
>>    Assignee: Knut Anders Hatlen
>> Attachments: DERBY-733.diff
>>
>>When Derby is completely disk bound, threads might be starved in
>>RAFContainer.readPage(). This is a real problem when multiple clients
>>are repeatedly accessing one or a small number of large tables. In
>>cases like this, I have observed very high maximum response times
>>(several minutes in the worst cases) on simple transactions. The
>>average response time is not affected by this.
>>The starvation is caused by a synchronized block in
>>RAFContainer.readPage():
>>  synchronized (this) {
>>      fileData.seek(pageOffset);
>>      fileData.readFully(pageData, 0, pageSize);
>>  }
>>If many threads want to read pages from the same file, there will be a
>>long queue of threads waiting for this monitor. Since the Java
>>specification does not guarantee that threads waiting for monitors are
>>treated fairly, some threads might have to wait for a long time before
>>they get the monitor. (Usually, a couple of threads get full throughput
>>while the others have to wait.)
> 
> 


Mime
View raw message