db-derby-dev mailing list archives

From "Kristian Waagan (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-3769) Make LOBStoredProcedure on the server side smarter about the read buffer size
Date Tue, 07 Oct 2008 12:15:44 GMT

     [ https://issues.apache.org/jira/browse/DERBY-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Kristian Waagan updated DERBY-3769:

    Attachment: derby-3769-2a-clob_buffer_size_adjustment.diff

Patch 2a adjusts the maximum return size, in characters, for the CLOB stored procedure to 10890
(DB2_VARCHAR_MAXWIDTH / 3). As a result, anything from 10890 to 10890*3 bytes may be returned
to the client in one round-trip, depending on the bytes-per-char ratio (determined by the
modified UTF-8 encoding).
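
To make the arithmetic concrete, here is a minimal sketch of the byte range for one full chunk. It is not taken from the patch: the class and method names are made up for illustration, and the constant simply repeats the 32672 limit quoted in this thread. The per-char byte counts follow the modified UTF-8 rules documented for java.io.DataInput.

```java
// Hypothetical helper, not part of Derby or of patch 2a.
final class ClobChunkSize {
    // Limit quoted in this thread for Derby's VARCHAR/VARBINARY values.
    static final int DB2_VARCHAR_MAXWIDTH = 32672;
    // Character cap per call as described for patch 2a (integer division).
    static final int MAX_CHARS_PER_CALL = DB2_VARCHAR_MAXWIDTH / 3;

    /** Bytes one Java char occupies in modified UTF-8. */
    static int modifiedUtf8Length(char c) {
        if (c >= 0x0001 && c <= 0x007F) {
            return 1;               // plain ASCII, except NUL
        }
        if (c <= 0x07FF) {
            return 2;               // U+0080..U+07FF, and U+0000 (two bytes in modified UTF-8)
        }
        return 3;                   // everything else, including CJK and surrogate halves
    }

    /** Total modified UTF-8 byte count for a character sequence. */
    static long encodedLength(CharSequence s) {
        long total = 0;
        for (int i = 0; i < s.length(); i++) {
            total += modifiedUtf8Length(s.charAt(i));
        }
        return total;
    }

    public static void main(String[] args) {
        StringBuilder ascii = new StringBuilder();
        StringBuilder cjk = new StringBuilder();
        for (int i = 0; i < MAX_CHARS_PER_CALL; i++) {
            ascii.append('a');      // 1 byte per char
            cjk.append('\u4e2d');   // a CJK char, 3 bytes per char
        }
        System.out.println("chars per call:     " + MAX_CHARS_PER_CALL);
        System.out.println("bytes, ASCII chunk: " + encodedLength(ascii));
        System.out.println("bytes, CJK chunk:   " + encodedLength(cjk));
    }
}
```

Dividing by 3 means that even a chunk made up entirely of 3-byte chars stays within the 32672 limit, at the cost of using only about a third of the available space for ASCII data.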

Even though this fix isn't optimal, the advantages outweigh the disadvantages in my opinion.
I did a simple test, where I used a 32K buffer size in the client code to retrieve a CLOB
32M chars long, consisting of CJK chars (3 bytes per char).
With the fix it took around 17 seconds; without it, it took almost 3400 seconds! In both cases
a patch for DERBY-3825 was applied.
I also did a test with a 32MB CLOB containing ASCII characters, where I saw a performance
reduction of around 3% (the test was run on a LAN; the reduction will increase on
higher-latency networks).

If you want to test performance yourself, you must first apply the patch for DERBY-3825 (2a).
The problems are described under DERBY-3766.

Patch ready for review.

> Make LOBStoredProcedure on the server side smarter about the read buffer size
> -----------------------------------------------------------------------------
>                 Key: DERBY-3769
>                 URL: https://issues.apache.org/jira/browse/DERBY-3769
>             Project: Derby
>          Issue Type: Improvement
>          Components: Network Server
>    Affects Versions:,,
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>             Fix For:,
>         Attachments: derby-3769-1a-buffer_size_adjustment.diff, derby-3769-1b-buffer_size_adjustment.diff,
> Derby has a max length for VARBINARY and VARCHAR, which is 32'672 bytes or characters.
> When working with LOBs represented by locators, using a read buffer larger than the max
> value causes the server to process far more data than necessary.
> Say the read buffer is 33'000 bytes, and these bytes are requested by the client. This
> request ends up in LOBStoredProcedure.BLOBGETBYTES.
> Assume the stream position is 64'000, and this is where we want to read from. The following
> happens:
>  a) BLOBGETBYTES instructs EmbedBlob to read 33'000 bytes, advancing the stream position
>     to 97'000.
>  b) Derby fetches/receives the 33'000 bytes, but can only send 32'672. The rest of the
>     data (328 bytes) is discarded.
>  c) The client receives the 32'672 bytes, recalculates the position and length arguments
>     and sends another request.
>  d) BLOBGETBYTES(locator, 96672, 328) is executed. EmbedBlob detects that the stream
>     position has advanced too far, so it resets the stream to position zero and skips/reads
>     until position 96'672 has been reached.
>  e) The remaining 328 bytes are sent to the client.
> This issue deals with points b) and d), by avoiding the need to reset the stream.
> Points a) and e) are also problematic if a large number of bytes are going to be read,
> say hundreds of megabytes, but that's another issue.
> It is unfortunate that using 32 K (32 * 1024) as the buffer size is almost the worst
> case; 32'768 - 32'672 = 96 bytes.
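
The sequence a) to e) can be played through with a toy model. This is only a sketch: it is not Derby code, the class and its fields are made up, and it uses the simplified position arithmetic from the description above. The capRead flag is an assumption about one way to avoid the reset, namely never reading more from the stream than can be sent in a single reply.

```java
// Toy model of a forward-only LOB stream on the server side (hypothetical, not Derby code).
final class LobReadSimulation {
    // Maximum number of bytes that can be returned per call, as quoted above.
    static final int MAX_RETURN = 32672;

    long streamPos;       // current position of the forward-only stream
    int resets;           // how often the stream was reset to position zero
    long bytesProcessed;  // bytes read or skipped by the server in total

    /** Simulates one BLOBGETBYTES-like call and returns the number of bytes sent. */
    int getBytes(long pos, int length, boolean capRead) {
        int toRead = capRead ? Math.min(length, MAX_RETURN) : length;
        if (pos < streamPos) {
            // Stream has advanced too far: start over from position zero (step d).
            resets++;
            streamPos = 0;
        }
        bytesProcessed += pos - streamPos;  // skip forward to the requested position
        streamPos = pos;
        bytesProcessed += toRead;           // read the data (step a)
        streamPos += toRead;
        return Math.min(toRead, MAX_RETURN); // anything beyond the limit is discarded (step b)
    }

    /** Simulates a client filling a buffer of bufferSize bytes starting at startPos. */
    static LobReadSimulation run(long startPos, int bufferSize, boolean capRead) {
        LobReadSimulation sim = new LobReadSimulation();
        sim.streamPos = startPos;           // the stream is already positioned here
        long pos = startPos;
        int remaining = bufferSize;
        while (remaining > 0) {
            int sent = sim.getBytes(pos, remaining, capRead);
            pos += sent;                    // client recalculates position and length (step c)
            remaining -= sent;
        }
        return sim;
    }

    public static void main(String[] args) {
        LobReadSimulation uncapped = run(64000, 33000, false);
        LobReadSimulation capped = run(64000, 33000, true);
        System.out.println("uncapped: resets=" + uncapped.resets
                + ", bytes processed=" + uncapped.bytesProcessed);
        System.out.println("capped:   resets=" + capped.resets
                + ", bytes processed=" + capped.bytesProcessed);
    }
}
```

In this model the uncapped run needs one reset and processes 130'000 bytes to deliver 33'000, while the capped run needs no reset and processes exactly 33'000 bytes.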

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
