db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-5752) LOBStreamControl should materialize less aggressively
Date Wed, 09 May 2012 16:47:49 GMT

     [ https://issues.apache.org/jira/browse/DERBY-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-5752:

    Attachment: buffsize.diff

I experimented with the attached buffsize.diff patch (not for commit), which simply sets the
buffer size to LOBStreamControl.DEFAULT_MAX_BUF_SIZE (4K) regardless of the size of the LOB.
With that patch, the heap requirement for TriggerTest, which works on some large LOBs that
get materialized, went down from 110MB to 85MB. (I ran this experiment on sources that had
already been patched for DERBY-5751, which had reduced the heap requirement from 140MB to
110MB.) suites.All also passed with these changes.

This is not the approach I'm planning to take in the final patch. I intend to make the buffer
size dynamic based on the LOB size, as it is today, but have a maximum size of 32KB.
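
A minimal sketch of what such a capped, size-dependent buffer calculation could look like. The
class name, the method name and the MAX_BUF_SIZE_CAP constant are made up for illustration, and
using the 4K default as a lower bound is an assumption; this is not the actual patch.

```java
class BufferSizeSketch {
    // 4 KB, the value this message gives for LOBStreamControl.DEFAULT_MAX_BUF_SIZE.
    static final int DEFAULT_MAX_BUF_SIZE = 4 * 1024;

    // Hypothetical name for the proposed 32 KB upper bound.
    static final int MAX_BUF_SIZE_CAP = 32 * 1024;

    // The buffer grows with the LOB, but never beyond the cap.
    // Assumption: buffers are never smaller than the 4 KB default.
    static int bufferSizeFor(int lobLength) {
        int atLeastDefault = Math.max(lobLength, DEFAULT_MAX_BUF_SIZE);
        return Math.min(atLeastDefault, MAX_BUF_SIZE_CAP);
    }
}
```

With these assumptions, a 10KB LOB would get a 10KB buffer, while a 1MB LOB would get a 32KB
buffer and the rest would have to live somewhere other than the in-memory buffer.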

Since the constructor in question is only used when the underlying LOB value is materialized
before the EmbedBlob is instantiated, and LOBs larger than 32KB are typically not materialized
at that point, I think the suggested approach would only affect the cases where

  - the LOB is used in a trigger

  - getBytes() has been called on the column before getBlob() (this case was in fact something
we considered disallowing in DERBY-5489)

If the suggested approach is implemented, a LOB larger than 32KB will in those two cases make
the LOBStreamControl instance overflow to temporary files instead of buffering the entire LOB
in memory.
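
To illustrate the overflow behaviour described above, here is a rough, self-contained sketch of
a buffer that keeps data in memory up to a limit and moves it to a temporary file once the limit
is exceeded. SpillingBufferSketch and all of its members are hypothetical names; this is not
LOBStreamControl's code, only the general technique.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

class SpillingBufferSketch implements AutoCloseable {
    private final int maxInMemory;          // e.g. 32 * 1024
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private File tempFile;                  // created only on overflow
    private RandomAccessFile spill;
    private long length;

    SpillingBufferSketch(int maxInMemory) {
        this.maxInMemory = maxInMemory;
    }

    void write(byte[] data) throws IOException {
        // First write that would exceed the limit: move everything to disk.
        if (spill == null && memory.size() + data.length > maxInMemory) {
            tempFile = File.createTempFile("lob", ".tmp");
            tempFile.deleteOnExit();
            spill = new RandomAccessFile(tempFile, "rw");
            spill.write(memory.toByteArray());
            memory = null;                  // release the in-memory copy
        }
        if (spill != null) {
            spill.write(data);
        } else {
            memory.write(data, 0, data.length);
        }
        length += data.length;
    }

    boolean isOnDisk() {
        return spill != null;
    }

    long length() {
        return length;
    }

    @Override
    public void close() throws IOException {
        if (spill != null) {
            spill.close();
            tempFile.delete();
        }
    }
}
```

With a 32KB limit, a small LOB never touches the file system, whereas a large LOB ends up in a
temporary file and only the bytes being written at the moment are held on the heap.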

> LOBStreamControl should materialize less aggressively
> -----------------------------------------------------
>                 Key: DERBY-5752
>                 URL: https://issues.apache.org/jira/browse/DERBY-5752
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions:
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: buffsize.diff
>
> The constructor LOBStreamControl(EmbedConnection, byte[]) always makes the buffer size
> equal to the LOB size, effectively creating an extra, fully materialized copy of the LOB in
> memory.
>
> I think the assumption here is that a LOB that's already materialized is a small one.
> That is, LOBs that are smaller than 32 KB and fit in a single page are typically materialized
> when read from store. However, we sometimes materialize LOBs that are a lot bigger than 32
> KB. For example, triggers that access LOBs may materialize them regardless of size (see comment
> in DMLWriteResultSet's constructor for details). For these large LOBs, it sounds unreasonable
> to allocate a buffer of the same size as the LOB itself.
>
> I'd suggest that we change the constructor so that it never allocates a buffer larger
> than 32KB. That would mean that the behaviour is preserved for all LOBs fetched directly from
> store (only LOBs that don't fit in a single page will cause temporary files to be created),
> whereas we'll prevent large LOBs accessed by triggers from being duplicated in memory by
> overflowing to temporary files.
