db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-5752) LOBStreamControl should materialize less aggressively
Date Tue, 12 Feb 2013 14:51:13 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576651#comment-13576651 ]

Knut Anders Hatlen commented on DERBY-5752:
-------------------------------------------

I had forgotten about this...

Now that I have rerun the tests, I am not able to reproduce the big difference I saw in BlobClob4BlobTest
in the first test run. I still see a difference, but it's more like 165 seconds vs 180
seconds for the full BlobClob4BlobTest. As before, it looks like the entire difference is
caused by testPositionAgressive() in an encrypted database, which slowed down from 7 seconds
to 23 seconds in my environment. There is no difference in that test case on unencrypted databases.

The test case in question inserts a number of CLOBs, some of which are greater than the 32k
limit for materialization, into a table. However, the query that reads the CLOBs is ordered
on one of the non-CLOB columns, and the sorting materializes all the columns in the result.
It eventually scans through the fetched CLOBs using Clob.position().

The performance difference is seen because the java.sql.Clob objects fetched from the result
set are no longer fully materialized in memory with the patch, unless they are smaller than
32k. For the big objects, this means that each call to Clob.position() will have to read temporary
files and decrypt the contents in order to search for the substring. Without the patch, the
entire value would live unencrypted in memory, which makes position() a much cheaper operation.
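
To illustrate the cost difference, here is a minimal sketch in plain Java. It is not Derby's
implementation: the class and method names are made up for this example, and a StringReader
stands in for the temporary file (with no decryption at all). The point is only that a streamed
search has to re-read the source on every call, whereas a materialized value can be searched
directly in memory. Positions are 1-based, as in Clob.position().

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical illustration only; none of these names exist in Derby. */
public final class PositionCostSketch {

    /** Materialized case: the whole value is already in memory. */
    static long positionInMemory(String value, String pattern) {
        int idx = value.indexOf(pattern);
        return idx < 0 ? -1 : idx + 1; // convert to a 1-based position
    }

    /**
     * Streamed case: every call reads the source from the start.
     * In the scenario above, each read would also hit a temporary
     * file and decrypt its contents.
     */
    static long positionStreamed(Reader in, String pattern) {
        int m = pattern.length();
        if (m == 0) {
            return 1;
        }
        char[] window = new char[m]; // circular buffer of the last m chars
        long read = 0;
        try {
            int c;
            while ((c = in.read()) != -1) {
                window[(int) (read % m)] = (char) c;
                read++;
                if (read >= m && windowMatches(window, read, pattern)) {
                    return read - m + 1; // 1-based start of the match
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return -1;
    }

    /** Compares the circular window, oldest char first, with the pattern. */
    private static boolean windowMatches(char[] window, long read, String pattern) {
        int m = pattern.length();
        for (int i = 0; i < m; i++) {
            if (window[(int) ((read + i) % m)] != pattern.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String value = "hello world";
        System.out.println(positionInMemory(value, "world"));
        System.out.println(positionStreamed(new StringReader(value), "world"));
    }
}
```

Both methods return the same position; the streamed variant just pays for the I/O each time
it is called.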

I think this is an expected difference, and that it is acceptable since the CLOB wasn't supposed
to be materialized in this scenario in the first place. Of course, the current limit for materialization
might not be optimal for all applications, as materialization indeed could improve performance
of some operations if the system has enough memory. Increasing the limit or making it tunable
might be a useful improvement, but it's outside the scope of this issue.
                
> LOBStreamControl should materialize less aggressively
> -----------------------------------------------------
>
>                 Key: DERBY-5752
>                 URL: https://issues.apache.org/jira/browse/DERBY-5752
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions: 10.9.1.0
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>         Attachments: buffsize.diff, d5752-1a.diff
>
>
> The constructor LOBStreamControl(EmbedConnection, byte[]) always makes the buffer size
> equal to the LOB size, effectively creating an extra, fully materialized copy of the LOB in
> memory.
> I think the assumption here is that a LOB that's already materialized is a small one.
> That is, LOBs that are smaller than 32 KB and fit in a single page are typically materialized
> when read from store. However, we sometimes materialize LOBs that are a lot bigger than 32
> KB. For example, triggers that access LOBs may materialize them regardless of size (see comment
> in DMLWriteResultSet's constructor for details). For these large LOBs, it sounds unreasonable
> to allocate a buffer of the same size as the LOB itself.
> I'd suggest that we change the constructor so that it never allocates a buffer larger
> than 32 KB. That would mean that the behaviour is preserved for all LOBs fetched directly from
> store (only LOBs that don't fit in a single page will cause temporary files to be created),
> whereas we'll prevent large LOBs accessed by triggers from being duplicated in memory by overflowing
> to temporary files.
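
The sizing rule suggested in the description above could be sketched roughly as follows. This
is a hedged illustration, not the attached patch: the class, constant, and method names are
hypothetical, and treating a LOB of exactly 32 KB as still fitting in the buffer is an
assumption made for the example.

```java
/** Hypothetical illustration of the proposed buffer cap; not Derby code. */
public final class LobBufferSizing {

    /** Assumed cap, taken from the 32 KB figure in the description. */
    static final int MAX_BUF_SIZE = 32 * 1024;

    /** Behaviour as described before the change: buffer as large as the LOB. */
    static int uncappedBufferSize(int lobLength) {
        return lobLength;
    }

    /** Suggested behaviour: never allocate more than the cap. */
    static int cappedBufferSize(int lobLength) {
        return Math.min(lobLength, MAX_BUF_SIZE);
    }

    /** With the cap, anything beyond it would overflow to a temporary file. */
    static boolean overflowsToTempFile(int lobLength) {
        return lobLength > MAX_BUF_SIZE;
    }

    public static void main(String[] args) {
        int small = 10 * 1024;       // fits in the buffer either way
        int large = 5 * 1024 * 1024; // e.g. a LOB materialized by a trigger
        System.out.println(cappedBufferSize(small) + " " + overflowsToTempFile(small));
        System.out.println(cappedBufferSize(large) + " " + overflowsToTempFile(large));
    }
}
```

Under these assumptions, small LOBs behave as before, while a large materialized LOB gets a
buffer of at most 32 KB instead of a second full copy in memory.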

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
