db-derby-dev mailing list archives

From "Kristian Waagan (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-4241) Improve transition from read-only to writable Clob representation
Date Tue, 22 Jun 2010 13:58:54 GMT

     [ https://issues.apache.org/jira/browse/DERBY-4241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kristian Waagan updated DERBY-4241:
-----------------------------------

    Attachment: better.txt
                derby-4241-32core-cmt.txt

Hi Knut,

I ran a series of tests, but that was a long time ago... I was also working on some statistical
analysis at that time, which hasn't made it into the Derby repos (I'm not sure it can).

You can see the results from one of the runs in 'derby-4241-32core-cmt.txt', and the file
'better.txt' is just a grep on a series of such results.

I saw up to ~65% improvement with the patch on the machines with the slowest CPUs. I believe
that the benefit will be greater the larger the CLOB is (the tests used 15 MB CLOBs, I think).
The conclusions are based on time measurements and confidence intervals (obtained using a
technique called bootstrapping) for both the mean and the standard deviation. Therefore, in
some cases the conclusion was "indecisive", even though looking at only the means (from a
series of runs) indicated an improvement.
Now, since this was so long ago, please don't ask too many detailed questions ;) Also, since
I'm no statistician, I cannot guarantee anything about the results I present...
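
For readers unfamiliar with bootstrapping, here is a minimal Java sketch of a percentile
bootstrap confidence interval for the mean of a series of timings. It is not the code used to
produce the attached results; the class name, method name, parameters and the choice of the
percentile variant are all assumptions made for illustration. The same resampling loop could
be applied to the standard deviation by computing that statistic instead of the mean.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative sketch only: percentile-bootstrap interval for the mean. */
final class BootstrapSketch {

    /**
     * Returns {low, high}, the bounds of a percentile-bootstrap confidence
     * interval for the mean of the given samples (hypothetical helper).
     */
    static double[] meanInterval(double[] samples, int resamples,
                                 double confidence, long seed) {
        if (samples.length == 0 || resamples <= 0) {
            throw new IllegalArgumentException("need samples and resamples");
        }
        Random rnd = new Random(seed);
        double[] means = new double[resamples];
        for (int r = 0; r < resamples; r++) {
            double sum = 0.0;
            // Draw samples.length values from the data, with replacement.
            for (int i = 0; i < samples.length; i++) {
                sum += samples[rnd.nextInt(samples.length)];
            }
            means[r] = sum / samples.length;
        }
        // The interval bounds are percentiles of the resampled means.
        Arrays.sort(means);
        double alpha = (1.0 - confidence) / 2.0;
        int lo = (int) Math.floor(alpha * (resamples - 1));
        int hi = (int) Math.ceil((1.0 - alpha) * (resamples - 1));
        return new double[] { means[lo], means[hi] };
    }
}
```

Calling `BootstrapSketch.meanInterval(timings, 2000, 0.95, 42L)` on the BASE and PATCHED
timings separately would give one low/high pair per series, which is the kind of input the
comparisons in the glossary work on.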

A small glossary:
meanP = mean point
meanP 2sd = the difference between the mean points is at least two times the standard deviation
meanP 3sd = the difference between the mean points is at least three times the standard deviation
meanHL 3sd = the high estimate of PATCHED lies at least three times the standard deviation
away from the low estimate of BASE

I cannot remember which value I used for the standard deviation, but I guess it was the point
value.
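
The glossary entries can be read as simple threshold checks. The Java sketch below is one
possible reading, not the original analysis code: the class and method names are
hypothetical, it assumes that lower run times are better (so an improvement is BASE minus
PATCHED), and it takes the standard deviation as a plain parameter because the message does
not say which value was used.

```java
/** Illustrative sketch only: threshold checks matching the glossary. */
final class ComparisonSketch {

    /** "meanP k sd": the mean points differ by at least k standard deviations. */
    static boolean meanP(double baseMean, double patchedMean, double sd, int k) {
        // Assumption: times are compared, so PATCHED is better when it is lower.
        return baseMean - patchedMean >= k * sd;
    }

    /**
     * "meanHL k sd": the high estimate of PATCHED lies at least k standard
     * deviations below the low estimate of BASE.
     */
    static boolean meanHL(double baseLow, double patchedHigh, double sd, int k) {
        return baseLow - patchedHigh >= k * sd;
    }
}
```

Under this reading meanHL is the stricter test, since it compares the least favourable ends
of the two intervals instead of the point estimates.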

> Improve transition from read-only to writable Clob representation
> -----------------------------------------------------------------
>
>                 Key: DERBY-4241
>                 URL: https://issues.apache.org/jira/browse/DERBY-4241
>             Project: Derby
>          Issue Type: Improvement
>          Components: JDBC
>    Affects Versions: 10.5.1.1, 10.6.1.0
>            Reporter: Kristian Waagan
>            Assignee: Kristian Waagan
>            Priority: Minor
>         Attachments: better.txt, derby-4241-1a-InternalClob.getLengthIfKnown.diff, derby-4241-2a-utf8AwareCopy.diff,
>                      derby-4241-32core-cmt.txt
>
>
> When a store stream Clob is going to be modified, it will be written out to the temporary
> area of Derby and represented as a TemporaryClob.
> The transfer of the data is done in a sub-optimal manner for two reasons:
>  o for transfer of the complete Clob, the copy method operates on the byte level and
>    we're not able to save the character length.
>  o for transfer of parts of the Clob (i.e. truncation), we have to first decode the UTF-8
>    encoding to find the byte count and then transfer the same bytes.
> I intend to make the following two changes:
>  1) Add a getCharLengthIfKnown method to InternalClob.
>  2) Add a UTF-8 aware copy method to LOBStreamControl.
> When a complete Clob is to be copied, code like this will be executed:
>   cachedCharLength = internalClob.getLengthIfKnown();
>   if (cachedCharLength > 0) {
>       // use existing byte-oriented copy method for best performance (copy until EOF)
>   } else {
>       cachedCharLength = control.copyUTF8Data();
>   }
> When part of a Clob is to be copied, we always use the UTF-8 aware copy method, but
> we also do a cheap range check:
>   cachedCharLength = internalClob.getLengthIfKnown();
>   if (cachedCharLength > 0 && requestedLength > cachedCharLength) {
>       throw new EOFException();
>   }
>   if (cachedCharLength == requestedLength) {
>       // use existing byte-oriented copy method for best performance (copy until EOF)
>   } else {
>       cachedCharLength = control.copyUTF8Data(requestedLength);
>   }
> Adding the UTF-8 aware copy method was started under DERBY-4023, including comments on
> the first revision of a patch.
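
The two cases in the description can be sketched as self-contained Java. This is not the
patch attached to the issue: ClobCopySketch and all of its methods are hypothetical stand-ins
for the InternalClob and LOBStreamControl code, and the sketch assumes the stored data uses
only one- to three-byte UTF-8 sequences, each counted as one character. It also requires a
positive known length before taking the byte-copy shortcut in the partial case, which is an
added guard and not part of the pseudocode.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Illustrative sketch only; names do not match the Derby sources. */
final class ClobCopySketch {

    /** Byte-oriented copy until EOF, with no decoding. */
    static void copyBytes(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    /** UTF-8 aware copy of at most maxChars characters; returns the count copied. */
    static long copyUtf8(InputStream in, OutputStream out, long maxChars)
            throws IOException {
        long chars = 0;
        while (chars < maxChars) {
            int lead = in.read();
            if (lead == -1) {
                break; // clean EOF between characters
            }
            int extra; // continuation bytes implied by the lead byte
            if ((lead & 0x80) == 0) {
                extra = 0;
            } else if ((lead & 0xE0) == 0xC0) {
                extra = 1;
            } else if ((lead & 0xF0) == 0xE0) {
                extra = 2;
            } else {
                throw new IOException("unsupported lead byte: " + lead);
            }
            out.write(lead);
            for (int i = 0; i < extra; i++) {
                int b = in.read();
                if (b == -1) {
                    throw new EOFException("stream ended inside a character");
                }
                out.write(b);
            }
            chars++;
        }
        return chars;
    }

    /** Complete copy; knownLength is non-positive when the length is unknown. */
    static long copyAll(InputStream in, OutputStream out, long knownLength)
            throws IOException {
        if (knownLength > 0) {
            copyBytes(in, out); // fast path, the length is already cached
            return knownLength;
        }
        return copyUtf8(in, out, Long.MAX_VALUE);
    }

    /** Partial copy (truncation) with the cheap range check. */
    static long copyPart(InputStream in, OutputStream out,
                         long knownLength, long requested) throws IOException {
        if (knownLength > 0 && requested > knownLength) {
            throw new EOFException("requested length exceeds Clob length");
        }
        if (knownLength > 0 && knownLength == requested) {
            copyBytes(in, out); // whole Clob requested, skip decoding
            return requested;
        }
        long copied = copyUtf8(in, out, requested);
        if (copied < requested) {
            throw new EOFException("Clob shorter than requested length");
        }
        return copied;
    }
}
```

Reading one byte at a time keeps the sketch short; a real implementation would decode from a
buffer. Either way the character length falls out of the copy itself, so it can be cached
without a separate decoding pass.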

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

