Date: Tue, 10 May 2011 13:15:48 +0000 (UTC)
From: "Knut Anders Hatlen (JIRA)"
To: derby-dev@db.apache.org
Message-ID: <577862902.116.1305033348319.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1280117901.14169.1298575598513.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (DERBY-5068) Investigate increased CPU usage on client after introduction of UTF-8 CcsidManager

     [ https://issues.apache.org/jira/browse/DERBY-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-5068:
--------------------------------------

    Attachment: d5068-2a.stat
                d5068-2a.diff

Attaching an alternative patch (d5068-2a.diff) that must be applied on top of the patch attached to DERBY-5210.

The patch makes the following changes:

1) Adds two new methods to CcsidManager: startEncoding() and encode(). These are roughly equivalent to the reset() and encode() methods in java.nio.charset.CharsetEncoder (and Utf8CcsidManager indeed implements them as wrappers around the CharsetEncoder methods). The methods allow encoding a string directly into a ByteBuffer without going via an intermediate throw-away array.

2) Removes these methods from CcsidManager:

  - convertFromJavaString(String, byte[], int, Agent)
  - convertToJavaString(byte[])
  - maxBytesPerChar()
  - getByteLength(String)

3) Changes Request, NetPackageRequest and NetConnection to use the new methods instead of the removed ones.

In addition to performing the string encoding without creating an intermediate byte array, the patch eliminates the use of getByteLength() completely (that method also created an intermediate byte array). The original code needed to know the exact byte length of the string up front so that it could make sure the destination buffer was large enough. The new interface for encoding the strings lets the caller know if it runs out of buffer space, so that the caller can allocate a larger buffer and continue the operation. This way, we don't need to encode each string twice.

The one place where we still need to know the byte length up front, is in NetPackageRequest.buildCommonPKGNAMinfo(). That's because the format of the message depends on whether or not the string length exceeds a certain threshold. The method now creates a byte array representation of the string once, and uses that array both to find the byte length and to copy the encoded version of the string into the buffer.
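The encode-and-grow pattern described above can be sketched with the standard java.nio.charset.CharsetEncoder API. This is only an illustration of the technique, not the code in d5068-2a.diff: the class name Utf8EncodeSketch and its methods writeString(), grow() and toByteArray() are hypothetical, and the tiny initial buffer is chosen just to force the overflow path.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; the names here are not taken from the Derby patch.
final class Utf8EncodeSketch {
    // Replace malformed input instead of reporting it, as String.getBytes() does.
    private final CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Deliberately small so that the example exercises the grow path.
    private ByteBuffer buffer = ByteBuffer.allocate(8);

    /** Appends the UTF-8 form of s to the buffer without a temporary byte[]. */
    void writeString(String s) {
        CharBuffer in = CharBuffer.wrap(s);
        encoder.reset(); // plays the role the mail assigns to startEncoding()

        CoderResult result;
        // OVERFLOW means the destination is full: enlarge it and continue
        // from where the encoder stopped, instead of measuring the string first.
        while ((result = encoder.encode(in, buffer, true)).isOverflow()) {
            grow();
        }
        if (result.isError()) {
            throw new IllegalStateException("unexpected coder result: " + result);
        }
        while (encoder.flush(buffer).isOverflow()) {
            grow();
        }
    }

    /** Doubles the capacity, keeping the bytes written so far. */
    private void grow() {
        ByteBuffer bigger = ByteBuffer.allocate(buffer.capacity() * 2);
        buffer.flip();
        bigger.put(buffer);
        buffer = bigger;
    }

    /** Returns a copy of the bytes written so far. */
    byte[] toByteArray() {
        ByteBuffer copy = buffer.duplicate();
        copy.flip();
        byte[] out = new byte[copy.remaining()];
        copy.get(out);
        return out;
    }

    public static void main(String[] args) {
        Utf8EncodeSketch w = new Utf8EncodeSketch();
        w.writeString("SELECT * FROM T WHERE ID = ?");
        System.out.println(w.toByteArray().length + " bytes written");
    }
}
```

Because the encoder keeps its position in the CharBuffer across calls, each string is encoded exactly once even when the buffer has to be enlarged several times along the way.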

I've rerun the sr_select load client, with 10 threads, to see how this new patch performs. I used JDK 6u24 on Solaris 10, and collected the CPU usage in the client driver by using the /bin/time command. I ran each configuration twice, 10 minutes each. Here's the CPU time per transaction seen with various versions/patches:

  10.6.2.1 (plain):                       62.4 µs/tx
  10.8.1.2 (plain):                       67.0 µs/tx
  trunk + d5068-1a.diff:                  63.4 µs/tx
  trunk + d5210-1a.diff:                  67.9 µs/tx
  trunk + d5210-1a.diff + d5068-2a.diff:  65.2 µs/tx

So, in short: None of the patches bring the CPU usage all the way down to the 10.6.2.1 level. The 1a patch attached to this issue (the one that does the UTF-8 encoding manually) is close, though.

The 2a patch doesn't perform quite as well as the 1a patch, but still better than 10.8.1.2. The advantage is that it hides the details on how the encoding is done. Also, by using the standard class library interface, we may benefit from improvements that are made to the class library implementation in the future.

I guess I'm leaning towards the approach in the 2a patch. The performance difference isn't that big anyway (I've only been able to see impact on CPU usage, never on the transaction rate), so it doesn't seem worthwhile to duplicate functionality provided by the standard libraries.
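
To put the per-transaction figures above on a common scale, each one can be expressed as a percentage over the 10.6.2.1 baseline. The small program below is only a sketch of that arithmetic; the class name CpuOverheadSketch and the helper percentOver() are hypothetical, and the input values are the ones quoted in this message.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for comparing the quoted measurements; not part of Derby.
final class CpuOverheadSketch {
    /** Relative increase of measured over baseline, in percent. */
    static double percentOver(double baseline, double measured) {
        return (measured - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 62.4; // 10.6.2.1 (plain), in microseconds per transaction

        // CPU time per transaction as reported in the message above.
        Map<String, Double> runs = new LinkedHashMap<>();
        runs.put("10.8.1.2 (plain)", 67.0);
        runs.put("trunk + d5068-1a.diff", 63.4);
        runs.put("trunk + d5210-1a.diff", 67.9);
        runs.put("trunk + d5210-1a.diff + d5068-2a.diff", 65.2);

        for (Map.Entry<String, Double> run : runs.entrySet()) {
            System.out.printf(Locale.ROOT, "%-40s %5.1f \u00b5s/tx  %+6.2f%%%n",
                    run.getKey(), run.getValue(),
                    percentOver(baseline, run.getValue()));
        }
    }
}
```

By this measure 10.8.1.2 is about 7.4% above the baseline, the 1a patch about 1.6%, and the 2a patch (on top of d5210-1a.diff) about 4.5%.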

> Investigate increased CPU usage on client after introduction of UTF-8 CcsidManager
> ----------------------------------------------------------------------------------
>
>                 Key: DERBY-5068
>                 URL: https://issues.apache.org/jira/browse/DERBY-5068
>             Project: Derby
>          Issue Type: Task
>    Affects Versions: 10.7.1.1
>            Reporter: Knut Anders Hatlen
>         Attachments: d5068-1a.diff, d5068-2a.diff, d5068-2a.stat
>
>
> While looking at the performance graphs for the single-record select test during the last year - http://home.online.no/~olmsan/derby/perf/select_1y.html - I noticed that there was a significant increase (10-20%) in CPU usage per transaction on the client early in October 2010. To be precise, the increase seems to have happened between revision 1004381 and revision 1004794. In that period, there were three commits: two related to DERBY-4757, and one related to DERBY-4825 (tests only).
> We should try to find out what's causing the increased CPU usage and see if there's some way to reduce it.