Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm
Delivered-To: mailing list derby-dev@db.apache.org
Message-ID: <456C726B.3000003@sbcglobal.net>
Date:
    Tue, 28 Nov 2006 09:31:23 -0800
From: Mike Matrigali
Reply-To: mikem_app@sbcglobal.net
User-Agent: Mozilla Thunderbird 1.0 (Windows/20041206)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: derby-dev@db.apache.org
Subject: Re: ArrayInputStream and performance
References: <456B917F.4090706@apache.org>
In-Reply-To:
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Virus-Checked: Checked by ClamAV on apache.org

Knut Anders Hatlen wrote:
> Daniel John Debrunner writes:
>
>>Knut Anders Hatlen wrote:
>>
>>>I can't answer your question, but I will add that I find much of the
>>>code that uses ArrayInputStream very confusing. ArrayInputStream is
>>>supposed to wrap a byte array to provide encapsulation and easy access
>>>to the data through the InputStream interface. However, many (most?)
>>>of the callers inline the accesses to the data (presumably for
>>>performance reasons), so we end up with lots of methods looking like
>>>this:
>>>
>>>    public X readSomething(ArrayInputStream ais, byte[] data, int offset...) {
>>>        // lots of direct manipulation of the byte array
>>>        // ...
>>>        // ...
>>>        // ...
>>>        // finally, make sure that the state of the stream is brought to a
>>>        // consistent state:
>>>        ais.setPosition(offset + numberOfManipulatedBytes);
>>>    }
>>
>>I could only find one method that looked something like the above:
>>
>>StoredFieldHeader.readFieldLengthAndSetStreamPosition
>>
>>Could you provide a list of the others so I can see what the issue is?
>
> You are quite right; most of the methods that inline the accesses
> don't call setPosition(). I think I must have confused the callers of
> ArrayInputStream methods with the internal implementation of
> ArrayInputStream. They use a pattern of first copying the instance
> variables into local variables, then do the work on the local
> variables, and finally write back to the instance variables.
> While I find that a little confusing because I would expect the
> run-time compiler to be able to perform that kind of optimization
> itself, that's a different issue from what is discussed in this thread.

I put in a lot of the store inline optimizations, regretting the
"ugliness" of it at the time, but it was all driven by profiler
results. When looking at it you should realize it was tuned to the
state of JVMs probably 5 years ago. Some of it may not be needed any
more, but at the time these kinds of changes improved the cached scan
performance of the store by close to 10 times on some hardware/JVMs.

>
> The mentioned method (readFieldLengthAndSetStreamPosition) seems to be
> the one that causes most calls to ArrayInputStream. All of the calls
> are to setPosition(). If the boundary checking in setPosition() has an
> unreasonably high resource consumption, it is always an option to
> check the usages of readFieldLengthAndSetStreamPosition, and if it can
> be proven safe to skip the boundary checking, that method could use an
> unchecked variant of setPosition().

It would be fine to use an unchecked variant and/or an ASSERT-based
check for readFieldLengthAndSetStreamPosition. The "store" module owns
this access and is not counting on limit checks to catch anything here.

>
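
To make the checked versus unchecked/ASSERT-based distinction concrete, here is a minimal sketch. It uses a simplified stand-in class, not Derby's actual ArrayInputStream: the names SketchArrayStream and setPositionUnchecked, the field layout, and the choice of exception are all assumptions made for illustration.

```java
// Hypothetical stand-in for a byte-array-backed stream.
// This is NOT Derby's ArrayInputStream; names and fields are illustrative.
final class SketchArrayStream {
    private final byte[] data;
    private final int start;   // first valid position (inclusive)
    private final int end;     // limit (exclusive)
    private int position;

    SketchArrayStream(byte[] data) {
        this.data = data;
        this.start = 0;
        this.end = data.length;
        this.position = 0;
    }

    // Checked variant: validates the new position on every call.
    void setPosition(int newPosition) {
        if (newPosition < start || newPosition >= end) {
            throw new IndexOutOfBoundsException(
                "position " + newPosition + " outside [" + start + ", " + end + ")");
        }
        position = newPosition;
    }

    // Unchecked variant for callers that own the access and can prove the
    // position is in range. The assert only runs when the JVM is started
    // with assertions enabled (-ea); otherwise no range check is made.
    void setPositionUnchecked(int newPosition) {
        assert newPosition >= start && newPosition < end
            : "position out of range: " + newPosition;
        position = newPosition;
    }

    int getPosition() {
        return position;
    }

    // Reads one unsigned byte, or returns -1 at the limit.
    int read() {
        return position < end ? (data[position++] & 0xFF) : -1;
    }
}
```

In this sketch, a hot path such as a field-header reader would call setPositionUnchecked(), while callers that cannot vouch for the offset keep using setPosition(). Whether the saved range check matters on a current JVM would need to be confirmed with a profiler.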