Message-ID: <44B70681.4050006@apache.org>
Date: Thu, 13 Jul 2006 19:50:41 -0700
From: Daniel John Debrunner
To: derby-dev@db.apache.org
Subject: Re: Problems in SQLBinary when passing in streams with unknown length (SQL, store)
References: <44B6DC75.1020705@Sun.com>
In-Reply-To: <44B6DC75.1020705@Sun.com>

Kristian Waagan wrote:
> Hello,
>
> I just discovered that we are having problems with the length less
> overloads in the embedded driver. Before I add any Jiras, I would like
> some feedback from the community. There are for sure problems in
> SQLBinary.readFromStream(). I would also appreciate if someone with
> knowledge of the storage layer can tell me if we are facing trouble
> there as well.
>
> SQL layer
> =========
> SQLBinary.readFromStream()
>   1) The method does not support streaming.
>      It will either grow the buffer array to twice its size, or possibly
>      more if the available() method of the input stream returns a
>      non-zero value, until all data is read. This approach causes an
>      OutOfMemoryError if the stream data cannot fit into memory.

I think this is because the maximum size for this data type is 255
bytes, so memory usage was not a concern. SQLBinary corresponds to CHAR
FOR BIT DATA, the sub-classes correspond to the larger data types.

One question that has been nagging me is that the standard response to
why the existing JDBC methods had to declare the length was that the
length was required up-front by most (some?) database engines.
Did this requirement suddenly disappear? I assume it was discussed in
the JDBC 4.0 expert group.

I haven't looked at your implementation for this, but the root cause may
be that Derby does need to verify that the supplied value does not
exceed the declared length for the data type. Prior to any change for
lengthless overloads, the incoming length was checked before the data
was inserted into the store. I wonder if with your change it is still
checking the length prior to storing it, but reading the entire value
into a byte array in order to determine its length.

>   2) Might enter endless loop.
>      If the available() method of the input stream returns 0, and the
>      data in the stream is larger than the initial buffer array, an
>      endless loop will be entered. The problem is that the length
>      argument of read(byte[],int,int) is set to 0. We don't read any
>      more data and the stream is never exhausted.

That seems like a bug; available() is basically a useless method.

>
> To me, relying on available() to determine if the stream is exhausted
> seems wrong. Also, subclasses of InputStream will return 0 if they don't
> override the method.
> I wrote a simple workaround for 2), but then the OutOfMemoryError
> comes into play for large data.
>
>
> Store layer
> ===========
> I haven't had time to study the store layer, and know very little about
> it. I hope somebody can give me some quick answers here.
>   3) Is it possible to stream directly to the store layer if you don't
>      know the length of the data?
>      Can meta information (page headers, record headers etc.) be updated
>      "as we go", or must the size be specified when the insert is
>      started?

Yes, the store can handle this.

Dan.
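
For illustration only, here is a rough sketch of draining a stream of unknown length without consulting available(), while still enforcing a declared maximum length as the bytes arrive. This is not Derby's SQLBinary code: the class name StreamDrain, the method readFully, and the maxLength parameter are all hypothetical, and the value is still buffered in memory, so it only suits types with a small declared length.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of Derby.
public class StreamDrain {

    /**
     * Reads until read() reports end-of-stream (-1). available() is never
     * called, so a stream whose available() returns 0 cannot stall the loop.
     * Throws as soon as more than maxLength bytes have been seen.
     */
    public static byte[] readFully(InputStream in, int maxLength) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8]; // deliberately small so the loop runs several times
        int n;
        // With a non-zero length argument, read() blocks until it has at
        // least one byte or returns -1 at end-of-stream.
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            if (out.size() + n > maxLength) {
                throw new IOException("value exceeds declared length " + maxLength);
            }
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A stream that always claims nothing is available, mimicking an
        // InputStream subclass that does not override available().
        InputStream in = new ByteArrayInputStream(new byte[100]) {
            @Override
            public int available() {
                return 0;
            }
        };
        System.out.println(readFully(in, 255).length);
    }
}
```

Checking the running total inside the loop means an oversized value is rejected after at most maxLength plus one buffer of bytes has been read, rather than after the whole stream has been pulled into memory.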