db-derby-dev mailing list archives

From Kristian Waagan <Kristian.Waa...@Sun.COM>
Subject Re: Truncation of trailing blanks and lengthless streaming overloads
Date Thu, 22 Jun 2006 19:49:43 GMT
Daniel John Debrunner wrote:
> Kristian Waagan wrote:
>> Hello,
>> I'm working on DERBY-1417; adding new lengthless overloads to the
>> streaming API.  So far, I have only been looking at implementing this in
>> the embedded driver. Based on some comments in the code, I have a few
>> questions and observations regarding truncation of trailing blanks in
>> the various character data types.
>> Type            Trailing blank truncation    Where
>> ====================================================================
>> CHAR                allowed                  SQLChar.normalize
>> VARCHAR             allowed                  SQLVarchar.normalize
>> LONG VARCHAR        disallowed               SQLLongVarchar.normalize
>> CLOB                allowed                  streaming or
>>                                              SQLVarchar.normalize,
>>                                              depending on source
>> As can be seen, only data for CLOB is checked for trailing-blank
>> truncation in the streaming class. We must still read all of the data,
>> or at least as much as is needed to know that the insertion will fail,
>> but we don't have to hold it all in memory.
>> Truncation of trailing blanks is not allowed at all for LONG VARCHAR
>> (according to code comments and bug 5592 - I haven't found where this
>> is stated in the specs).
>> My question is: should we do the truncation check for CHAR and VARCHAR
>> at the streaming level as well?
>> If we don't add this feature, inserting a
>> 10 GB file into a VARCHAR column by mistake will cause all 10 GB to be
>> loaded into memory even though the maximum size allowed is ~32K,
>> possibly causing out-of-memory errors. The error could be raised at an
>> earlier stage (possibly after reading ~32K + 1 bytes).
> I would say it's a separate issue from the one you are addressing.
> Applications most likely won't be inserting 10 GB values into
> CHAR/VARCHAR columns using streams, as it's not going to work.
> Maybe enter a bug, but it doesn't seem like it has to be fixed as part
> of this issue.

Handling this as a separate issue is better. The code changes would most 
likely go into the same class as my changes for DERBY-1417. It seems 
like a simple enough change, so I will enter a Jira improvement issue. I 
don't like getting an out-of-memory (heap) error.
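
To sketch what that improvement could look like, here is a minimal,
hypothetical example of failing a stream-based CHAR/VARCHAR insert as
soon as the data cannot possibly fit, instead of buffering everything
in memory first. The class name and the use of a plain IOException are
assumptions for illustration only; this is not the actual ReaderToUTF8
code.

    import java.io.IOException;
    import java.io.Reader;

    /**
     * Hypothetical sketch: consumes character data destined for a
     * CHAR/VARCHAR column and fails as soon as the data cannot fit.
     * Excess trailing blanks are tolerated, since truncating them is
     * allowed for these types.
     */
    public final class BoundedCharStreamChecker {

        /**
         * Reads the whole stream and returns the number of characters
         * seen.
         *
         * @param in       source of the character data
         * @param maxWidth declared maximum width of the target column
         * @throws IOException if a non-blank character is found past
         *                     maxWidth
         */
        public static long check(Reader in, int maxWidth)
                throws IOException {
            long count = 0;
            int c;
            while ((c = in.read()) != -1) {
                count++;
                // Characters beyond the declared width are only
                // acceptable if they are blanks that would be
                // truncated anyway.
                if (count > maxWidth && c != ' ') {
                    throw new IOException(
                        "Data exceeds maximum column width of "
                        + maxWidth);
                }
            }
            return count;
        }
    }

Failing on the first non-blank character past the declared width keeps
the early abort compatible with the trailing-blank truncation that is
allowed for CHAR and VARCHAR.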

>> As far as I can tell, adding this feature is a matter of modifying the
>> 'ReaderToUTF8' class and the
>> 'EmbedPreparedStatement.setCharacterStreamInternal' method.
>> One could also optimize the reading of data into LONG VARCHAR by
>> aborting the read as soon as possible instead of pulling it all into
>> memory first. This would require some special-case handling in the
>> mentioned locations.
>> Another matter is that streams will not be checked for exact length
>> match when using the lengthless overloads, as we don't have a specified
>> length to compare against.
>> I have a preliminary implementation for setAsciiStream,
>> setCharacterStream  and setClob (all without length specifications) in
>> EmbedPreparedStatement.
> What's the issue with setClob()? The current method doesn't take a
> length. Has java.sql.Clob been changed to not always have a length?

There are/will be 3 setClob methods in PreparedStatement:
  1) setClob(int,Clob)
  2) setClob(int,Reader,long)
  3) setClob(int,Reader)

1) and 2) are already implemented, 3) is a new method added to JDBC. The 
same story goes for the other setXXXStream and updateXXXStream methods. 
These are located in PreparedStatement, CallableStatement and ResultSet.
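
For illustration, here is a minimal sketch of how the three setClob
variants could be used against an embedded Derby connection. The 'docs'
table, the column names and the connection URL are made up for the
example.

    import java.io.Reader;
    import java.io.StringReader;
    import java.sql.Clob;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class SetClobVariants {
        public static void main(String[] args) throws Exception {
            // Hypothetical schema: docs(id INT, body CLOB)
            Connection conn = DriverManager.getConnection(
                    "jdbc:derby:testdb;create=true");
            conn.createStatement().execute(
                    "CREATE TABLE docs(id INT, body CLOB)");

            PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO docs VALUES (?, ?)");

            String text = "some character data   "; // trailing blanks

            // 1) setClob(int, Clob): the length is known from the Clob.
            Clob clob = conn.createClob();
            clob.setString(1, text);
            ps.setInt(1, 1);
            ps.setClob(2, clob);
            ps.executeUpdate();

            // 2) setClob(int, Reader, long): the caller supplies the
            //    length, so the driver can verify it against the stream.
            ps.setInt(1, 2);
            ps.setClob(2, new StringReader(text), text.length());
            ps.executeUpdate();

            // 3) setClob(int, Reader): the new lengthless overload; no
            //    exact-length check is possible, the stream is read to
            //    end-of-stream.
            Reader reader = new StringReader(text);
            ps.setInt(1, 3);
            ps.setClob(2, reader);
            ps.executeUpdate();

            conn.close();
        }
    }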


> Dan.
