db-derby-user mailing list archives

From Kristian Waagan <Kristian.Waa...@Sun.COM>
Subject Re: Error reading CLOB
Date Tue, 11 Jul 2006 15:34:20 GMT
Regunath Balasubramanian wrote:
> Hi,
> I chose to use Derby as an embedded DB to store text parsed/stripped 
> from web pages, MS Office files and PDF documents while implementing an 
> indexing and search solution. I need the parsed text of the document to 
> enable search term highlighting to produce an effective summary of 
> search hits.
> The natural choice was to use the CLOB data type. I store the contents 
> using PreparedStatement.setCharacterStream(column, reader) where reader 
> is a java.io.StringReader constructed from the java.lang.String instance 
> representing the entire parsed contents. I then read the contents out 
> using ResultSet.getClob(column).getCharacterStream().
> Writing always works, but reading fails for a few documents. 
> What surprises me is that I read and write using the Derby 
> classes and therefore naturally expect them to work. The error is in 
> the fillBuffer() method of the UTF8Reader class. It throws a 
> UTFDataFormatException.

Hello Regu,

Could you please tell us in which version(s) of Derby you are seeing 
this problem?

Also, if you have a repro application that can be used to demonstrate 
the problem, it would be great :)
It would be very handy to have the data that causes the 
UTFDataFormatException to be thrown.
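
A repro could be as small as the sketch below. It follows the write and read path described in the quoted message (setCharacterStream on write, getClob().getCharacterStream() on read). The class name ClobRoundTrip, the table name docs and its columns are made up for illustration, and the three-argument form of setCharacterStream is an assumption about how the stream is bound. The text passed in should be one of the documents that actually fails.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.sql.Clob;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical repro skeleton; names are illustrative, not from the original post.
final class ClobRoundTrip {

    // Drain a Reader into a String, 4096 chars at a time.
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    // Write the text into a CLOB column and read it back through the Clob's reader.
    // Any IOException raised while reading the stream propagates unchanged.
    static String roundTrip(Connection conn, String text)
            throws SQLException, IOException {
        try (Statement s = conn.createStatement()) {
            s.executeUpdate("CREATE TABLE docs (id INT PRIMARY KEY, content CLOB)");
        }
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO docs (id, content) VALUES (?, ?)")) {
            ps.setInt(1, 1);
            ps.setCharacterStream(2, new StringReader(text), text.length());
            ps.executeUpdate();
        }
        try (Statement s = conn.createStatement();
             ResultSet rs = s.executeQuery("SELECT content FROM docs WHERE id = 1")) {
            if (!rs.next()) {
                throw new SQLException("inserted row not found");
            }
            Clob clob = rs.getClob(1);
            // Read while the result set is still open.
            try (Reader r = clob.getCharacterStream()) {
                return readAll(r);
            }
        }
    }
}
```

With the embedded driver on the classpath, a connection for this can be obtained with DriverManager.getConnection("jdbc:derby:reproDB;create=true"), where reproDB is just a placeholder database name. Comparing the returned string with the input also shows whether the data comes back intact when no exception is thrown.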


> I made a few frustrating attempts at getting it to work. I tried 
> constructing the parsed string using different encodings (UTF-8, 
> ISO-8859-1) at the time of write. I tried to read it as a binary stream, 
> which failed with a nice exception stating that I was trying to read a 
> CLOB as binary. I tried to read it as an ASCII stream, which failed with 
> the same data format exception.
> Finally I decided to write the contents as a BLOB instead. The bytes for 
> writing were constructed using String.getBytes(). I read the contents 
> using Blob.getBytes() and then construct the String using 
> new String(byte[]). This works!
> I wonder why the UTF8 reader of Derby failed. I have the above-mentioned 
> workaround but would like to know if there is an alternative.
> Cheers!
> Regu
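
One note on the BLOB workaround: String.getBytes() and new String(byte[]) without a charset argument use the default charset of the JVM, which can depend on the platform and Java version, so text written on one machine may not decode the same way on another. A minimal sketch that pins the encoding to UTF-8 could look like the following. The class TextBytes and its method names are hypothetical, and choosing UTF-8 is an assumption, not something stated in the original message.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Blob;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; the class and method names are not from the original post.
final class TextBytes {

    // Encode with a fixed charset instead of the JVM default.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same fixed charset used for encoding.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Read a BLOB column back as text. Blob positions are 1-based,
    // and this assumes the value fits in a single byte array.
    static String readText(ResultSet rs, int column) throws SQLException {
        Blob blob = rs.getBlob(column);
        return decode(blob.getBytes(1, (int) blob.length()));
    }
}
```

On the write side the encoded bytes can be bound with PreparedStatement.setBytes(column, TextBytes.encode(text)).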
