db-derby-dev mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: (DERBY-378) implementing import/export of large objects...
Date Thu, 02 Nov 2006 22:28:21 GMT
Mike Matrigali wrote:
> Suresh Thalamati wrote:
>> Daniel John Debrunner wrote:
>>> Suresh Thalamati wrote:
>>>> BLOB:
>> ....
>>>>   1) Allow import/export of blob data only when it is stored 
>>>> in an external file.
>>>>   2) Write the blob data to the export file along with other data 
>>>> types during export, assuming the blob data does not contain any 
>>>> delimiters, and throw an error on import if it finds delimiters 
>>>> inside the data, by interpreting it using the same code-set as the 
>>>> other character data.
>>> I say option 1) and I assume it's for all binary data, not just BLOB, 
>>> e.g. VARCHAR FOR BIT DATA etc. Seems like with binary data the chance 
>>> of having a problematic character in the text file is close to 100%.
>>> Dan.
>> Thanks for reading the proposal, Dan.  I agree with you, the chance of 
>> finding a delimiter character inside binary data is very high. I will 
>> go with option 1. Your assumption is correct, it applies to all 
>> the binary data.
> I also agree, it seems like a reasonable first step to default binary 
> export/import to external files.
> I am probably missing something here though.  I thought for char data 
> there is an algorithm that handles delimiter data within it.  Why does
> that not work for binary data also?  Is it a codeset issue?

Yes. For character data, delimiters that occur inside the data are 
doubled. For example, if a column contains
'he said "derby is a solid database"', it is written to the 
export file as "he said ""derby is a solid database""". So on 
export, the data is modified before it is written to the file.
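
As a rough illustration of the double-delimiter scheme for character 
data, here is a minimal sketch. The class and method names 
(DelimiterDoubling, escape, unescape) are made up for this example and 
are not the actual Derby import/export code.

```java
public class DelimiterDoubling {
    // Hypothetical helper: encloses the value in delimiters and doubles
    // every delimiter character found inside the value.
    public static String escape(String value, char delim) {
        StringBuilder sb = new StringBuilder();
        sb.append(delim);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == delim) {
                sb.append(delim); // extra delimiter added on export
            }
            sb.append(c);
        }
        sb.append(delim);
        return sb.toString();
    }

    // Hypothetical helper: reverses escape(). Assumes the field is
    // enclosed in delimiters and was produced by escape().
    public static String unescape(String field, char delim) {
        String inner = field.substring(1, field.length() - 1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            sb.append(c);
            // Skip the second delimiter of a doubled pair.
            if (c == delim && i + 1 < inner.length()
                    && inner.charAt(i + 1) == delim) {
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "he said \"derby is a solid database\"";
        String exported = escape(original, '"');
        System.out.println(exported);
        System.out.println(unescape(exported, '"').equals(original));
    }
}
```

The point is that the exported text differs from the stored value, and 
only a reader that knows the doubling rule gets the original back.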

It may be possible to do the same thing for binary data by interpreting 
it in the specified code-set, adding an extra delimiter for every 
delimiter found on export, and doing the reverse on import.  But unlike 
character data, if the binary data is changed and the user imports 
it into some other application, the data may mean/look completely 
different if the added delimiter characters are not removed.
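
To make the concern concrete, here is a small sketch of what doubling 
would do to binary data. It assumes a single-byte, ASCII-compatible 
code-set in which the delimiter '"' is the byte 0x22, and it works 
directly on bytes for simplicity. The names (BinaryDoubling, 
doubleDelimiters, undoubleDelimiters) are hypothetical and the sample 
bytes are made up.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class BinaryDoubling {
    // Assumption: '"' is the delimiter and is encoded as the byte 0x22.
    static final byte DELIM = 0x22;

    // Export side: add an extra delimiter byte before every delimiter byte.
    public static byte[] doubleDelimiters(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : data) {
            if (b == DELIM) {
                out.write(DELIM);
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    // Import side: collapse each doubled delimiter back to a single byte.
    // Assumes the input was produced by doubleDelimiters().
    public static byte[] undoubleDelimiters(byte[] data) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < data.length; i++) {
            out.write(data[i]);
            if (data[i] == DELIM && i + 1 < data.length
                    && data[i + 1] == DELIM) {
                i++; // skip the byte that was added on export
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up "image" bytes that happen to contain 0x22 three times.
        byte[] image = {(byte) 0x89, 0x22, 0x00, 0x22, 0x22, (byte) 0xFF};
        byte[] exported = doubleDelimiters(image);

        // An application that reads the exported bytes as-is sees
        // different data from what was stored.
        System.out.println(Arrays.equals(exported, image));
        // Only an importer that removes the added bytes recovers it.
        System.out.println(Arrays.equals(undoubleDelimiters(exported), image));
    }
}
```

The exported bytes are longer than the stored ones, so any consumer 
that does not know about the doubling rule reads a different object.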

Another thing to note here is that with character data the user knows 
there might be delimiter characters inside, and can specify a delimiter 
character that is not in the data if he/she does not want the data to be 
modified on export. With binary data that is not possible at all.

If you think modifying binary data on export is ok, then it might 
be possible to use the same concept as for character data.  My only 
concern here is that if we export an image of a "cat" and add delimiters 
to it by interpreting it as character data, it may look like a "tiger" 
if the extra characters are not removed :-)

