db-derby-dev mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: (DERBY-378) implementing import/export of large objects...
Date Fri, 03 Nov 2006 22:27:52 GMT
Mike Matrigali wrote:
> 
> 
> Suresh Thalamati wrote:
> 
>> Mike Matrigali wrote:
>>
>>>
>>>
>>> Suresh Thalamati wrote:
>>>
>>>> Daniel John Debrunner wrote:
>>>>
>>>>> Suresh Thalamati wrote:
>>>>>
>>>>>> BLOB:
>>>>
>>>>
>>>>
>>>>
>>>> ....
>>>>
>>>>>>   1) Allow import/export of blob data only when it is stored
>>>>>> in an external file.
>>>>>>
>>>>>>   2) Write the blob data to the export file along with other data
>>>>>> types during export, assuming the blob data does not contain any
>>>>>> delimiters, and throw an error on import if it finds delimiters
>>>>>> inside the data, by interpreting it using the same code-set as the
>>>>>> other character data.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I say option 1) and I assume it's for all binary data, not just 
>>>>> BLOB, e.g. VARCHAR FOR BIT DATA etc. Seems like with binary data 
>>>>> the chance of having a problematic character in the text file is 
>>>>> close to 100%.
>>>>>
>>>>>
>>>>> Dan.
>>>>>
>>>>>
>>>>
>>>> Thanks for reading the proposal, Dan.  I agree with you: the chance of 
>>>> finding a delimiter character inside binary data is very high. I 
>>>> will go with option 1. Your assumption is correct; it applies 
>>>> to all binary data.
>>>
>>>
>>>
>>>
>>> I also agree, it seems like a reasonable first step to default binary 
>>> export/import to external files.
>>>
>>> I am probably missing something here though.  I thought for char data 
>>> there is an algorithm that handles delimiter data within it.  Why does
>>> that not work for binary data also?  Is it a codeset issue?
>>>
>>
>> Yes. For character data, double delimiters are used if there are 
>> delimiter characters inside the data, i.e. if a column contains
>> 'he said "derby is a solid database"', then it is written to the 
>> export file as "he said ""derby is a solid database""". So on 
>> export, the data is modified before it is written to the file.
>>
>> It may be possible to do the same thing by interpreting the binary 
>> data in the specified code-set, adding an extra delimiter for every 
>> delimiter found on export and doing the reverse on import.  But unlike 
>> character data, if the binary data is changed and the user imports 
>> it into some other application, the data may mean/look completely 
>> different if the added extra delimiter characters are not removed.
> 
> Again I think the separate file/no delimiter solution is a good first
> approach, I just wanted to understand the issue.  As you point out there
> are multiple usage scenarios here:
> 1) someone has a derby db and wants to export for use into another derby 
> db.
> 2) someone has a derby db and wants to export for use in another 
> application.
> 3) someone has some data from another app and wants to import into derby.
> 
> I think the separate file solution works for #1.  I don't know how well
> it works for option #2 and #3.  But at least for #2 it results in the
> raw data without need to process it.
> 


The separate file/no delimiter solution will also work for 2 & 3, if 
the other application exports/imports data in the same fashion. But I 
guess in reality other applications may not follow the same formats.

Do you think it will be easier to do 2 & 3 if Derby supports 
import/export of binary data using a single file as well, similar 
to the character data?

If you think it will be useful, Derby can support the following:

1) Instead of throwing an error if the user attempts to import/export 
binary data using the existing procedures, they can be modified to 
handle import/export of binary data in the same file as other data.
Binary data is exported using double delimiters, similar to 
character data, and on import the extra delimiter characters are removed.

2) Import/export of binary data to a separate file.

If the user does not want the binary data to be modified, then they 
can use option 2.
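
To make the double-delimiter idea in option 1 concrete, here is a rough sketch in Java. It is not Derby's import/export code; the class and method names are made up for illustration, and it assumes the binary data is interpreted as ISO-8859-1 (one byte per character) so that the bytes can be recovered unchanged on import.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of double-delimiter export/import; not Derby's actual code.
public class DoubleDelimiterSketch {

    // Export side: wrap the data in delimiters and double every
    // delimiter character found inside the data.
    public static String quote(String data, char delim) {
        StringBuilder sb = new StringBuilder();
        sb.append(delim);
        for (int i = 0; i < data.length(); i++) {
            char c = data.charAt(i);
            if (c == delim) {
                sb.append(delim); // the extra delimiter
            }
            sb.append(c);
        }
        sb.append(delim);
        return sb.toString();
    }

    // Import side: strip the enclosing delimiters and collapse each
    // doubled delimiter back to one. Assumes the field was produced by quote().
    public static String unquote(String field, char delim) {
        StringBuilder sb = new StringBuilder();
        int end = field.length() - 1; // index of the closing delimiter
        for (int i = 1; i < end; i++) {
            char c = field.charAt(i);
            sb.append(c);
            if (c == delim && i + 1 < end && field.charAt(i + 1) == delim) {
                i++; // skip the extra delimiter added on export
            }
        }
        return sb.toString();
    }

    // Binary data: interpret the bytes in a single-byte code-set
    // (ISO-8859-1 is an assumption here), then treat them like character data.
    public static String exportBinary(byte[] blob, char delim) {
        return quote(new String(blob, StandardCharsets.ISO_8859_1), delim);
    }

    public static byte[] importBinary(String field, char delim) {
        return unquote(field, delim).getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println(quote("he said \"derby is a solid database\"", '"'));

        // Two of these five bytes are 0x22, the '"' delimiter.
        byte[] blob = {0x01, 0x22, 0x2C, (byte) 0xFF, 0x22};
        String exported = exportBinary(blob, '"');
        byte[] imported = importBinary(exported, '"');
        System.out.println("round trip ok: " + Arrays.equals(blob, imported));
    }
}
```

Note that the exported field is longer than the original data (one extra character per embedded delimiter, plus the enclosing pair), which is the modification that another application would have to undo before it could use the binary data.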


Thanks
-suresh
