db-derby-dev mailing list archives

From Rick Hillegas <rick.hille...@oracle.com>
Subject bulk insert
Date Fri, 27 Aug 2010 16:05:51 GMT
Bulk insert is a special mode which causes Derby to repopulate a table 
without logging the rows. Under the hood, bulk insert mode causes the 
interpreter to call TransactionController.recreateAndLoadConglomerate(). 
The import procedures use this special mode to get better performance. 
With logging disabled, there is always the risk that something could go 
wrong and leave the target table corrupted. If that happens, the user 
has to drop and recreate the table. I don't know if 
there are any wormholes in this area. I'm not aware of any bugs which 
have been logged because users could not return to an uncorrupted state 
after a failed import.

We get frequent requests to allow users to bulk import data without 
having to use the import procedures. Users want the performance boost of 
unlogged inserts without the performance drag of having to first dump 
the data to a file on the local disk. For instance, users want to be 
able to bulk import from a table function which siphons data out of an 
external data source.

At first blush, this seems like an easy feature to implement. We just 
have to remove a small bit of logic in the parser which prevents 
ordinary insert statements from including one of the bulk insert 
directives. Those directives are just Derby properties, written as SQL 
comments:

--DERBY-PROPERTIES insertMode=bulkInsert
--DERBY-PROPERTIES insertMode=replace

I would like to make this change. That is, I would like to

o Allow users to specify the bulkInsert and replace properties on any 
insert statement.
o Document these properties with a warning that the operations are not 
logged and the user is responsible for recovering from errors.
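
To make the proposal concrete, here is a rough sketch of what such a 
statement might look like once the parser restriction is lifted. The table 
name TARGET and the table function EXTERNAL_ROWS are hypothetical, and the 
placement of the property right after the target table name is an 
assumption about the syntax, not settled behavior. The one certain point is 
that the property is a SQL line comment, so it has to end with a newline or 
it will swallow the rest of the statement.

```java
public class BulkInsertSketch {

    /**
     * Builds an INSERT ... SELECT statement that carries one of the
     * insertMode properties. The table and function names passed in
     * are placeholders chosen by the caller.
     */
    public static String buildInsert(String targetTable,
                                     String tableFunction,
                                     boolean replace) {
        String mode = replace ? "replace" : "bulkInsert";
        // Assumption: the property follows the target table name.
        // The trailing newline ends the SQL line comment.
        return "INSERT INTO " + targetTable
            + " --DERBY-PROPERTIES insertMode=" + mode + "\n"
            + "SELECT * FROM TABLE (" + tableFunction + "()) AS src";
    }

    public static void main(String[] args) {
        // Print the statement text; running it would need a Derby
        // connection and a build that accepts the property here.
        System.out.println(buildInsert("TARGET", "EXTERNAL_ROWS", false));
    }
}
```

The resulting string would be handed to an ordinary JDBC Statement. Today 
the parser rejects it for anything other than the import procedures.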

This change will increase the risk that users will have to manually 
recover from failed, corrupting inserts.

I am, however, concerned that bulk insert has been deliberately disabled 
except for its narrow use by the import procedures. Does anyone know why 
this was done? Are there other risks which I should be aware of? Does 
anyone object to broadening this use of Derby's bulk insert feature?

