db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: bulk insert
Date Fri, 27 Aug 2010 21:16:23 GMT
Rick Hillegas wrote:
> Hi Mike,
> Thanks for the quick response. Some comments inline...
> Mike Matrigali wrote:
>> I would vote -1 on this, if the proposal is to allow unlogged inserts
>> into non-empty tables.  I do not want to add syntax that basically
>> allows users to silently corrupt their db.
> I poorly described what I meant. I am  not suggesting that we invent a 
> new mechanism for bulk insert. I'm merely suggesting that we re-use the 
> existing bulk insert mechanism used by the import procedures. When I 
> said that logging would be turned off, I only meant that it would be 
> turned off in the way that it is for the import procedures. This is my 
> understanding of how that existing logic works:
> o The import procedure cooks up an INSERT statement which 
> specifies "insertMode=bulkInsert". At execution time, if that INSERT 
> statement specifies "insertMode=bulkInsert" AND the table is empty, then 
> a new conglomerate is created on the side and the inserts go into that 
> new conglomerate. As you note, the conglomerate creation is logged and 
> the old conglomerate is replaced with the new conglomerate only if the 
> insert succeeds.

OK, I don't have a problem if the same mechanism, with the same existing
behavior, is being used.  The existing behavior is safe and should not
lead to corruption or to any need for customers to do recovery themselves.
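
To make sure we are talking about the same thing, here is a minimal
sketch of the behavior described in the quoted text.  It is a toy model,
not Derby code: the class name, the in-memory list standing in for a
conglomerate, and the null-row failure are all hypothetical.  It also
assumes that a bulk insert into a non-empty table simply falls back to
an ordinary insert.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; none of these names come from the Derby source.
class BulkInsertSketch {
    // Stand-in for the table's current conglomerate.
    private List<String> conglomerate = new ArrayList<String>();

    // Returns true if the bulk path (build on the side, then swap) was taken.
    boolean bulkInsert(List<String> rows) {
        if (!conglomerate.isEmpty()) {
            // Assumption: a non-empty table gets an ordinary insert.
            conglomerate.addAll(rows);
            return false;
        }
        // Empty table: fill a new conglomerate on the side.
        List<String> fresh = new ArrayList<String>();
        for (String row : rows) {
            if (row == null) {
                // Hypothetical failure; the old conglomerate is never touched.
                throw new IllegalArgumentException("bad row");
            }
            fresh.add(row);
        }
        // Swap only after every row went in successfully.
        conglomerate = fresh;
        return true;
    }

    List<String> rows() {
        return conglomerate;
    }
}
```

In this model a failed load leaves the table exactly as it was, which is
the property that makes the existing behavior safe.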

If the feature is provided, we should document the extra overhead it
may add to an insert, just so someone doesn't put this on all of their
inserts.  The system has to check whether the table is empty before
doing the insert.  For import, the user has obviously gone to the
trouble of using a different command, so an empty-table check is
likely a small cost.  On an ordinary insert statement the cost is not
as obvious.  The worst case for the check is a table that held a large
amount of data that has all been deleted.  The empty check might then
require reading a large number of empty pages, depending on what kind
of space reclamation has gone on.  I think that is also a benefit of
the replace option: we don't need to do the check.
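
A rough illustration of that worst case, again as a toy model rather
than Derby code.  It assumes the empty check walks the pages in order
and stops at the first page that holds a row; the class and method
names are hypothetical, and real page layout and space reclamation are
not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; a "page" here is just a list of rows.
class EmptyCheckSketch {
    // Counts how many pages must be read before the check can answer.
    static int pagesReadToDecideEmpty(List<List<String>> pages) {
        int pagesRead = 0;
        for (List<String> page : pages) {
            pagesRead++;
            if (!page.isEmpty()) {
                // Found a row, so the table is not empty; stop early.
                break;
            }
        }
        return pagesRead;
    }

    // A table that once held data but whose rows were all deleted,
    // with no pages reclaimed.
    static List<List<String>> emptiedTable(int pageCount) {
        List<List<String>> pages = new ArrayList<List<String>>();
        for (int i = 0; i < pageCount; i++) {
            pages.add(new ArrayList<String>());
        }
        return pages;
    }
}
```

With a row on the first page the check reads one page.  With every page
emptied and none reclaimed, it reads all of them before it can say the
table is empty.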

I am not sure the internal syntax is the best way to go.  I believe it 
was originally eliminated because it was non-standard.  Maybe some new
syntax would be more appropriate.
