db-derby-dev mailing list archives

From: Mike Matrigali <mikem_...@sbcglobal.net>
Subject: Re: bulk import issue
Date: Tue, 01 Dec 2009 17:49:20 GMT
Are you bulk importing into an empty table?

Derby has a built-in optimization that can often be applied when bulk
importing into an empty table.  If the db is not in incremental backup
mode, a bulk import into an empty table does not have to log the
individual changes.  Instead, the abort action is optimized to simply
empty the table again, so no per-row log records are needed.  This is
not possible if the table already contains rows.
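
As a rough sketch (reusing the file names from your mail, and assuming
'mytable' starts out empty), only the first of these two calls gets the
fast path:

CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'a.txt', ' ', null, null, 0)
-- 'mytable' is empty: rows go in without per-row logging, since an abort
-- can be handled by simply emptying the table again
CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'b.txt', ' ', null, null, 0)
-- 'mytable' now holds the rows from a.txt: every change must be logged,
-- so this second call is much slower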

/mikem

Mike Andrews wrote:
> Dear Derby developers,
> 
> If I bulk import data into a table, I get much better performance when
> I do it in a single SYSCS_UTIL.SYSCS_IMPORT_TABLE statement rather
> than in multiple shots.
> 
> For example, if there are three large files "a.txt", "b.txt", and
> "c.txt", where "c.txt" is just the concatenation of "a.txt" and
> "b.txt", then
> 
> CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'c.txt', ' ', null, null, 1)
> 
> takes much less time than the sum of:
> 
> CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'a.txt', ' ', null, null, 1)
> CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE (null, 'mytable', 'b.txt', ' ', null, null, 0)
> 
> even though they result in exactly the same set of data in the table.
> 
> Any ideas why? Is there a way to get better performance doing it in
> multiple shots? Currently my data is spread across several text files,
> so I concatenate them all and run a single SYSCS_UTIL.SYSCS_IMPORT_TABLE
> for best performance.
> 
> Best regards,
> Mike
> 

