db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: Efficient loading of calculated data
Date Thu, 13 Dec 2007 01:35:48 GMT
Stavros Macrakis wrote:
> Hi, I have an application whose output is about 500,000 pairs (string,
> integer) -- this is the result of some fairly fancy text processing.
> I'd like to put this data into a (new) Derby table. Using individual
> Inserts for each row takes over an hour, which seems much too long.
> Using the bulk import feature involves writing out to a file and then
> importing from the file, which seems rather roundabout.
>
> So... What is the recommended way to insert a large number of rows
> from an application? Is the answer the same for 10^3 or 10^8 rows? Do
> the data types involved (e.g. large text field with newlines) make any
> difference to the answer?
>
> Thanks,
>
>          -s
>   
Hi  Stavros,

If you can wait a couple months before you deploy this to production, 
then you can use the Table Function feature which is implemented in the 
development trunk and which will be exposed in the 10.4 release in the 
first quarter of next year.

Table Functions let you make arbitrary external data sources look like 
tables to Derby. Basically, any data source which you can wrap in a 
ResultSet can then be treated by Derby as though it were a table. This 
feature is described in the 10.4 Developer's Guide in the section titled 
"Programming Derby-style Table Functions": 
http://db.apache.org/derby/docs/dev/devguide/ A slide deck explaining 
how to use this feature can be found at 
http://people.apache.org/~rhillegas/vtiDemo/doc/saucerSeparation.html A 
useful toolkit is checked into the development trunk in the 
java/demo/vtis subdirectory. Please read the README file. I think you 
can solve your problem easily by extending StringColumnVTI.

Once you have written your table function and declared it to Derby, you 
can then bulk import your external data stream as follows:

insert into MY_TABLE
select * from table( MY_TABLE_FUNCTION() ) s

Hope this helps,
-Rick

Mime
View raw message