db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Large multi-record insert performance
Date Thu, 15 Mar 2007 01:06:20 GMT
Do you have some idea what kind of insert performance you need in
your application?  It probably would be easiest for me to understand
in rows per second with some estimate in row size.  Weather data
sounds like it might be very big per row.

So far the suggestions will tend to be single threaded so may not use
up the 2nd processor on your machine, depending on the other work your
app server needs to do this may be just fine.  Derby will automatically 
multiple threads inserting into the same table, but I usually suggest
applications first see if a single thread, with large groups of inserts
per commit, on an embedded derby instance meet their need.  Once you
get into multiple threads you have to start to worry about lock 
waits/deadlocks between the 2/N threads depending out the data.

Inserts will be very logging intensive, so putting the log on a separate 
disk from your data is the first step to avoid being I/O bound.  If you
find your data disk I/O bound then some sort of raid/striping may help

I didn't realize the import and insert questions were coming from the
same app.  I assume the import usage is a one-time thing and then from
your most recent comments you need to stream data into the db as it
comes in.

Sedillo, Derek (Mission Systems) wrote:
> Actually this will be a part of the application logic.  We have real 
> time weather data which we constantly receive and insert into the DB.
> - Derek
> ------------------------------------------------------------------------
> *From:* Mamta Satoor [mailto:msatoor@gmail.com]
> *Sent:* Wednesday, March 14, 2007 5:02 PM
> *To:* Derby Discussion
> *Subject:* Re: Large multi-record insert performance
> If this bulk insert is not normal part of application logic and only 
> done once in a while then I wonder if import using 
> *SYSCS_UTIL.SYSCS_IMPORT _TABLE *would be a faster way to load data.
> Mamta
> On 3/14/07, *Lance J. Andersen* <Lance.Andersen@sun.com 
> <mailto:Lance.Andersen@sun.com>> wrote:
>     Mike Matrigali wrote:
>      >
>      >
>      > Lance J. Andersen wrote:
>      >>
>      >>
>      >> Even if the backend does not provide optimization for batch
>      >> processing, i would hope that there would be still some efficiency
>      >> especially in a networked environment vs building the strings,
>      >> invoking execute() 1000 times in the amount of data on the wire...
>      >>
>      >>
>      > I could not tell from the question whether this was network or
>     not.  I
>      > agree in network then limiting execution probably is best.  In
>     embedded
>      > I am not sure - I would not be surprised if doing 1000 in batch is
>      > slower than just doing the executes.
>      >
>      > In either case I really would stay away from string manipulation
>     as much
>      > as possible and also stay away from things that create very long SQL
>      > statements like 1000 term values clauses.
>     i agree completely.  Let the driver do the heavy lifting :-)

View raw message