db-derby-user mailing list archives

From mike matrigali <mikema...@gmail.com>
Subject Re: Issue with large delete in derby
Date Thu, 30 Jan 2014 18:57:30 GMT
I think more information is needed to be able to help.  For instance:
o DDL of the table, its indexes, and its referential constraints
o the query plan of an example delete that takes longer than you expect
Posting a reproducible case using dummy data is the easiest way for the
community to help you.

From the query plan we can tell you whether Derby is doing the expected
row-level locking and using the expected indexes, or whether for some
reason it is doing a whole-table scan.  A table scan could be because the
application does not have appropriate indexes defined, or because Derby has
chosen a bad plan.
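One way to capture a plan is to turn on runtime statistics from an ij
session before running the delete (the table and predicate below are just
placeholders for one of your own deletes):

```sql
-- collect plan and timing information for this connection
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
CALL SYSCS_UTIL.SYSCS_SET_STATISTICS_TIMING(1);

-- run the slow statement (MYSCHEMA.MYTABLE / ID are placeholder names)
DELETE FROM MYSCHEMA.MYTABLE WHERE ID = 42;

-- print the plan for the statement that just executed
VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
```

The output shows which index (if any) was chosen and whether a table scan
occurred, which is the information needed to diagnose the plan.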

For plan problems in 10.5.1.1, I would suggest making sure to run update
statistics on every table in your system.  Alternatively, upgrade to the
latest 10.10 Derby release, which automatically updates the statistics in
the background for you.
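Updating statistics is a system procedure call, one per table (schema and
table names below are placeholders for your own):

```sql
-- refresh index cardinality statistics for one table;
-- NULL for the index name means all indexes on the table
CALL SYSCS_UTIL.SYSCS_UPDATE_STATISTICS('MYSCHEMA', 'MYTABLE', NULL);
```

Stale statistics are a common cause of the optimizer picking a table scan
over an index, so it is worth doing this before digging further.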

I don't know if it fits your application model, but a number of Derby
applications that follow the pattern of inserting a batch of connected data
and later deleting it all use a separate table for each such batch.  The
usual case is that a unit of time of data is collected (say a week), then
the next week's data is put into a separate set of tables, and the
previous week is deleted all at once by dropping its table.  The benefit
is that dropping a table in Derby does far less work than individually
deleting each row in the table.  This is for a few reasons:
1) delete must log a record for each row deleted, while drop need only log
a few records associated with dropping the table(s)
2) delete must do extra post-commit work to recover space for subsequent
inserts, while drop need not do any
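A sketch of the pattern, with hypothetical table and column names standing
in for your own schema:

```sql
-- each time unit (here, a week) gets its own table
CREATE TABLE LOGDATA_WEEK5 (
    ID        BIGINT NOT NULL PRIMARY KEY,
    LOGGED_AT TIMESTAMP,
    PAYLOAD   VARCHAR(32672)
);

-- ... inserts and queries run against LOGDATA_WEEK5 ...

-- when the week expires, discard everything at once: this logs
-- only a handful of records instead of one per deleted row
DROP TABLE LOGDATA_WEEK5;
```

The drop is effectively constant-time relative to row count, whereas a
cascading DELETE pays per row in logging, locking, and post-commit space
reclamation.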

If these applications need to run queries across all the related tables,
there are a few options:
1) hand-build a union
2) use table functions to make all the related tables look like one for
selects
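For option 1, a view over a hand-built union keeps the per-period tables
looking like one table to SELECTs (names again are placeholders):

```sql
-- present the per-week tables as a single logical table
CREATE VIEW LOGDATA_ALL AS
    SELECT ID, LOGGED_AT, PAYLOAD FROM LOGDATA_WEEK5
    UNION ALL
    SELECT ID, LOGGED_AT, PAYLOAD FROM LOGDATA_WEEK6;

SELECT COUNT(*) FROM LOGDATA_ALL;
```

The trade-off is that the view must be dropped and recreated each time a
weekly table is added or dropped, so the rotation job has to manage it.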


One other suggestion.  Your application sounds I/O bound.  If you have
more than one disk on your system, it is possible with Derby to place the
transaction log on one disk and the database on a different disk to get
better I/O performance.
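The log location is set with the logDevice connection URL attribute when
the database is created (the paths below are placeholders; the attribute
has no effect on an already-created database unless the log is moved):

```
ij> CONNECT 'jdbc:derby:/disk1/myDB;create=true;logDevice=/disk2/myDBlog';
```

With this layout, sequential log writes and random data-page I/O no longer
compete for the same spindle.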

On Wed, Jan 29, 2014 at 10:46 PM, Amitava Kundu1 <amitavakundu@in.ibm.com> wrote:

>
> Hi,
> We are using embedded Derby 10.5.1.1 in our product.  This Derby database
> is used as a regular RDBMS where a lot of inserts, deletes and selects
> happen.  There are business entities, each occurrence of which could be
> 10 GB in size and upward, e.g. huge log file data.
> In our application, we use cascade delete and also have referential
> integrity constraints ON.
>
> This application runs on 64-bit Linux with 8 GB RAM allocated to the JVM.
> Similar time is observed on our development Windows box.
>
> It takes more than 3 hours to delete those entities.  During this time all
> the relevant tables stay locked and no other operation is feasible.
>
> We'd like to know what options/strategies could be adopted for:
>    Speeding up the delete process
>    Running other database activities in parallel
>
>
> Thanks
>         Amitava Kundu
>
>
