what indexes do you have? I would definitely put an index on id. I have blob tables with large image files, and I'm measuring end-to-end delete (from web browser application to server and back) at around 100 ms.
XML? Too much like HTML. It'll never work on the Web!

From: Andrew Lawrenson <andrew.lawrenson@coppereye.com>
To: "derby-user@db.apache.org" <derby-user@db.apache.org>
Sent: Monday, April 20, 2009 2:53:14 AM
Subject: Blobs & Delete speed



I’ve been experiencing a performance problem with deleting blobs in derby, and was wondering if anyone could offer any advice.


This is primarily with under windows and solaris, although I’ve also tested with the new release candidate (as it has many lob changes), but this makes no significant difference.


The problem is that with a table containing many large blobs, deleting a single row can take a long time (often over a minute).


I’ve reproduced this with a small test that creates a table, inserts a few rows with blobs of differing sizes, then deletes them.


The table schema is simple, just:


create table blobtest( id integer generated BY DEFAULT as identity, b blob )


and I’ve then created 7 rows with the following blob sizes : 1024 bytes, 1Mb, 10Mb, 25Mb, 50Mb, 75Mb, 100Mb.

I’ve read the blobs back, to check they have been created properly and are the correct size.


They have then been deleted using the sql statement ( “delete from blobtest where id = X” ).


If I delete the rows in the order I created them, average timings to delete a single row are:


1024 bytes: 19.5 seconds

1Mb: 16 seconds

10Mb: 18 seconds

25Mb: 15 seconds

50Mb: 17 seconds

75Mb: 10 seconds

100Mb: 1.5 seconds


If I delete them in reverse order, the average timings to delete a single row are:

100Mb: 20 seconds

75Mb: 10 seconds

50Mb: 4 seconds

25Mb: 0.3 seconds

10Mb: 0.25 seconds

1Mb: 0.02 seconds

1024 bytes: 0.005 seconds


If I create seven small blobs, delete times are all instantaneous.

It thus appears that the delete time seems to be related to the overall size of the rows in the table more than the size of the blob being removed.

I’ve run the tests a few times, and the results seem reproducible.


So, does anyone have any explanation for the performance, and any suggestions on how to work around it or fix it?  It does make using large blobs quite problematic in a production environment…


Many thanks in advance,