db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Segel" <mse...@segel.com>
Subject RE: Strange performance issue with BLOB's
Date Wed, 12 Jul 2006 13:58:20 GMT
Ok... Is it a bug? If so where?

I mean, heck. How should the optimizer react when it has no hints from the
select statement. One would think that a table scan would make sense.
Especially when you have such a small data set.

The interesting thing is that the table scan took so long when you only had
*27* rows?

Unless you're saying that when Derby does a table scan, its doing a complete
scan on the blobs itself?  (Meaning it would load the blob itself in to
memory?)

Then that doesn't make sense.

I'm going to assume that this is a repeatable problem.
So one has to ask, what happens in a table scan? 
(And that's a loaded question. I mean do you need the blob, or just the blob
header based on the query presented?)

But hey!
What do I know?
I'm still thinking about a way to walk a B+ Tree to find an open slot
quickly.... ;-) (Then I'll need to figure out the math to prove it. ;-)

-G


> -----Original Message-----
> From: Piet Blok [mailto:pbhome@wanadoo.nl]
> Sent: Wednesday, July 12, 2006 12:46 AM
> To: Derby Discussion
> Subject: Strange performance issue with BLOB's
> 
> Hi all,
> 
> When experimenting with BLOB's I ran into a performance issue
> that I cannot completely explain, but it could be a bug.
> 
> Given the following table:
> 
> CREATE TABLE BLOB_TABLE (
> BLOB_ID BIGINT NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1,
> INCREMENT BY 1),
> BLOB_SIZE BIGINT NOT NULL,
> BLOB_CRC BIGINT NOT NULL,
> BLOB_NAME VARCHAR(255) NOT NULL,
> BLOB_DATA BLOB(2G) NOT NULL,
> UNIQUE (BLOB_CRC, BLOB_SIZE),
> PRIMARY KEY (BLOB_ID)
> );
> 
> which is populated with 27 rows,
> where the sum of all BLOB sizes is 5,885,060,164 bytes
> (about 200 MB average per BLOB, but ranging from 10 MB to 750 MB).
> 
> Some queries on this table are executed really
> fast (almost instantaneous response).
> 
> However, the following query needs about 10 minutes to complete:
> 
> SELECT BLOB_ID, BLOB_NAME, BLOB_SIZE, BLOB_CRC FROM BLOB_TABLE;
> 
> I reasoned that maybe Derby is scanning the whole table
> (including the blob contents) so I tried to add a dummy WHERE
> clause (dummy because all BLOB_ID's are greater than 0)
> to offer a clue as to what rows (all of course) are needed,
> as follows
> 
> SELECT BLOB_ID, BLOB_NAME, BLOB_SIZE, BLOB_CRC FROM BLOB_TABLE WHERE
> BLOB_ID
>  > 0;
> 
> and that helped: instantaneous response.
> 
> But I really think that the original query,
> without the where clause, should not be any slower.
> 
> 
> I am using Derby 10.1.3.1 embedded, Windows XP and Sun Java 1.5.0_06.
> Both queries executed with a Statement, not a PreparedStatement.
> 
> Kind regards,
> 
> Piet Blok
> 




Mime
View raw message