db-derby-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <rick.hille...@oracle.com>
Subject Re: Speeding up Derby for very large queries?
Date Mon, 06 Aug 2012 15:36:52 GMT
On 8/6/12 8:16 AM, TXVanguard wrote:
> I'm porting an application from MS Access to Java, and am using Apache Derby
> for the database.
>
> (Note: some of the SQL-related terminology I use below may not be precisely+
> accurate, but hopefully you'll understand my meaning.)
>
> The application has about 75 tables and over 100 queries.  Doing a "SELECT *
> FROM" some of these tables and queries is relatively quick, but many of
> these queries take tens of minutes or even hours.
>
> I have learned that I can speed up some of the queries by defining indexes
> for the tables, but some of my queries are so complex (with multiply layers
> of JOINs, plus WHERE, GROUP BY, HAVING, and ORDER BY clauses) that creating
> indexes does not help.  I have found that if I create temporary tables that
> represent the results of intermediate subqueries, then create indexes on the
> tables, and then use the tables in place of the sub queries, I can speed
> things up considerably, but the amount of work required on my part is
> prohibitive.
>
> Here is an example of part of a typical query (some names have been changed
> to protect the innocent) that I deal with:
>
> CREATE VIEW my_view AS
> SELECT DISTINCT 'Text' AS Cause, pr.ASD, m.Nm AS MN, sn.Nm AS SN, dn.Nm AS
> FTD, 'Text' AS EN, TRIM(CAST(CAST(Count(h.DNID) AS CHAR(30)) AS
> VARCHAR(30)))|| ' label' AS EnFTD
> FROM
>      tblMsg AS m
>      INNER JOIN
>      (
>          tblNo AS dn
>          INNER JOIN
>          (
>              ( tblNo AS sn INNER JOIN tblPR AS pr ON sn.NID = pr.SNID)
>              INNER JOIN
>              tRoutingHops AS h
>              ON (pr.SNID = h.SNID) AND (pr.SPN = h.SPN) AND (pr.SPN = h.OSPN)
> AND (pr.SNID = h.OSNID) AND (pr.ASD = h.ASD)
>          )
>          ON dn.NID = pr.DNID
>      )
>      ON m.ASD = pr.ASD
> WHERE (((pr.DNID)=1 Or (pr.DNID)=2) AND ((h.NetworkID)=0) AND ((h.DNID)<>1
> And (h.DNID)<>2))
> GROUP BY pr.ASD, m.Nm, sn.Nm, dn.Nm
> HAVING ((sn.Nm Like 'ABC%' Or sn.Nm Like 'XY%') And sn.Nm<>'DEF')
> ORDER BY ASD
>
> This query is actually only a sub-query that is joined by a UNION to an even
> larger sub-query to form a full query.
>
> The full query takes over a minute to run in Derby, but only a few tenths of
> a second in Access.   Is there any hope of speeding up these types of
> queries in Derby, or is Derby simple not sophisticated enough to perform the
> types of optimizations required to speed up complicated, multi-level
> queries? (Please note: this is not an attack on Derby.  Derby has
> applications where it is very useful, but I'm wondering if my particular
> application is simply too much for Derby to handle.)
>
> Thanks.
>
It's hard to say anything specific without seeing a problem query and 
its query plan. Maybe too much time is being spent in the optimizer, 
maybe the indexes aren't really useful. That said, the more tables you 
put in a query, the less likely that Derby will pick the best plan. Part 
of the reason for this is that Derby is only capable of executing 
left-deep query plans. Sometimes the best plan is a bushy tree. You can 
simulate bushy query plans by pre-computing interior join results into 
temporary tables. It sounds like this approach does work for you but it 
would be tedious to apply this technique to a hundred queries.

Sorry I can't say more,
-Rick

Mime
View raw message