db-derby-dev mailing list archives

From "Rick Hillegas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.
Date Fri, 28 May 2010 15:11:36 GMT

    [ https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873006#action_12873006 ]

Rick Hillegas commented on DERBY-1482:
--------------------------------------

Thanks for the patch, Mamta. If I understand correctly, users should expect to see the following
behaviors:

1) No behavior change for legacy triggers created before 10.7 is released.

2) No behavior change for triggers created in soft-upgraded databases.

3) Potential performance improvement for triggers created in new 10.7 databases.

4) Potential performance improvement for triggers created in legacy databases after hard-upgrade
to 10.7.

Before looking into the details of this patch, I would like to explore an alternative solution.
Maybe this solution has already been considered and rejected. If so, I apologize for the noise.
This alternative approach would bring the performance improvement to more cases and would
avoid the soft-upgrade and serialization issues. I think that it would re-use most of the
code which you are supplying with the current patch:

A) Do not change what is stored in SYSTRIGGERS.

B) Instead, the very first time that a trigger is run, if there is a REFERENCING clause, re-parse
the trigger text in order to find the columns that are actually needed.

C) Store the extra referenced column information in a transient field of the trigger descriptor
for use by later firings.
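
To make (B) and (C) concrete, here is a rough sketch of the lazy caching idea. The class and
method names below are purely illustrative (they are not the actual Derby TriggerDescriptor
code), and the crude text scan only stands in for a real re-parse through the SQL compiler:

// Hypothetical sketch of steps (B) and (C); none of these names come from the Derby code base.
import java.util.Set;
import java.util.TreeSet;

public class TriggerDescriptorSketch {

    private final String triggerText;          // still stored unchanged in SYSTRIGGERS (step A)
    private final String newCorrelationName;   // e.g. "n_row" from REFERENCING NEW AS n_row, or null

    // Step (C): a transient cache that is never written to disk, so there is no
    // serialization or soft-upgrade impact. It is simply recomputed after a restart.
    private transient Set<String> referencedColumns;

    public TriggerDescriptorSketch(String triggerText, String newCorrelationName) {
        this.triggerText = triggerText;
        this.newCorrelationName = newCorrelationName;
    }

    /** Called at firing time to decide which columns of the affected row must be read. */
    public Set<String> getReferencedColumns() {
        if (newCorrelationName == null) {
            return null;                       // no REFERENCING clause: nothing to narrow
        }
        if (referencedColumns == null) {
            // Step (B): on the very first firing only, re-examine the trigger text to
            // find the transition-row columns that the triggered SQL actually uses.
            referencedColumns = scanForColumns(triggerText, newCorrelationName);
        }
        return referencedColumns;              // later firings reuse the cached result
    }

    // Crude stand-in for a real parse: collect tokens of the form <correlation>.<column>.
    private static Set<String> scanForColumns(String sql, String correlation) {
        Set<String> cols = new TreeSet<String>();
        String prefix = correlation.toUpperCase() + ".";
        for (String token : sql.toUpperCase().split("[^A-Z0-9_.]+")) {
            if (token.startsWith(prefix)) {
                cols.add(token.substring(prefix.length()));
            }
        }
        return cols;
    }

    public static void main(String[] args) {
        TriggerDescriptorSketch td = new TriggerDescriptorSketch(
            "update t2 set updated = updated + 1 where t2.id = n_row.id", "n_row");
        System.out.println("first firing computes: " + td.getReferencedColumns());   // [ID]
        System.out.println("later firings reuse:   " + td.getReferencedColumns());
    }
}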

The disadvantage of this approach is that the first firing of a trigger would incur an extra
compilation tax. I think that this tax would not be noticed.

The advantage of this approach is that the performance improvement would be seen in cases
(1) and (2) above and not just in cases (3) and (4). In addition, we would avoid the tricky
serialization incompatibilities.

Thanks,
-Rick

> Update triggers on tables with blob columns stream blobs into memory even when the blobs
are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Assignee: Mamta A. Satoor
>            Priority: Minor
>         Attachments: derby1482_patch1_diff.txt, derby1482_patch1_stat.txt, derby1482_patch2_diff.txt,
derby1482_patch2_stat.txt, derby1482_patch3_diff.txt, derby1482_patch3_stat.txt, derby1482DeepCopyAfterTriggerOnLobColumn.java,
derby1482Repro.java, derby1482ReproVersion2.java, junitUpgradeTestFailureWithPatch1.out, TriggerTests_ver1_diff.txt,
TriggerTests_ver1_stat.txt
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger "tr1" defined
on that table, where the triggered-SQL-action for "tr1" does NOT reference any of the blob
columns in the table. [ Note that this is different from DERBY-438 because DERBY-438 deals
with triggers that _do_ reference the blob column(s), whereas this issue deals with triggers
that do _not_ reference the blob columns--but I think they're related, so I'm creating this
as a subtask to 438 ]. In such a case, if the trigger is fired, the blob data will be streamed
into memory and thus consume JVM heap, even though it (the blob data) is never actually referenced/accessed
by the trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row for each row mode db2sql update t2 set updated = updated + 1 where t2.id = n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be streamed into
memory for each row affected by the trigger. The result is that, if the blob data is large,
we end up using a lot of JVM memory when we really shouldn't have to (at least, in _theory_
we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, and avoid
streaming the lob into memory whenever possible (hence this is probably more of an "enhancement"
request than a bug)... 
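
As a concrete illustration of the scenario above, here is a minimal, hypothetical JDBC sketch.
It is not the attached derby1482Repro.java; it just reuses the DDL from the description and
assumes derby.jar (10.5 or later, for the in-memory back end) on the classpath:

// Hypothetical repro sketch for DERBY-1482; DDL copied from the issue description.
import java.io.ByteArrayInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class BlobTriggerSketch {
    public static void main(String[] args) throws Exception {
        // Embedded, in-memory database; the driver auto-registers from derby.jar.
        Connection conn = DriverManager.getConnection("jdbc:derby:memory:d1482;create=true");

        try (Statement s = conn.createStatement()) {
            s.executeUpdate("create table t1 (id int, status smallint, bl blob(2G))");
            s.executeUpdate("create table t2 (id int, updated int default 0)");
            s.executeUpdate("create trigger tr1 after update of status on t1 "
                    + "referencing new as n_row for each row mode db2sql "
                    + "update t2 set updated = updated + 1 where t2.id = n_row.id");
            s.executeUpdate("insert into t2 values (1, 0)");
        }

        // One row with a 50 MB blob (allocated in memory here only to keep the sketch short).
        int blobSize = 50 * 1024 * 1024;
        try (PreparedStatement ps = conn.prepareStatement("insert into t1 values (1, 0, ?)")) {
            ps.setBinaryStream(1, new ByteArrayInputStream(new byte[blobSize]), blobSize);
            ps.executeUpdate();
        }

        // Firing tr1: the trigger never touches column BL, yet (per this issue) the blob is
        // pulled into the JVM heap for the affected row. Watch heap use, or run with a small
        // -Xmx, while this statement executes.
        try (Statement s = conn.createStatement()) {
            s.executeUpdate("update t1 set status = 3");
            try (ResultSet rs = s.executeQuery("select updated from t2 where id = 1")) {
                rs.next();
                System.out.println("trigger fired, t2.updated = " + rs.getInt(1));
            }
        }

        conn.close();
    }
}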

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

