db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.
Date Thu, 27 May 2010 20:21:36 GMT

     [ https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mamta A. Satoor updated DERBY-1482:
-----------------------------------

    Attachment: derby1482_patch3_diff.txt
                derby1482_patch3_stat.txt

Attaching patch derby1482_patch3_diff.txt which along with the core work also takes care of
upgrade. Please share your comments on the patch.

Here are the patch comments
DERBY-1482 will improve on Derby's behavior about which columns should be read in when a trigger
fires. The current behavior of Derby reads all the columns from the trigger table whether
or not all the columns are needed for trigger execution. This can be troublesome when the
trigger table has large data LOB columns causing Derby to run out of memory. With this patch,
Derby will narrow down cases where it should be sufficient to read only a subset of columns
from the trigger table.
eg of such a trigger would be the trigger below which has both trigger columns and trigger
action columns through old/new transition variables
	 * CREATE TRIGGER tr1 AFTER UPDATE OF c12 ON t1
	 *    REFERENCING OLD AS oldt NEW AS newt
	 *    FOR EACH ROW UPDATE t2 SET c24=oldt.j;
For a trigger like above, Derby knows exactly which columns are needed from the trigger table
during trigger execution. A note of those columns to read will be made in SYSTRIGGERS table.

The general rules for what columns will be read in during a trigger execution will be as follows
Rule1)If trigger column information is null, then Derby will read all the columns from trigger
table into memory irrespective of whether there is any trigger action column information.
eg triggers
            create trigger tr1 after update on t1 for each row values(1); 
            create trigger tr1 after update on t1 referencing old as oldt for each row insert
into t2 values(2,oldt.j,-2); 

Rule2)If trigger column information is available but no trigger action column information
is found and no REFERENCES clause is used for the trigger, then read all the columns identified
by the trigger column. eg trigger
            create trigger tr1 after update of c1 on t1 for each row values(1); 

Rule3)If trigger column information and trigger action column information both are not null,
then only those columns will be read into memory. This will be a trigger created in 10.7 or
higher level db. Because prior to that we did not collect trigger action column informatoin.
eg
            create trigger tr1 after update of c1 on t1 referencing old as oldt for each row
insert into t2 values(2,oldt.j,-2); 

Rule4)If trigger column information is available and no trigger action column information
is found but REFERENCES clause is used for the trigger, then we will read all the columns
from the trigger table. I believe that this will cover soft-upgrade and hard-
upgrade scenario during trigger execution of triggers created pre-10.7. This will prevent
us from having special logic of soft-upgrade. 
Additionally, this logic makes invalidation of existing triggers unnecessary during hard-upgrade.
The pre-10.7 created triggers will work just fine even though for some triggers, they would
have trigger action columns missing. A user can choose to drop and recreate such triggers
to take advantage of DERBY-1482 work which will avoid unnecssary column reads during trigger
execution if the trigger action column information was available along with trigger column
info. eg trigger created prior to 10.7
            create trigger tr1 after update of c1 on t1 referencing old as oldt for each row
insert into t2 values(2,oldt.j,-2);
The Rule4) is there to cover triggers created with pre-10,7 releases but now that database
has been hard/soft-upgraded to 10.7
or higher version. Prior to 10.7, we did not collect any information about trigger action
columns. So, both of the 2 kinds of triggers shown above prior to 10.7 will not have any trigger
action column info on them in SYSTRIGGERS table. Without our last rule about what columns
should be read in, we will run into problems for soft-upgrade or hard-upgrade causing us to
not read all the required columns.The trigger eg above if created prior to 10.7 mode will
have trigger column and REFERENCING clause set in SYSTRIGGERS but will not have trigger action
column info even 	though there are columns referenced in trigger action through old/new transition
variables. This same trigger if created in 10.7 or higher 	version will have trigger column,
REFERENCING clause and trigger action column set for it in SYSTRIGGERS. To handle this on
a hard/soft-upgraded db, we will just choose to read all the columns from trigger table if
SYSTRIGGER shows trigger column and REFERENCING clause for a trigger but no trigger action
column. 

Rule5)The only place we will need special code for soft-upgrade is during trigger creation.
If we are in soft-upgrade mode, we want to make sure that we do not save information about
trigger action columns in SYSTRIGGERS because the releases prior to 10.7 do not understand
trigger action column information.
	


Let's see with examples, when the rules described above kick in
Consider following 4 trigger scenarios 
            1) create trigger tr1 after update on t1 for each row values(1); 
            2) create trigger tr1 after update on t1 referencing old as oldt for each row
insert into t2 values(2,oldt.j,-2); 
            3) create trigger tr1 after update of c1 on t1 for each row values(1); 
            4) create trigger tr1 after update of c1 on t1 referencing old as oldt for each
row insert into t2 values(2,oldt.j,-2); 

And these different triggers could be created in following various combination of soft-upgrade,
hard-upgrade, brand new db
	a)trigger is getting created in newly created 10.7 db 
	b)trigger already created in the pre-10.7db before soft upgrade 
	c)trigger is getting created while in soft upgrad mode with pre-10.7 db 
	d)trigger already created in the pre-10.7db before hard upgrade 
	e)trigger is getting created after pre-10.7db is hard upgraded 

All the db variations for trigger scenario 1 will be covered by Rule1).
All the db variations for trigger scenario 2 will be covered by Rule1)
All the db variations for trigger scenario 3 will be covered by Rule2).
For trigger scenario 4, different rules will apply depending on db variations
	For 4a), Rule3) will apply
	For 4b), Rule4) will apply
	For 4c), Rule5) will apply
	For 4d), Rule4) will apply
	For 4e), Rule3) will apply


> Update triggers on tables with blob columns stream blobs into memory even when the blobs
are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Assignee: Mamta A. Satoor
>            Priority: Minor
>         Attachments: derby1482_patch1_diff.txt, derby1482_patch1_stat.txt, derby1482_patch2_diff.txt,
derby1482_patch2_stat.txt, derby1482_patch3_diff.txt, derby1482_patch3_stat.txt, derby1482DeepCopyAfterTriggerOnLobColumn.java,
derby1482Repro.java, derby1482ReproVersion2.java, junitUpgradeTestFailureWithPatch1.out, TriggerTests_ver1_diff.txt,
TriggerTests_ver1_stat.txt
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger "tr1" defined
on that table, where the triggered-SQL-action for "tr1" does NOT reference any of the blob
columns in the table. [ Note that this is different from DERBY-438 because DERBY-438 deals
with triggers that _do_ reference the blob column(s), whereas this issue deals with triggers
that do _not_ reference the blob columns--but I think they're related, so I'm creating this
as subtask to 438 ]. In such a case, if the trigger is fired, the blob data will be streamed
into memory and thus consume JVM heap, even though it (the blob data) is never actually referenced/accessed
by the trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row for each
row mode db2sql update t2 set updated = updated + 1 where t2.id = n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be streamed into
memory for each row affected by the trigger. The result is that, if the blob data is large,
we end up using a lot of JVM memory when we really shouldn't have to (at least, in _theory_
we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, and avoid
streaming the lob into memory whenever possible (hence this is probably more of an "enhancement"
request than a bug)... 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message