db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mike Matrigali (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.
Date Wed, 27 Jan 2010 18:02:34 GMT

     [ https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Mike Matrigali updated DERBY-1482:

Does anyone know what guarantees there are on the "old" and "new" values in
the trigger action.  I understand they should represent before and after values
with respect to triggering statement.  But must they be unaffected by actions
taken in the trigger action.

So for instance you have a before trigger, and the trigger action changes
the value of the row twice.  Do we guarantee that the "old" value will not

I ask this while trying to understand what is necessary for LOB old and new

I am trying to understand how this issue could be addressed without doing
the work Kathey describes to figure out what old and new column values are
referenced.  Without this info I don't think it is possible to guarantee
an old value of a LOB around without somehow saving a copy of the whole stream.
The easiest to understand case is if someone actually updates the LOB
between the time the row was first saved and when the trigger action
references it.  In this case if we have not drained the stream before then
it is not guaranteed to still be valid in the store.  The store does not
guarantee access to the old lob if it has been updated.  Maintaining a lock
on the row will stop other threads from updating it but it could still be
changed by the same transaction.

It would not be hard to figure out if the trigger statement could change
various columns.  It would be slightly harder to figure out if the trigger
action could change the values, as it could be multiple statements.

On the face of this I only see the following possibilities:
1) do the work to figure out what columns the trigger action accesses.  Then
   we only need to save old/new values of those columns referenced.

2) Rather than reading the lobs into memory, existing interfaces could be used
   to stream the lobs to temp tables, and then save a stream reference to these.
   But this still would cause a lot of overhead for a trigger that never
   references the lobs.

3) Define the behaviour in a subset of the cases such that we can use the
   existing lob in the db, and just maintain a stream reference to it.  Somethin
g like:
   o after trigger uses existing lob if trigger statement does not update
     lob column.  Ignore trigger action possibly changing the lob value.
   o before triggers always use exising lob in db, ignores possibility of
     same xact changing the lob value in trigger action.

> Update triggers on tables with blob columns stream blobs into memory even when the blobs
are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions:
>            Reporter: Daniel John Debrunner
>            Priority: Minor
>         Attachments: derby1482Repro.java
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger "tr1" defined
on that table, where the triggered-SQL-action for "tr1" does NOT reference any of the blob
columns in the table. [ Note that this is different from DERBY-438 because DERBY-438 deals
with triggers that _do_ reference the blob column(s), whereas this issue deals with triggers
that do _not_ reference the blob columns--but I think they're related, so I'm creating this
as subtask to 438 ]. In such a case, if the trigger is fired, the blob data will be streamed
into memory and thus consume JVM heap, even though it (the blob data) is never actually referenced/accessed
by the trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row for each
row mode db2sql update t2 set updated = updated + 1 where t2.id = n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be streamed into
memory for each row affected by the trigger. The result is that, if the blob data is large,
we end up using a lot of JVM memory when we really shouldn't have to (at least, in _theory_
we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, and avoid
streaming the lob into memory whenever possible (hence this is probably more of an "enhancement"
request than a bug)... 

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message