db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Updated: (DERBY-1482) Update triggers on tables with blob columns stream blobs into memory even when the blobs are not referenced/accessed.
Date Mon, 15 Feb 2010 22:38:28 GMT

     [ https://issues.apache.org/jira/browse/DERBY-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Mamta A. Satoor updated DERBY-1482:
-----------------------------------

    Attachment: derby1482ReproVersion2.java

The current reproducible attached to this jira derby1482Repro.java really runs into OOM because
it can't get enough page cache from the heap available to the jvm. This problem can be easily
resolved by decreasing the amount of page cache requested from the heap. This can be achieved
by using the property -Dderby.storage.pageCacheSize=smallerpagecache. Once the page cache
requirement is lowered derby1482Repro.java runs fine without any problems with LOB streaming.
The reason we don't run into OOM for streaming of LOB with derby1482Repro.java is that the
LOB data size is not big enough. derby1482Repro.java is inserting only 30K data into the 2G
LOB(notice I am talking about data size and not column size. LOB column is declared as 2G
but data loaded into it is only 30K). Store does not stream data out of store when the data
size is less than the page size (in Derby, page size is 32K). In order for us to cause the
data to be read in the memory, the data size has to be bigger. 

What we are really after is a reproducible program which runs into OOM while LOBs are being
streamed into memory as part of trigger execution when the LOBs are really not needed to be
streamed in because they are not being accessed/updated. I have spent some time on coming
up with a repro for OOM as a result of LOB streaming. That repro is attached as derby1482ReproVersion2.java
This test case runs into OOM while streaming the LOB because now the LOB data is 320M in size,
much larger than the page size of 32K. Since the BLOB is so large, Store will stream it into
memory, thus running into OOM). 
To reproduce the problem, use the following steps
1)First create the database with 3 populated tables using the attached java program 
derby1482ReproVersion2.java as follows 
java org.apache.derbyTesting.functionTests.tests.lang.derby1482ReproVersion2
Some information about the 3 tables 
   a)Table t1 has a BLOB column along with other columns. 
   b)Table t3 is similar to t1 but without a BLOB column 
   c)Table t1 has an update trigger defined on it which updates table t2 in the trigger action.

   d)Table t3 also has an update trigger defined on it which also updates table t2 in the
trigger action. 
2)To reproduce the problem of OutOfMemory, connect to the database using ij with very limited
heap 
java -Xmx4m -Dderby.storage.pageCacheSize=40 -Dij.exceptionTrace=true org.apache.derby.tools.ij

connect 'jdbc:derby:testDB1482;create=true'; 
--following update of table t3 works with no problem. This is the table with no BLOB column

update t3 set status = 1 where id = 1; 
--But the following update will fail with OOM. This update is on the table with BLOB column

update t1 set status = 1 where id = 1; 

> Update triggers on tables with blob columns stream blobs into memory even when the blobs
are not referenced/accessed.
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1482
>                 URL: https://issues.apache.org/jira/browse/DERBY-1482
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.2.1.6
>            Reporter: Daniel John Debrunner
>            Priority: Minor
>         Attachments: derby1482DeepCopyAfterTriggerOnLobColumn.java, derby1482Repro.java,
derby1482ReproVersion2.java
>
>
> Suppose I have 1) a table "t1" with blob data in it, and 2) an UPDATE trigger "tr1" defined
on that table, where the triggered-SQL-action for "tr1" does NOT reference any of the blob
columns in the table. [ Note that this is different from DERBY-438 because DERBY-438 deals
with triggers that _do_ reference the blob column(s), whereas this issue deals with triggers
that do _not_ reference the blob columns--but I think they're related, so I'm creating this
as subtask to 438 ]. In such a case, if the trigger is fired, the blob data will be streamed
into memory and thus consume JVM heap, even though it (the blob data) is never actually referenced/accessed
by the trigger statement.
> For example, suppose we have the following DDL:
>     create table t1 (id int, status smallint, bl blob(2G));
>     create table t2 (id int, updated int default 0);
>     create trigger tr1 after update of status on t1 referencing new as n_row for each
row mode db2sql update t2 set updated = updated + 1 where t2.id = n_row.id;
> Then if t1 and t2 both have data and we make a call to:
>     update t1 set status = 3;
> the trigger tr1 will fire, which will cause the blob column in t1 to be streamed into
memory for each row affected by the trigger. The result is that, if the blob data is large,
we end up using a lot of JVM memory when we really shouldn't have to (at least, in _theory_
we shouldn't have to...).
> Ideally, Derby could figure out whether or not the blob column is referenced, and avoid
streaming the lob into memory whenever possible (hence this is probably more of an "enhancement"
request than a bug)... 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message