Mailing-List: contact derby-user-help@db.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Derby Discussion" <derby-user@db.apache.org>
Received-SPF: pass (athena.apache.org: domain of lists@nabble.com designates
 216.139.236.158 as permitted sender)
Message-ID: <19039527.post@talk.nabble.com>
Date: Mon, 18 Aug 2008 13:36:23 -0700 (PDT)
From: Iwud H8u <iwudh8u@hotmail.com>
To: derby-user@db.apache.org
Subject: Re: memory usage for row delete cascade
In-Reply-To: <48A9CC08.4070403@sun.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
References: <BAY113-W52953F3CB3E5D316C5DFFFE56E0@phx.gbl>
 <20080818121440.E42255D7F0@dbrack01.segel.com>
 <19034510.post@talk.nabble.com> <48A9A361.6000007@sun.com>
 <19035693.post@talk.nabble.com> <48A9CC08.4070403@sun.com>


Rick Hillegas-2 wrote:
 
> Hi Jay,
> 
> You may be able to get the behavior you want by adding a DELETE trigger 
> to child_info or to progeny. The trigger would fire a Java PROCEDURE to 
> keep your tables in sync. For more information, please see the "CREATE 
> TRIGGER statement" section of the Reference Guide. Alternatively, if the 
> orphaned parent_info rows are harmless cruft, you might get away with 
> garbage-collecting them lazily when your application is idle.

Yes, I was thinking of going the way of the triggers if the on delete
cascade behaviour doesnt not improve. Hopefully that should get rid of my
little problem. The ON DELETE cascade seemed like a much more
straightforward way of doing it. Also I never experienced this issue with
HSQLDB which I was using before I migrated to Derby. I migrated because
Derby offers encryption out of the box unlike HSQLDB ... :)  See below for
why I am pointing fingers at Derby ...

>> I am still hoping that someone might answer my original memory
>> usage/computational overhead question on row deletion.... Do you have any
>> ideas about that?
>>   
> It sounds as though you have embedded Derby inside a sophisticated UI. 
> Why do you think the memory/cpu burp is in Derby rather than somewhere 
> else in your application?

Yes, I have tried to create a rich client (using some ideas from the filthy
java clients book) but then I am pretty sure my UI is not to blame because I
profile the app resource usage regularly. Also here is a debugging script
which just runs a delete statement ... (script below). The time for this
little script to execute is 6000ms (6 sec) which for a UI isn't acceptable.
Also it spikes the CPU usage to 100% for a little while and freezes up the
app (ultimately causing an OOM exception).

        logger.debug("Before calling create statement");
        long before = Calendar.getInstance().getTimeInMillis();
        Statement st = connection.createStatement();
        logger.debug("statement created");
        int result = st.executeUpdate("DELETE FROM ATTENDANCE_INFO WHERE
CHILD_ID = "+2);
        long after = Calendar.getInstance().getTimeInMillis();
        logger.debug("Query time = "+(after-before));

I run this bit of code from a simple test class and the memory spikes from
40MB to 100-120MB just as after it creates the statement ... 


Rick Hillegas-2 wrote:
> 
> Iwud H8u wrote:
>> Hi Rick,
>>
>> That is a good suggestion. I had originally envisaged a new row for each
>> child in the parent_info table. Your schema suggestion looks good expect,
>> when deleting a child record, I will now have to delete the parent from
>> parent_info after checking no other child has the same parent!
>>   
> Hi Jay,
> 
> You may be able to get the behavior you want by adding a DELETE trigger 
> to child_info or to progeny. The trigger would fire a Java PROCEDURE to 
> keep your tables in sync. For more information, please see the "CREATE 
> TRIGGER statement" section of the Reference Guide. Alternatively, if the 
> orphaned parent_info rows are harmless cruft, you might get away with 
> garbage-collecting them lazily when your application is idle.
>> I am still hoping that someone might answer my original memory
>> usage/computational overhead question on row deletion.... Do you have any
>> ideas about that?
>>   
> It sounds as though you have embedded Derby inside a sophisticated UI. 
> Why do you think the memory/cpu burp is in Derby rather than somewhere 
> else in your application?
> 
> Regards,
> -Rick
> 
> 
>> Thanks,
>> Jay
>>
>>
>> Rick Hillegas-2 wrote:
>>   
>>> Hi Jay,
>>>
>>> I have a couple comments about your schema:
>>>
>>> 1) You don't need to create ChildIdIndex. This is because Derby creates 
>>> backing indexes for UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints. 
>>> For more information, please see the section titled "CONSTRAINT clause" 
>>> in the Derby Reference Guide: http://db.apache.org/derby/docs/10.4/ref/
>>>
>>> 2) I am puzzled by the relationship between child_info and parent_info. 
>>> Do you really intend a parent to have only one child? Or have you 
>>> denormalized the parent_info table so that each parent has multiple rows 
>>> in parent_info, one for each of their children? You may want to consider 
>>> normalizing this schema as follows:
>>>
>>> a) remove the child_id column from parent_info
>>>
>>> b) create an additional table to model the many-to-many relationship 
>>> between parents and children:
>>>
>>> create table progeny
>>> (
>>>     parent_id int not null foreign key references parent_info( parent_id 
>>> ) on delete cascade,
>>>     child_id int not null foreign key references child_info( child_id ) 
>>> on delete cascade
>>> )
>>>
>>> Hope this helps,
>>> -Rick
>>>
>>> Iwud H8u wrote:
>>>     
>>>> Michael Segel wrote:
>>>>   
>>>>       
>>>>> Since you're a bit cryptic..
>>>>>
>>>>>  Hmm sorry I wasn't intending to be cryptic ... was trying not to be
>>>>> verbose! 
>>>>>
>>>>> You are right about the column IDs ... lemme post the actual SQL
>>>>> statements for creating the tables themselves... nothing confidential
>>>>> in
>>>>> them I guess.... :)
>>>>>
>>>>> ==================== tables  ==========================
>>>>>  CREATE TABLE Child_info (child_id INTEGER NOT NULL PRIMARY KEY
>>>>> GENERATED
>>>>> ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) , 
>>>>>                         first_name VARCHAR(32) NOT NULL, middle_name
>>>>> VARCHAR(32) NOT NULL, 
>>>>>                         last_name VARCHAR(32) NOT NULL)
>>>>>
>>>>> CREATE TABLE parent_info (parent_id INTEGER NOT NULL PRIMARY KEY
>>>>> GENERATED
>>>>> ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) , " +
>>>>>                         "first_name VARCHAR(32) NOT NULL, middle_name
>>>>> VARCHAR(32) DEFAULT ' ', " +
>>>>>                         "last_name VARCHAR(32) NOT NULL,  child_id
>>>>> INTEGER
>>>>> NOT NULL ,
>>>>> FOREIGN KEY(child_id) REFERENCES Child_info (child_id) ON DELETE
>>>>> CASCADE)
>>>>>
>>>>> CREATE TABLE attendance_info (attendance_id INTEGER NOT NULL PRIMARY
>>>>> KEY
>>>>> GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1) , 
>>>>>                         monday VARCHAR(16),
>>>>>                         tuesday VARCHAR(16), wednesday VARCHAR(16) ,
>>>>> thursday VARCHAR(16) ,
>>>>>                         friday VARCHAR(16) ,
>>>>> UNIQUE (child_id) , 
>>>>> FOREIGN KEY(monday) REFERENCES sessions (session_name)  ON DELETE
>>>>> CASCADE, 
>>>>>                         FOREIGN KEY(tuesday) REFERENCES sessions
>>>>> (session_name) ON DELETE CASCADE, FOREIGN KEY(wednesday) REFERENCES
>>>>> sessions (session_name) ON DELETE CASCADE, 
>>>>>                         FOREIGN KEY(thursday) REFERENCES sessions
>>>>> (session_name) ON DELETE CASCADE, 
>>>>>                         FOREIGN KEY(friday) REFERENCES sessions
>>>>> (session_name) ON DELETE CASCADE, 
>>>>>                         FOREIGN KEY(room_name) REFERENCES rooms
>>>>> (room_name) ON DELETE CASCADE, 
>>>>>                         FOREIGN KEY(child_id) REFERENCES Child_info
>>>>> (child_id)  ON DELETE CASCADE )
>>>>>
>>>>> As you can see the lines in bold declare the primary key and foreign
>>>>> keys
>>>>> (with on delete cascade). 
>>>>>
>>>>> I am guessing child_id does not need a separate index to be generated
>>>>> because it is the primary key in Child_info table and atttendance_info
>>>>> tables. However child_id cant be either unique or primary in
>>>>> parent_info
>>>>> table because there will be atleast two parents with the same
>>>>> child_id.
>>>>>
>>>>> I however created and index on this table using 
>>>>>
>>>>> CREATE INDEX ChildIdIndex ON PARENT_INFO(CHILD_ID)
>>>>>
>>>>> My query times for something like 
>>>>>
>>>>> DELETE FROM CHILD_INFO WHERE CHILD_ID = 1
>>>>>
>>>>> is around 6 seconds with only two row entries CHILD_INFO, 6 row
>>>>> entries
>>>>> in
>>>>> PARENT_INFO and just one entry in ATTENDANCE_INFO!
>>>>>
>>>>> Hope this makes sense... 
>>>>>
>>>>> On Table A, you show Col_x_ID, but in your foreign key, you show
>>>>> Col_x.
>>>>> I'm
>>>>> going to assume that you meant Col_x_ID.
>>>>>
>>>>>  
>>>>>
>>>>> On Table A, do you have an index on Col_x_ID? Is Col_x_ID a unique ID?
>>>>> If
>>>>> so, is this the primary key for the table?
>>>>>
>>>>>  
>>>>>
>>>>> That would be the first thing to look at. If no index exists, my guess
>>>>> would
>>>>> be that you're doing a sequential table scan.
>>>>>
>>>>>  
>>>>>
>>>>> HTH
>>>>>
>>>>>  
>>>>>
>>>>> -Mike
>>>>>
>>>>>  
>>>>>
>>>>>  
>>>>>
>>>>>   _____  
>>>>>
>>>>> From: jay _ [mailto:iwudh8u@hotmail.com] 
>>>>> Sent: Monday, August 18, 2008 4:16 AM
>>>>> To: derby-user@db.apache.org
>>>>> Subject: memory usage for row delete cascade
>>>>>
>>>>>  
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have a database with three tables (Table_A, Table_B and Table_C).
>>>>> There
>>>>> is
>>>>> one identity column in Table_A which acts as a foreign key in Tables B
>>>>> and
>>>>> C
>>>>> with an ON DELETE CASCADE.
>>>>>
>>>>> Here is the table structure .... 
>>>>>
>>>>> Table A : Col_x_ID , Coly_Name, .... 
>>>>> Table B : Col_p_ID, Col_q_Name, .... Col_x ... FOREIGN KEY(Col_x)
>>>>> REFERENCES
>>>>> Table A (Col_x) ON DELETE CASCADE)
>>>>> Table B : Col_m_ID, Col_n_Name, .... Col_x ... FOREIGN KEY(Col_x)
>>>>> REFERENCES
>>>>> Table A (Col_x) ON DELETE CASCADE)
>>>>>
>>>>> When I now try to delete a row in Table_A, I experience a spike in CPU
>>>>> usage
>>>>> to almost 100% and also the memory  usage bumps from 40MB to 100MB.
>>>>> This
>>>>> remains sustained for a few seconds (like 5 seconds) causing the UI to
>>>>> almost freeze in my application. Even worse, it causes an Out of
>>>>> Memory
>>>>> Exception!
>>>>>
>>>>> I am pretty sure that something isn't quite right with what I am
>>>>> experiencing. I know row inserts and deletes are computationally
>>>>> expensive,
>>>>> but surely it has to be less expensive than what I am experiencing. 
>>>>>
>>>>> I'd appreciate if anyone can tell me how to optimise this or is there
>>>>> a
>>>>> more
>>>>> efficient method to delete entries with a cascade effect?
>>>>>
>>>>> I am running Derby version 10.4.1.3 on MacOSX with JRE 1.5 version 6.
>>>>>
>>>>> Thanks,
>>>>> Jay
>>>>>
>>>>>   _____  
>>>>>
>>>>> Get thousands of games on your PC, your mobile phone, and the web with
>>>>> WindowsR. Game with <http://clk.atdmt.com/MRT/go/108588800/direct/01/>
>>>>> Windows
>>>>>
>>>>>
>>>>>
>>>>>     
>>>>>         
>>>>   
>>>>       
>>>
>>>     
>>
>>   
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/memory-usage-for-row-delete-cascade-tp19028179p19039527.html
Sent from the Apache Derby Users mailing list archive at Nabble.com.