db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Question/observations on purged rows and insert rates
Date Fri, 06 Aug 2010 22:25:54 GMT
Thanks for the detail.  My guess is that the issue is log I/O to the log 
device which in your case is also competing with the database I/O.
When you do the delete it is likely also killing your cache as it needs
to read in all the old data pages to do the delete.

You can probably make even the new system more scalable if you
can build in something that would group the inserts into more than one
per commit.  I would not be surprised if you could get at least a 10 
times more inserts per second if you could group 10 per commit.  This
will help even when you change to the partitioned structure as you will
be able to handle way more inserts as you add even more devices.

I think you could really optimize the insert speed if you could funnel
all the inserts to one thread and have it batch the inserts and commit
them something like once a second.  Since you are not cpu bound at all
I don't think the one thread would be a bottle neck.

I do agree though that the partitioned approach will likely be the most
scalable as the delete part will have almost no overhead in this scheme,
and is guaranteed to return all the space to the OS.

Bergquist, Brett wrote:
> The data is actually coming from test measurements from devices in a network.  For testing
purposes, I was able to change the firmware in a device to generate more tests per second
to up the rate.  So there are more things going on than just database work:
> 	- thread receiving test measurements from many devices and decoding data
> 	- passing data off to a Java Resource Connector (JCA Resource Adapter) to an endpoint
(Message Driven Bean)
> 	- inserting the data
> So yes, the inserts are being committed for each measurement.  In the real world, we
are seeing about 8 inserts/second continuous every second of every day.  This is going up
however as more devices are added to the network.  We have gone from about 100K measurements
to almost 800K measurements in the last 6 months, so growth is important.
> The purging of the data is kicked off by a Quartz Job that purges out measurement results
that are older than 30 days (configurable).  This is done in chunks of 1000 results per transaction
commit to keep the transaction log size down but still have somewhat good performance.  Right
now this takes about 2 hours of time with the inserts continuing to come in.  One problem
is that the deletion is effecting the insertion rate.  The deletion does not really have to
be done "as fast as possible", only that it be done by before the next day.  So I have done
some changes to allow both the deletion to occur without affecting the insertion rate enough
to where it becomes a problem.  This will hold us over for a few months.
> In tuning this, I saw this issue that the insertion rate stays low long after the deletion
is over and done with.  I can tell by the CPU usage and the disk I/O usage as I know the pattern
of normal insertion.
> While this has worked okay for the last few years, my model is flawed and a better usage
will be partition the data.
> The application is running at a customer site and the system is an old Sun V480 running
Solaris 10.  4 CPU's and 8Gb of memory.  The system is not CPU bound by any means as it is
usually about 70% idle.  It might start to be I/O bound during the deletion phase as the disks
are not that large or current.  I know I should move the transaction log onto a second disk,
but the up until just recently the complexity of the setup outweighed the increase in performance.
 This is pretty much a headless system that is never looked at as long as it works correctly.
 No sysadmin, no DBA looking at the system.  So up until the recent dramatic increase of the
number of devices in the network, Derby has worked very well.
> Again, I appreciate the feedback.  I will look next week to write a standalone Java application
that will simulate this setup and see what I find.
> Brett
> -----Original Message-----
> From: Mike Matrigali [mailto:mikem_app@sbcglobal.net] 
> Sent: Friday, August 06, 2010 1:57 PM
> To: derby-dev@db.apache.org
> Subject: Re: Question/observations on purged rows and insert rates
> Bergquist, Brett wrote:
>> Also after testing we are seeing the following. With a database with no 
>> deleted rows, my test application is inserting about 150 records/second 
>> into a table.  I let this run for about 2 million records and the insert 
>> rate is consistent.  Now I purge out 1.5 million records and run the 
>> test again.   The insert rate is now about 25 records/second.  Running 
>> the above compress with the PURGE_ROWS option and rerun the test and 
>> still about 25 records/second.  Run full SYSCS_UTIL.SYSCS_COMPRESS_TABLE 
>> and rerun the test and the insert rate is back to 150 records/second.
> Both of these insert rates seem really slow.  I am going to guess
> that you do one commit per insert, is that right?

View raw message