db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Question on whether to use multiple databases/multiple network servers or not
Date Thu, 22 Mar 2012 00:25:00 GMT
Bergquist, Brett wrote:
> We have a system in production that has the following characteristics:
> · Provisioning data for various network devices.  There is a 
> large number of tables and rows for each device, but the data changes 
> infrequently.  Changes have to be responsive, though, as this data is 
> accessed by a human with a user interface.
> · Performance data that is being inserted into the database at 
> about 6.5M records per day.  Queries are also done on this data for 15 
> minute intervals and also every 4 hours.  The inserts are non-stop and 
> the queries are periodic.  The inserts need to be responsive as this 
> data is being generated by network devices and needs to keep up.
> · There is one database that contains both kinds of data.
> We are running into a performance problem particularly with provisioning 
> data.   Without the performance data being inserted, the provisioning 
> changes are performing okay but these are affected a great deal when the 
> performance data is being inserted at such a high rate.

Is the provisioning problem with read-only transactions, or do they
write as well?  My interpretation is that the tables used by provisioning
are all different from the ones the performance data is inserted into.
> There are enough connections to the database engine.
> The system is an Oracle M5000 with 32 processors and 32 GB of memory.    
> Looking at CPU utilization, the system is about 10% utilized.    It 
> appears that the system is not I/O bound as of yet.
Can you say how many threads provisioning and performance are each
using?  Derby will not do much to break up a single connection's work
across multiple threads.  So a single inserter may be CPU bound, but
the machine as a whole will look only 1/32 utilized, and Derby will not
go faster for that connection.  In general Derby does a good job of
giving each incoming connection its own thread and running as many of
them in parallel as possible, as long as there is no database lock
contention.
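
To illustrate the point about one thread per connection, here is a minimal
sketch of an inserter that splits its rows across several worker threads,
each with its own JDBC connection. It assumes the insert load can be split
this way at all. The table PERF_SAMPLE, its columns, and the class and
method names are hypothetical and are not taken from the system described
in this thread.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelInserter {

    /** Splits rows round-robin into one slice per worker. */
    static <T> List<List<T>> partition(List<T> rows, int workers) {
        List<List<T>> slices = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            slices.get(i % workers).add(rows.get(i));
        }
        return slices;
    }

    /** Inserts one slice on its own connection and returns the slice size. */
    static int insertSlice(String jdbcUrl, List<long[]> slice) throws SQLException {
        // Hypothetical table and column names, for illustration only.
        String sql = "INSERT INTO PERF_SAMPLE (DEVICE_ID, SAMPLE_VALUE) VALUES (?, ?)";
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            conn.setAutoCommit(false);   // one commit per slice, not per row
            for (long[] row : slice) {
                ps.setLong(1, row[0]);
                ps.setLong(2, row[1]);
                ps.addBatch();
            }
            ps.executeBatch();
            conn.commit();
            return slice.size();
        }
    }

    /** Runs one insertSlice task per worker thread and sums the row counts. */
    static int insertInParallel(String jdbcUrl, List<long[]> rows, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<long[]> slice : partition(rows, workers)) {
                Callable<Integer> task = () -> insertSlice(jdbcUrl, slice);
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();   // rethrows any worker failure
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

Whether this helps depends on the inserters not contending for the same
locks, so it is worth measuring with 1, 2, 4, ... workers before relying
on it.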

What is the disk situation?  One disk, multiple disks, maybe multiple 
disks presented as a single disk?  When talking about moving to multiple
databases, it would likely make a lot of sense if you could spread the
I/O across multiple disks, putting one database on each.  Better still is
if the OS just handles this by presenting multiple disks as one.
> I was wondering if it would make sense to separate out the performance 
> data into its own database and potentially its own JVM through a second 
> Network Server running.    This will lead to some complexities when 
> trying to correlate the performance data back with the provisioning data 
> when needed (now two separate databases).    I was wondering if there are 
> any thoughts on whether this might help separate any contention in the 
> single database that might exist and allow better performance for the 
> provisioning information.
Any chance you could easily prototype this on a test machine first, to
verify that it removes the bottleneck?

Some bottlenecks in a single database include:
1 log file, so writes are somewhat blocked by other writes, but there is
software to do "group" commit to optimize this.
1 background processing thread, so background work can become a
bottleneck; this is mostly an issue if you are doing lots of deletes or 
updates of key fields in indexes.

1 disk for data per database, and an optional second disk for the log.
This is an obvious bottleneck if your config has multiple disks that
Derby is not

There are other caches and resources shared per database, but I would 
be surprised if they affected throughput with so much CPU available: 
query cache, database cache, open file cache, JVM garbage collector, ...
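
As a rough illustration of the "1 disk for data, optional second disk for
log" point, the sketch below builds Derby connection URLs that create each
database on its own disk and, optionally, place its log on another disk
through the logDevice attribute. The URLs are shown in embedded form, and
all directory paths and the DerbyUrls helper are hypothetical. As far as I
know logDevice only takes effect when the database is created.

```java
final class DerbyUrls {

    /**
     * Builds an embedded-style Derby URL that creates the database in dbDir.
     * If logDir is non-null, the log is placed there via logDevice.
     */
    static String url(String dbDir, String logDir) {
        StringBuilder sb = new StringBuilder("jdbc:derby:")
                .append(dbDir)
                .append(";create=true");
        if (logDir != null) {
            sb.append(";logDevice=").append(logDir);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical layout: one database per disk, provisioning log on a third disk.
        System.out.println(url("/disk1/provisioning", "/disk3/provisioning-log"));
        System.out.println(url("/disk2/performance", null));
    }
}
```

With a layout like this the provisioning and performance databases would
each get their own log file and background thread, at the cost of no
longer being able to join across the two in a single query.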
> I would almost like the provisioning database access to have 
> higher priority than the performance data, since it is infrequent but 
> needs to be responsive.
> Any thought would be greatly appreciated.
> Brett
