db-derby-user mailing list archives

From Kristian Waagan <Kristian.Waa...@Sun.COM>
Subject Re: Derby in-memory back end - where to go next?
Date Fri, 25 Sep 2009 09:45:53 GMT
Rick Hillegas wrote:
> Hi Lily,
>
> Some responses inline...
>
> Lily Wei wrote:
>>
>> Hi Rick:
>>
>>      I have some follow up questions.
>> Middle-tier caching, monitoring transient data streams and test rigs 
>> totally make sense.
>>
>> Do you see any benchmarks in terms of how Derby helps these applications?
>>
> I don't think we've published any figures on the performance boost you 
> get from running in memory. My anecdotal recollection is that you see 
> a significant boost once you've gotten past database creation. 
> Kristian has done the most extensive testing and may have some figures 
> that he can share. Unfortunately, he suffered an accident earlier this 
> week and is up on the blocks for a while.

Hello Rick and Lily,

The performance benefit you'll see with the in-memory back end is highly 
dependent on the load and the underlying disk subsystem.
For write-intensive loads the boost can be orders of magnitude.
For read-intensive loads the boost can be close to zero.

If you have a read-only database, it may be better in some cases to keep 
the database on disk, maximize the page cache size and then prime the 
cache (pulling all pages into the cache).
The downside of using the in-memory back end in such a scenario is that 
some of the data will be stored twice: once in the "virtual in-memory 
file system" and once in the page cache. For the same reason, you should 
tune the page cache size according to your amount of data and heap 
size. Minimizing the page cache (i.e. allowing only 40 pages) to avoid 
the "data duplication" problem is not a good idea for optimal performance.
For some more information about the effects of page cache size and page 
size, see [1]. It is really a comparison between two implementations of 
an in-memory back end, but closer to the end of the document there are 
some relevant experiments.
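
As a rough illustration of the sizing trade-off described above, here is a 
minimal Java sketch. It assumes Derby's default page size of 4096 bytes and 
uses the derby.storage.pageCacheSize property, which must be set before the 
engine boots. The class and method names are hypothetical, and the estimate 
ignores per-page and JVM overhead, so treat the result as a lower bound 
rather than a measurement.

```java
public class PageCacheEstimate {

    // Upper bound on the bytes held by the page cache: pages * page size.
    static long pageCacheBytes(int pages, int pageSizeBytes) {
        return (long) pages * pageSizeBytes;
    }

    // Rough heap need for an in-memory database: the data itself plus the
    // part of it that can be duplicated in the page cache. The duplicated
    // part cannot exceed either the data size or the cache size.
    static long worstCaseHeapBytes(long dataBytes, int pages, int pageSizeBytes) {
        long duplicated = Math.min(dataBytes, pageCacheBytes(pages, pageSizeBytes));
        return dataBytes + duplicated;
    }

    public static void main(String[] args) {
        int pages = 4000;          // assumed value for illustration only
        int pageSizeBytes = 4096;  // Derby's default page size

        // Must be set before the Derby engine is booted in this JVM.
        System.setProperty("derby.storage.pageCacheSize", Integer.toString(pages));

        long dataBytes = 100L * 1024 * 1024; // assumed 100 MB of table data
        System.out.println("Estimated heap need in bytes: "
                + worstCaseHeapBytes(dataBytes, pages, pageSizeBytes));
    }
}
```

With these assumed numbers the cache can duplicate about 16 MB of the 100 MB 
of data, which is the amount of extra heap to budget for.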

Unfortunately I'm unable to find the numbers I had comparing the disk 
based back end with the in-memory back end.
If anyone wants some hard numbers, they can try running the various 
performance clients found in the source code repository (under 
trunk/testing/.../perf/clients). The simplest ones are the single record 
operation clients and the bank_tx load.
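
For a quick first impression before setting up those clients, a hand-rolled 
timing loop along the following lines may be enough. This is only a sketch 
and not one of the clients from the repository: the class name, table name 
and row count are made up for illustration, and it needs derby.jar on the 
classpath to actually connect. Auto-commit is left on, so every insert is 
its own transaction and the disk-based run pays for a log flush per row.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;

public class InsertTiming {

    // Convert an operation count and elapsed nanoseconds to operations per second.
    static double opsPerSecond(long ops, long elapsedNanos) {
        return ops * 1_000_000_000.0 / elapsedNanos;
    }

    // Create a small table, insert `rows` single-row records with
    // auto-commit on, and return the elapsed time of the inserts in nanoseconds.
    static long timeInserts(Connection conn, int rows) throws SQLException {
        try (Statement s = conn.createStatement()) {
            s.executeUpdate("CREATE TABLE perf_t (id INT PRIMARY KEY, val VARCHAR(100))");
        }
        long start = System.nanoTime();
        try (PreparedStatement ps = conn.prepareStatement("INSERT INTO perf_t VALUES (?, ?)")) {
            for (int i = 0; i < rows; i++) {
                ps.setInt(1, i);
                ps.setString(2, "row-" + i);
                ps.executeUpdate();
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 1) {
            System.out.println("usage: InsertTiming <jdbc-url>");
            return;
        }
        int rows = 10_000; // assumed row count for illustration
        try (Connection conn = DriverManager.getConnection(args[0])) {
            long nanos = timeInserts(conn, rows);
            System.out.printf("%d inserts: %.0f ops/s%n", rows, opsPerSecond(rows, nanos));
        }
    }
}
```

Run it once with jdbc:derby:memory:perfDB;create=true and once with 
jdbc:derby:perfDB;create=true, on the same machine and with the same JVM 
settings, and compare the two figures.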

In my opinion, the primary use cases for the current in-memory back end 
are testing and development. In the next release it may be better suited 
for storing purely transient data in a production environment as well 
(with a proper delete mechanism and maybe a size limit feature).
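
For the testing and development case, opening an in-memory database is 
mostly a matter of using the memory subsubprotocol in the JDBC URL. A 
minimal sketch, with a hypothetical helper class and database name, and 
assuming derby.jar (10.5 or later) is on the classpath:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class InMemoryUrl {

    // Build a connection URL that creates the named in-memory database
    // if it does not already exist in this JVM.
    static String createUrl(String dbName) {
        return "jdbc:derby:memory:" + dbName + ";create=true";
    }

    // Open a connection to the in-memory database. This throws an
    // SQLException if no Derby driver is available on the classpath.
    static Connection open(String dbName) throws SQLException {
        return DriverManager.getConnection(createUrl(dbName));
    }
}
```

A test fixture could call InMemoryUrl.open("testDB") in its setup method. 
Note that the database lives for as long as the JVM does unless it is 
explicitly removed.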


-- 
Kristian

[1] 
https://issues.apache.org/jira/secure/attachment/12400859/derby-646-performance_comparison_1a.txt
>>
>> In aspects such as performance, total memory consumption or reduced 
>> hardware cost?
>>
>>  
>>
>>      Do you see other embedded databases that also provide a solution 
>> on the stripped-down CDC VM?
>>
> I don't think that H2 or HSQLDB run on CDC.
>
> Regards,
> -Rick
>>
>> Do you have any data point for Derby?
>>
>>  
>>
>> Thank you so much for shedding some light for people like me,
>>
>> Lily
>>
>>
>>
>> *From:* Rick Hillegas <Richard.Hillegas@Sun.COM>
>> *To:* Derby Discussion <derby-user@db.apache.org>
>> *Sent:* Wednesday, September 9, 2009 2:01:01 PM
>> *Subject:* Re: Derby in-memory back end - where to go next?
>>
>> Hi Lily,
>>
>> Some comments inline...
>>
>> Lily Wei wrote:
>> >
>> > Hi Rick:
>> >
>> >      Thank you so much for sharing the information with the group.
>> >
>> > > * It would be great to be able to bound the growth of the
>> > > in-memory db
>> >
>> > Is there a trend toward needing in-memory databases in the Java world?
>> >
>> I find that this consistently generates a lot of discussion whenever 
>> I talk about 10.5 features with users.
>> >
>> > Is it mainly for applications, i.e. ERP, CRM, SRM?
>> >
>> The top use-cases which keep coming up are:
>>
>> o Middle-tier caching -- here people use Derby in the middle tier in 
>> order to scale out access to a big back end like Oracle or DB2. 
>> Running in memory makes this perform even better.
>>
>> o Monitoring transient data streams - here you slice and dice the 
>> data while the monitoring application is up but you don't necessarily 
>> need to keep the data after the monitoring session ends.
>>
>> o Test rigs -- here you can use Derby on your laptop to run 
>> regression tests against an application which will run in production 
>> on a big back end like Oracle or DB2; the rig is lightweight and 
>> cleans up after itself.
>> >
>> > What kind of solution can Java provide for smart devices like the
>> > iPhone, RIM or Palm? I.e., will Java play well with Windows Mobile or
>> > Android?
>> >
>> Our small device story is our ability to run on the stripped-down CDC 
>> VM. Being able to run completely in memory gives this story extra 
>> appeal too.
>>
>> Thanks,
>> -Rick
>> >
>> > Thank you in advance for shedding some light for us,
>> >
>> > Lily
>> >
>> >
>> > *From:* Rick Hillegas <Richard.Hillegas@Sun.COM>
>> > *To:* Derby Discussion <derby-user@db.apache.org>
>> > *Sent:* Wednesday, September 9, 2009 11:13:05 AM
>> > *Subject:* Re: Derby in-memory back end - where to go next?
>> >
>> > Hi Kristian,
>> >
>> > Here's another piece of feedback: Last night I gave an overview of
>> > Derby to the San Francisco Java User's Group. A developer asked
>> > whether the growth of the in-memory database could be bounded. He had
>> > a use case which we didn't explore in depth but which involved
>> > periodically truncating the database. I asked him to bring his
>> > requirements to the Derby user list so that we could feed them into
>> > your spec effort. Here are my takeaways:
>> >
>> > * It would be great to be able to bound the growth of the in-memory db
>> >
>> > * It would be great if the memory occupied by deleted records could
>> > be released
>> >
>> > Thanks,
>> > -Rick
>> >
>> > Kristian Waagan wrote:
>> > > Hello,
>> > >
>> > > In Derby 10.5 an in-memory back end, or storage engine, was
>> > > included. It stores all the data in main memory, with the exception
>> > > of derby.log. If this is news to you, and you want a quick intro to
>> > > it, see [1] and [2].
>> > >
>> > > I'm trying to gather some feedback on whether the current
>> > > implementation is found acceptable, or if there are additional
>> > > features people would like to see. I expect some wishes to emerge,
>> > > and I plan to record these on the wiki page [1]. The page can then be
>> > > used to guide further work in this area.
>> > >
>> > > To start the discussion, I'll list some potential features and
>> > > tasks. Feel free to comment on any one of them either by replying to
>> > > this thread, or by adding your comments to [1]. It can be a +1 or -1
>> > > on the feature itself, a suggestion for a new feature, or details on
>> > > what a feature should look like.
>> > >
>> > >
>> > > * Documentation
>> > > Must at least document the JDBC subsubprotocol, and also explain
>> > > how to delete in-memory databases.
>> > > If new features are added, these must be documented as well.
>> > >
>> > > * Deletion of in-memory databases
>> > > Currently the only ways to delete an in-memory database are to
>> > > restart the JVM or use a static method that isn't part of Derby's
>> > > public API. A proper mechanism for deletion should be added.
>> > >
>> > > * Automatic deletion on database shutdown (or when last
>> > > connection disconnects)
>> > >
>> > > * "Anonymous in-memory databases"
>> > > A database which only the connection creating it can access, and
>> > > when the connection goes away the database goes away.
>> > >
>> > > * Automatic persistence
>> > > The database could be persisted to disk automatically based on
>> > > certain criteria. The most obvious ones are perhaps on a fixed
>> > > interval and on JVM shutdown.
>> > >
>> > > * Monitoring
>> > > The most basic information is how many in-memory databases exist
>> > > in the current JVM, and how big they are. How should this information
>> > > be presented? Should it be available to anyone having a connection to
>> > > the current JVM?
>> > >
>> > > * No derby.log
>> > > Include a class in Derby that will discard everything written to
>> > > derby.log.
>> > >
>> > >
>> > > Thank you for your feedback,
>> >
>> >
>>
>>
>

