db-derby-user mailing list archives

From Geoff hendrey <geoff_hend...@yahoo.com>
Subject Re: Many background threads (rawStoreDaemon)
Date Mon, 04 May 2009 14:49:57 GMT
Regarding the comment:

"Don't you think you'd want to revisit the design of a derby system opening
up 1000 databases at a time? With respect to a centralized mail archive
system, you do have other design choices, like a centralized server..."

One of the primary reasons for selecting derby for my application was that I could conceivably
run hundreds, even thousands, of independent databases. Derby is fantastic (and unique) in
letting us create databases on the fly from Java, open them, shut them down, etc. We can
encrypt the physical files and keep them isolated from each other. This has many advantages
for an application service provider that hosts many customers. In fact, it's the primary
reason we chose derby.
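
To make the create/open/encrypt/shut down cycle concrete, here is a minimal sketch using embedded Derby connection URL attributes (create, dataEncryption, bootPassword, shutdown). The class name TenantDatabases, the database path "customers/acme" and the password are hypothetical, chosen only for illustration; it assumes derby.jar is on the classpath and simply reports an error otherwise.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical helper: builds embedded Derby URLs for one database per customer.
final class TenantDatabases {
    // Creates the database if it does not exist and encrypts its files.
    // Derby requires the boot password to be at least 8 characters long.
    static String createUrl(String dbName, String bootPassword) {
        return "jdbc:derby:" + dbName
                + ";create=true;dataEncryption=true;bootPassword=" + bootPassword;
    }

    // Shuts down a single database without stopping the whole Derby system.
    static String shutdownUrl(String dbName) {
        return "jdbc:derby:" + dbName + ";shutdown=true";
    }

    public static void main(String[] args) {
        String db = "customers/acme"; // hypothetical per-customer database path
        try (Connection c = DriverManager.getConnection(createUrl(db, "acme-secret-pw"))) {
            System.out.println("booted " + db + ", autocommit=" + c.getAutoCommit());
        } catch (SQLException e) {
            // Also reached when the Derby driver is not on the classpath.
            System.out.println("could not boot " + db + ": " + e.getMessage());
            return;
        }
        try {
            DriverManager.getConnection(shutdownUrl(db));
        } catch (SQLException e) {
            // Derby signals a successful single-database shutdown with an
            // exception whose SQLState is 08006.
            System.out.println("shutdown SQLState: " + e.getSQLState());
        }
    }
}
```

Each customer gets its own directory of database files, which is what keeps the databases isolated from each other.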

Last year at JavaOne I discussed this with Knut, and the only thing that seemed like a problem
was an issue with the page cache, which I was able to address via a config property. Therefore,
I'd say this request is quite in keeping with the nature of a small, flexible database: it is
essential to keep all elements of its footprint tiny (memory, disk, etc.). You may have seen
some posts from me in the past requesting minimization of the number of files created when
derby starts, but that is less important than memory and threads.
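
The message above does not name the config property. As an assumption, the sketch below uses derby.storage.pageCacheSize, the Derby property that sets the page cache size in pages (default 1000, minimum 40). The class name SmallPageCache is hypothetical, and the property has to be set before the Derby engine boots to take effect.

```java
// Hypothetical helper; the property name is an assumption about what was used.
final class SmallPageCache {
    static final String PAGE_CACHE_PROPERTY = "derby.storage.pageCacheSize";

    // Sets the page cache size (in pages) as a JVM system property.
    static void configure(int pages) {
        System.setProperty(PAGE_CACHE_PROPERTY, Integer.toString(pages));
    }

    public static void main(String[] args) {
        configure(40); // 40 pages is the documented minimum
        System.out.println(PAGE_CACHE_PROPERTY + "="
                + System.getProperty(PAGE_CACHE_PROPERTY));
    }
}
```

The same setting could go into a derby.properties file instead of being set from code.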

 -geoff
“XML? Too much like HTML. It'll never work on the Web!” 
-anonymous 





________________________________
From: Michael Segel <msegel@segel.com>
To: Derby Discussion <derby-user@db.apache.org>
Sent: Monday, May 4, 2009 6:50:49 AM
Subject: RE: Many background threads (rawStoreDaemon)



> -----Original Message-----
> From: Arnaud Masson [mailto:amasson@gmail.com]
> Sent: Monday, May 04, 2009 8:04 AM
> To: Derby Discussion
> Subject: Re: Many background threads (rawStoreDaemon)
> 
> derby@segel.com wrote:
> >
> >> -----Original Message-----
> >>
> >>>> I have an application that opens several small derby databases (using
> >>>> EmbeddedDriver).
> >>>> Most of these instances don't handle many requests.
> >>>> The problem is that each instance maintains its own thread
> >>>> "derby.rawStoreDaemon".
> >>>> Is there a way to use a thread pool shared between instances instead?
> >>>> (There is also a discussion about that here:
> >>>> http://osdir.com/ml/apache.db.derby.devel/2005-04/msg00093.html)
> >>>>
> >>> I am not aware of any progress having been made on this issue,
> >>> unfortunately.
> >>>
> >> If you think it should be addressed, filing a JIRA issue is a good first
> >> step: https://issues.apache.org/jira/browse/DERBY
> >>
> >> (I don't think there is one filed yet, is there?)
> >>
> >
> > Ok so if I understand this...
> >
> > A person wants to open multiple embedded copies of a database and then
> > thread pool the database server connections/controllers of the individual
> > database connections.
> >
> > Ok, yes there is some merit to this. Unfortunately, you're adding a lot of
> > weight to the code that means a larger footprint making it harder to embed.
> >
> >
> This is not about a connection pool.
> This is about the fact that each derby instance maintains a thread that
> is sleeping most of the time (rawStoreDaemon).
> It would be better for the footprint to free daemon threads when they
> are not used or to recycle them in a pool (like a JDK Executor).
> 

I think you misunderstand.

The footprint isn't the code running but the jar file being downloaded.

Look, I think everyone agrees that on the micro level, there are always
fixes that should go into the code.

My point is that on a macro level, a lot of small code enhancements add up.
You're increasing the footprint of the code base, which makes it more
difficult to embed into some applications. Derby was a lightweight
database. Things like raw partitions, fragmented tables, etc. weren't an
issue.

Is anyone thinking of these things on a macro level? And what direction do
you want to take Derby? Do you want to compete against MySQL or Postgres for
mindshare?

Do you want to build in shared-nothing features for distributed queries and
make derby a data warehouse engine?

My suggestion is to review derby and determine how you can satisfy both
needs.

Does that make sense?

-Mike

PS. The example: "The issue is that if a Derby system is
booted with 1,000 databases, then 1,000 background threads are created,
all just to handle background work. The to-do list item was intended to change
the DaemonFactory from being booted per-database to being booted
per-system. Thus a single background thread would handle the idle
requests. Then, I agree, the DaemonFactory could be enhanced to support
a handful of worker threads behind the interface, without the knowledge
of the caller."

Don't you think you'd want to revisit the design of a derby system opening
up 1000 databases at a time? With respect to a centralized mail archive
system, you do have other design choices, like a centralized server...
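
For readers following the thread, the per-system daemon idea quoted in the PS can be sketched with a plain JDK executor. This is not Derby's DaemonFactory: the class SharedDaemon and its subscribe method are hypothetical names, and the sketch only shows one daemon thread (or a handful) serving periodic background work for many databases.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a system-wide background worker shared by all databases.
final class SharedDaemon {
    private final ScheduledExecutorService worker;

    // 'threads' is the small, fixed number of worker threads behind the interface.
    SharedDaemon(int threads) {
        worker = Executors.newScheduledThreadPool(threads, r -> {
            Thread t = new Thread(r, "sharedStoreDaemon");
            t.setDaemon(true); // must not keep the JVM alive
            return t;
        });
    }

    // Each database registers its idle-time work instead of starting its own thread.
    ScheduledFuture<?> subscribe(Runnable backgroundWork, long periodMillis) {
        return worker.scheduleWithFixedDelay(
                backgroundWork, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        worker.shutdownNow();
    }

    public static void main(String[] args) throws InterruptedException {
        SharedDaemon daemon = new SharedDaemon(1);
        AtomicInteger runs = new AtomicInteger();
        // 1,000 "databases" share one background thread.
        for (int db = 0; db < 1000; db++) {
            daemon.subscribe(runs::incrementAndGet, 10);
        }
        Thread.sleep(200);
        daemon.shutdown();
        System.out.println("background runs so far: " + runs.get());
    }
}
```

Callers only see subscribe, so the number of threads behind it can change without their knowledge, which is the property the quoted text asks for.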