db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: [jira] Created: (DERBY-888) improve performance of page allocation
Date Sat, 28 Jan 2006 23:29:53 GMT
This first step does not go all the way to buffering with no
file system interaction.  The system still requests the space
from the file before allowing the insert.  In Java the way
you do this is to write the empty page to the file through the
OS.  What I am changing is that we used to also sync the empty
page.  This means the user will still get the normal immediate
feedback if it is his inserts that cause the filesystem to fill up.
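
A minimal sketch of the allocation step described above, not Derby's
actual code.  The class name, method name, and 4096-byte page size are
hypothetical; only the java.io calls are real.  It extends a file by
one zero-filled page and makes the sync optional, so the two behaviours
can be compared.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration; names and page size are assumptions.
class PageAllocSketch {
    static final int PAGE_SIZE = 4096; // assumed page size

    /**
     * Appends one zero-filled page to the file and returns its page number.
     * The write asks the OS for the space; the sync is optional.
     */
    static long allocatePage(RandomAccessFile file, boolean syncAfterWrite)
            throws IOException {
        long pageOffset = file.length();
        file.seek(pageOffset);
        // If the write fails (for example, the disk is full), the
        // IOException propagates to the caller doing the insert.
        file.write(new byte[PAGE_SIZE]);
        if (syncAfterWrite) {
            // Forces the page to the device; this is the costly step
            // that the proposed change skips for empty pages.
            file.getFD().sync();
        }
        return pageOffset / PAGE_SIZE;
    }
}
```

Calling allocatePage(file, false) corresponds to the proposed behaviour,
and allocatePage(file, true) to the old one.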

The current system already has the sync-every-8-pages optimization,
rather than syncing every page.  (I think it actually syncs every page
until there are 8 and then does 8 at a time - a leftover from
when it was important to conserve disk space for running on small
systems and many apps had tables of fewer than 8 pages.)  I considered
some sort of dynamic changing of the preallocate size, but it
seemed too complicated - and the bigger the preallocate size grew,
the more likely we would be to grow the file too much.
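
The preallocation policy as recalled above (one page at a time until
the file has 8 pages, then 8 at a time) can be sketched as follows.
This is an illustration of that description only; the class, constants,
and method names are hypothetical, and the real thresholds in Derby may
differ.

```java
// Hypothetical illustration of the policy described in the text.
class PreallocPolicySketch {
    static final int PREALLOC_THRESHOLD = 8; // assumed: pages before batching starts
    static final int PREALLOC_BATCH = 8;     // assumed: pages allocated per sync afterwards

    /** Number of pages to allocate (and sync once) given the current file size in pages. */
    static int pagesToAllocate(long currentPageCount) {
        return currentPageCount < PREALLOC_THRESHOLD ? 1 : PREALLOC_BATCH;
    }

    /** Counts how many syncs it takes to grow an empty file to at least targetPages. */
    static int syncsToReach(long targetPages) {
        long pages = 0;
        int syncs = 0;
        while (pages < targetPages) {
            pages += pagesToAllocate(pages);
            syncs++; // one sync per allocation step
        }
        return syncs;
    }
}
```

Under these assumptions a small table pays one sync per page, while a
large table pays roughly one sync per 8 pages.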

As you say, the sync every N seconds is like a checkpoint.  In
the current system a checkpoint syncs every file, so the sync
will happen at that point no matter what.

We did the sync for 2 reasons.  One was that we used to not be
able to handle redo recovery of version 0 of the page if the
system read "garbage" from the disk.  This was fixed as a
requirement of the recent rollforward recovery project.  The
system can now handle redo where it reads
garbage from disk while creating version 0 of the page, and
it also handles the case where it tries to read a page and
finds it needs to extend the file.

The second is that we guessed that some OS might require the
sync to ensure the space on disk.  There is no info one way or
the other in the JVM documentation, so it was just a guess on
our part.  My belief is that the write call we are doing will
force the OS to allocate the space to our file, and no other
file will be able to use that space, so the space is ours at
least until the OS or machine crashes.

I think in general users will almost never see out of space
during redo recovery.  I think it might take an OS crash,
on an OS with no filesystem logging, and a subsequent process
using up all the disk space on the machine before Derby gets
to run redo.

So the upside is that inserts go much faster.  The downside is that
in some rare cases (and maybe never on some/most OSes) the user
will see an out of disk space message during database boot that
tells him that he has to free up some disk space.  The system
will boot once the disk space is available.  This error exists
today if the disk is full and undo needs to write some CLRs
(compensation log records) - so the error isn't even new for Derby.



Bryan Pendleton wrote:
> Mike Matrigali (JIRA) wrote:
> 
>> ... the total time approaches very close to durability=test ...
> 
> 
> Wow! This is great; looks like a very big win. Cool!
> 
> I had two questions:
> 
> 1) It seems like a really common scenario would be:
>    - a single enormous "batch" application is trying to insert
>      many, many rows into a table.
>    - there's enough room in memory, so we just buffer up a bunch
>      of pages in the cache and let the application insert as it pleases.
>    - the application completes, and commits,
>    - then we discover there's not enough space on the disk.
> 
> Is this the problem you're trying to solve?
> 
> If so, I'm a little confused as to what the "external view" of the
> system will be -- how will the user know that the disk has become
> full and that space needs to be added?
> 
> 2) Is there any advantage that you can see to have some sort of
>    intermediate behavior in between the extremes of:
>    - always sync every freshly allocated page, and
>    - never sync freshly allocated pages
> 
> For example, is there any point in a "sync every N pages", or
> "sync every N seconds"? (I guess the latter is sort of like a checkpoint?)
> 
> thanks,
> 
> bryan
> 
> 
> 
> 

