db-derby-dev mailing list archives

From Suresh Thalamati <suresh.thalam...@gmail.com>
Subject Re: [jira] Created: (DERBY-888) improve performance of page allocation
Date Mon, 30 Jan 2006 19:43:10 GMT
I am not sure rollforward recovery fixes a page if there is garbage
on the page. I think there is a small difference between
rollforward from backup and the not-syncing-on-allocation case you are
trying to handle. During rollforward recovery from a backup, redo
happens on the container from the backup. If a page is not found
during redo, it means the page was not allocated before, so redo just
creates a new page and does a sync. I think rollforward recovery
will never see garbage or an old valid page.
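
To make the redo behavior described above concrete, here is a minimal sketch, not Derby's actual code. The class name, the method name, and the 4096-byte page size are all assumptions made up for illustration. It only shows the shape of the logic: if the page is beyond the end of the container file, create it as an empty page and sync.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical sketch; none of these names come from Derby.
class RedoAllocSketch {
    static final int PAGE_SIZE = 4096; // assumed page size, for illustration only

    /**
     * Redo of a page allocation. If the page lies beyond the end of the
     * container file it was never allocated there, so create it as an
     * empty page and sync. Returns true if the page had to be created.
     */
    static boolean redoAllocate(RandomAccessFile container, long pageNumber)
            throws IOException {
        long offset = pageNumber * PAGE_SIZE;
        if (offset + PAGE_SIZE <= container.length()) {
            return false; // page already exists in the file; nothing to create
        }
        container.seek(offset);
        container.write(new byte[PAGE_SIZE]); // write an empty (zeroed) page
        container.getFD().sync();             // force the new page to disk
        return true;
    }
}
```

Because the container restored from a backup only ever grows through this path, every page redo reads was either in the backup or was created and synced by redo itself.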

Without the sync on allocation, how do we make sure recovery never sees
a valid old page on a new allocation, in case some OS or hardware does
not zero out a new page given to a file when that page was used earlier
by the same file or some other file? I think most general-purpose
operating systems will never give the user an old page on a new page
allocation to a file. But I am not sure how it works on small devices
with flash memory, etc.
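
A small sketch of why the old-valid-page case worries me more than garbage. It assumes a made-up page layout in which the last 8 bytes of a page hold a CRC32 of the rest. That layout, the page size, and all names here are assumptions for illustration, not a description of Derby's page format. Garbage fails the check, but a stale page that was once written correctly passes it.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical sketch; the layout and names are assumptions, not Derby's.
class PageCheckSketch {
    static final int PAGE_SIZE = 4096;          // assumed page size
    static final int BODY_SIZE = PAGE_SIZE - 8; // assumed: last 8 bytes hold the checksum

    enum State { ZEROED, GARBAGE, LOOKS_VALID }

    /** CRC32 over the page body (everything except the trailing 8 bytes). */
    static long checksum(byte[] page) {
        CRC32 crc = new CRC32();
        crc.update(page, 0, BODY_SIZE);
        return crc.getValue();
    }

    /** Stamps the body checksum into the last 8 bytes of the page. */
    static void seal(byte[] page) {
        ByteBuffer.wrap(page).putLong(BODY_SIZE, checksum(page));
    }

    /** Classifies what recovery might find where a new page was allocated. */
    static State classify(byte[] page) {
        boolean allZero = true;
        for (byte b : page) {
            if (b != 0) { allZero = false; break; }
        }
        if (allZero) return State.ZEROED;
        long stored = ByteBuffer.wrap(page).getLong(BODY_SIZE);
        return stored == checksum(page) ? State.LOOKS_VALID : State.GARBAGE;
    }
}
```

Under these assumptions a checksum alone cannot tell a stale page from a current one, so something else (for example the page version compared against the log) would have to catch it.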


Thanks
-suresht

Mike Matrigali wrote:
> This first step is not going all the way to buffering with no
> file system interaction.  The system is still going to request
> the space from the file before allowing the insert.  In Java
> the way you do this is to write to the file through the OS.  What
> I am changing is that we used to sync the empty page and no longer
> will.  This means the user will get the normal immediate feedback
> if it is his inserts that cause the filesystem to fill up.
> 
> The current system already has the sync-every-8-pages optimization,
> rather than syncing every page.  (I think it actually syncs every page
> until there are 8 and then does 8 at a time - a leftover from
> when it was important to conserve disk space for running on small
> systems and many apps had tables of fewer than 8 pages.)  I considered
> some sort of dynamic changing of the preallocate size, but it
> seemed too complicated - and the bigger the preallocate grew
> the more likely we would grow the file TOO much.
> 
> As you say, the sync every N seconds is like a checkpoint.  In
> the current system a checkpoint will sync every file, so the sync
> will happen at that point no matter what.
> 
> We did the sync for 2 reasons.  One was that we used to be
> unable to handle redo recovery of version 0 of the page if the
> system read "garbage" from the disk.  This was fixed to meet the
> requirements of the recent rollforward recovery project.  The
> system can now handle redo where it reads
> garbage from disk while creating version 0 of the page, and
> it also handles the case where it tries to read a page and finds
> it needs to extend the file.
> 
> The second is that we guessed that some OS might require the
> sync to ensure the space on disk.  There is no info one way or
> the other in the JVM documentation so it was just a guess on
> our part.  My belief is that the write call we are doing will
> force the OS to allocate the space to our file, and no other
> file will be able to use that space, so the space is ours at
> least until the OS/machine crashes.
> 
> I think in general users will almost never see out of space
> during redo recovery.  I think it might take an OS crash,
> on an OS with no filesystem logging, and a subsequent process
> using up all the disk space on the machine before Derby gets
> to run redo.
> 
> So the upside is inserts go much faster.  The downside is that
> in some rare (and maybe on some/most OSes never) cases the user
> will see an out of disk space message during database boot that
> tells him that he has to free up some disk space.  The system
> will boot once the disk space is available.  This error exists
> today if the disk is full, and undo needs to write some CLRs -
> so the error isn't even new for Derby.
> 
> 
> 
> Bryan Pendleton wrote:
> 
>> Mike Matrigali (JIRA) wrote:
>>
>>> ... the total time approaches very close to durability=test ...
>>
>>
>>
>> Wow! This is great; looks like a very big win. Cool!
>>
>> I had two questions:
>>
>> 1) It seems like a really common scenario would be:
>>    - a single enormous "batch" application is trying to insert
>>      many, many rows into a table.
>>    - there's enough room in memory, so we just buffer up a bunch
>>      of pages in the cache and let the application insert as it pleases.
>>    - the application completes, and commits,
>>    - then we discover there's not enough space on the disk.
>>
>> Is this the problem you're trying to solve?
>>
>> If so, I'm a little confused as to what the "external view" of the
>> system will be -- how will the user know that the disk has become
>> full and that space needs to be added?
>>
>> 2) Is there any advantage that you can see to have some sort of
>>    intermediate behavior in between the extremes of:
>>    - always sync every freshly allocated page, and
>>    - never sync freshly allocated pages
>>
>> For example, is there any point in a "sync every N pages", or
>> "sync every N seconds"? (I guess the latter is sort of like a 
>> checkpoint?)
>>
>> thanks,
>>
>> bryan
>>
>>
>>
>>
> 
> 

