db-derby-dev mailing list archives

From Øystein Grøvlen <Oystein.Grov...@Sun.COM>
Subject Re: Discussion of incremental checkpointing----Added some new content
Date Tue, 14 Feb 2006 12:14:33 GMT
Mike Matrigali wrote:
> Øystein Grøvlen wrote:
>> Mike Matrigali wrote:
>>> I think my main issue is that I don't see that it is important to
>>> optimize writing the cached dirty data.  Especially since the order
>>> that you are proposing writing the dirty data is exactly the wrong
>>> order to the current cache performance goal to minimize the number of
>>> total I/O's the system is going to do (a page that is the oldest
>>> written exists in a busy cache most likely because it has been
>>> written many times -
>>> otherwise the standard background I/O thread would have written
>>> it already).
>> I think your logic is flawed if you are talking about checkpointing 
>> (and not the background writer).  If you want to guarantee a certain 
>> recovery time, you will need to write the oldest page.  Otherwise, you 
>> will not be able to advance the starting point for recovery.  This 
>> approach to checkpointing should reduce the number of I/Os since you 
>> are not writing a busy page until it is absolutely necessary.  The 
>> current checkpointing writes a lot of pages which does not do anything 
>> to make it possible to garbage-collect log. Those pages should be left 
>> to the background writer, which can use its own criteria for which 
>> pages are optimal to write.
> I guess I was not clear, I agree with you:
>     checkpoint - wants to write oldest page, I agree this is necessary 
> to move the redo low water mark.
>     background - wants to write least used, probably not oldest page.
> What pages are you talking about that the current checkpoint process 
> writes that are not necessary.  Are they the ones that go from clean
> to dirty after the checkpoint starts?  It seems that in current 
> checkpoint all pages dirty at the start are necessary to move the redo
> low water mark.

It is not necessary to write all pages to be able to move the redo low 
water mark forward.  Writing all pages is only necessary to move the redo 
low water mark all the way up to the new checkpoint log record.  However, 
that will probably give a much lower recovery time than what we are 
aiming for.  Hence, we can skip writing the newer pages and still be 
within the requested recovery time.
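
To illustrate, here is a minimal Java sketch of the selection I have in 
mind.  It is not Derby code; the class, the method names and the fields 
are all hypothetical.  It also assumes that log positions are plain byte 
offsets and that the recovery-time goal has already been translated into 
a maximum amount of log to redo.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class IncrementalCheckpointSketch {

    /** Hypothetical dirty-page entry: page id plus the log position of the
     *  first change that made the page dirty. */
    public static final class DirtyPage {
        public final long pageId;
        public final long firstDirtyLsn;

        public DirtyPage(long pageId, long firstDirtyLsn) {
            this.pageId = pageId;
            this.firstDirtyLsn = firstDirtyLsn;
        }
    }

    /**
     * Picks the pages a checkpoint must write so that redo never has to
     * start more than maxRedoBytes behind currentLsn.  Pages dirtied after
     * that point are left to the background writer.
     */
    public static List<DirtyPage> pagesToWrite(List<DirtyPage> dirty,
                                               long currentLsn,
                                               long maxRedoBytes) {
        long target = currentLsn - maxRedoBytes; // desired redo low water mark
        Comparator<DirtyPage> oldestFirst =
                Comparator.comparingLong(p -> p.firstDirtyLsn);
        List<DirtyPage> sorted = new ArrayList<>(dirty);
        sorted.sort(oldestFirst);

        List<DirtyPage> toWrite = new ArrayList<>();
        for (DirtyPage p : sorted) {
            if (p.firstDirtyLsn >= target) {
                break; // everything from here on is new enough to skip
            }
            toWrite.add(p);
        }
        return toWrite;
    }

    /** The redo low water mark is the oldest first-dirty position among the
     *  pages that are still dirty, or currentLsn if none are. */
    public static long redoLowWaterMark(List<DirtyPage> stillDirty,
                                        long currentLsn) {
        long mark = currentLsn;
        for (DirtyPage p : stillDirty) {
            mark = Math.min(mark, p.firstDirtyLsn);
        }
        return mark;
    }
}
```

For example, with pages first dirtied at positions 100, 500 and 900, a 
current log position of 1000 and a redo budget of 600, only the page 
dirtied at 100 is written, and the redo low water mark moves to 500.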


>> I think we SHOULD sync for every I/O, but not the way we do today.  By 
>> opening the files with "rwd", we should be able to do this pretty 
>> efficiently already today.  (At least on some systems.  I am not sure 
>> about non-POSIX systems like windows.)  Syncing for every I/O gives us 
>> much more control over the I/O, and we will not be vulnerable to 
>> queuing effects that we do not control.
> Do you think we should sync for every I/O in the non-checkpoint case 
> also.  The case I am most interested in, is where a user transaction
> needs to wait for a page in the cache and the only way to give that
> page is by writing another page in the cache out.  Currently this write
> is async, are you proposing to change this to a sync write?

This scenario should be very rare.  If it is not rare, async writing will 
probably just lead you into trouble over time, since you will allow user 
threads to proceed at a rate that the file system will not be able to 
sustain in the long run.  Also, see my reply to Suresh where I discuss a 
way this could be handled so it is still async with respect to user threads.
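
As for the "rwd" mode mentioned in the quoted text above, this is roughly 
what I mean.  It is only a sketch: the class name, the page size and the 
helper methods are made up for illustration and are not Derby's container 
code.  The one real API involved is java.io.RandomAccessFile, whose "rwd" 
mode requires every update to the file's content to be written 
synchronously to the underlying storage device.  How efficient that is 
will depend on the platform.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

public class SyncPageWriteSketch {

    public static final int PAGE_SIZE = 4096; // hypothetical page size

    /** Writes one page; with a file opened in "rwd" mode the call returns
     *  only after the content update has been handed to the device. */
    public static void writePage(RandomAccessFile container, long pageNumber,
                                 byte[] page) throws IOException {
        container.seek(pageNumber * PAGE_SIZE);
        container.write(page, 0, PAGE_SIZE);
    }

    /** Reads one page back from the container file. */
    public static byte[] readPage(RandomAccessFile container, long pageNumber)
            throws IOException {
        byte[] page = new byte[PAGE_SIZE];
        container.seek(pageNumber * PAGE_SIZE);
        container.readFully(page);
        return page;
    }

    /** Small demonstration: write page 2 of a temporary file in "rwd" mode
     *  and read it back. */
    public static byte[] roundTrip(byte fill) {
        try {
            File f = File.createTempFile("container", ".dat");
            try (RandomAccessFile container = new RandomAccessFile(f, "rwd")) {
                byte[] page = new byte[PAGE_SIZE];
                java.util.Arrays.fill(page, fill);
                writePage(container, 2, page);
                return readPage(container, 2);
            } finally {
                f.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this approach no separate sync call is needed after the write, 
which is why each I/O stays under our control rather than queuing up in 
the file system.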

