db-derby-dev mailing list archives

From "Raymond Raymond" <raymond_de...@hotmail.com>
Subject Re: Discussion of how to map the recovery time into Xmb of log --Checkpoint issue
Date Wed, 25 Jan 2006 21:36:50 GMT
Mike, last time you gave me some comments about how to map the
recovery time into X MB of log. I still have some questions about it.

RR    2. During initialization of Derby, we run some measurement that
RR       determines the performance of the system and maps the
RR       recovery time into some X megabytes of log.

RR  Now, I am going to design the 2nd step first: mapping the recovery
RR  time into some X megabytes of log. A simple approach is to design
RR  a test log file. To build it, we can let Derby create a temporary
RR  database, run a bunch of tests against it to gather the necessary
RR  disk I/O information, and then delete the temporary database. When
RR  Derby boots up, we let it do recovery from the test log file. Does
RR  anyone have other suggestions on it?
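The disk-measurement part of that proposal could start with something much simpler than a full test database: time a sequential scan of a freshly written temp file on the log device. The class and method names below are hypothetical illustrations, not Derby code, and the figure it produces is only an upper bound since the OS page cache may absorb the reads.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

// Hypothetical stand-alone sketch: estimate the log disk's sequential
// read throughput, the dominant cost of a redo scan.
public class LogDiskProbe {

    /** Estimated sequential read throughput of dir's disk, in MB/s. */
    public static double measureReadThroughput(File dir, int megabytes) {
        try {
            File probe = File.createTempFile("derby-probe", ".dat", dir);
            probe.deleteOnExit();
            byte[] chunk = new byte[1024 * 1024]; // 1 MB buffer
            RandomAccessFile raf = new RandomAccessFile(probe, "rw");
            try {
                // Write the probe file so there is something to scan.
                for (int i = 0; i < megabytes; i++) {
                    raf.write(chunk);
                }
                raf.getFD().sync(); // flush the writes to the device

                // Time a straight sequential scan, like redo does.
                // Note: the OS may still serve these reads from its page
                // cache, so treat the result as an upper bound.
                raf.seek(0);
                long start = System.nanoTime();
                for (int i = 0; i < megabytes; i++) {
                    raf.readFully(chunk);
                }
                long elapsed = System.nanoTime() - start;
                return megabytes / (elapsed / 1.0e9);
            } finally {
                raf.close();
                probe.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double mbPerSec = measureReadThroughput(new File("."), 64);
        System.out.println("sequential read: " + mbPerSec + " MB/s");
        // recoveryTarget (seconds) * mbPerSec ~= X megabytes of log
    }
}
```

With that number in hand, the mapping is just the recovery-time target multiplied by the measured MB/s.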
MM  I'll think about this; it is not straightforward.  My guess would
MM  be that recovery time is dominated by 2 factors:
MM  1) I/O from log disk
MM  2) I/O from data disk
MM  Item 1 is pretty easy to get a handle on.  During redo it is
MM  pretty much a straight scan from beginning to end doing page-based
MM  I/O.  Undo is harder as it jumps back and forth for each xact.  I
MM  would probably just ignore it for estimates.
MM  Item 2 is totally dependent on the cache hit rate you are going to
MM  expect, and the number of log records.
MM  The majority of log records deal with a single page: it will read
MM  the page into cache if it doesn't exist and then do a quick
MM  operation on that page.  Again, undo is slightly more complicated
MM  as it could involve logical lookups in the index.
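Those two factors suggest a back-of-envelope model. Everything below is an assumed input for illustration (nothing here is measured by Derby today): sequential throughput for the log scan, plus one random data-disk read per log record that misses the page cache.

```java
// Rough recovery-time model from the two factors above (a sketch; all
// parameters are assumed inputs, not values Derby currently tracks).
public class RecoveryEstimate {

    /**
     * @param logMegabytes      size of log to redo
     * @param seqReadMBperSec   log disk sequential read throughput
     * @param numLogRecords     log records in that range
     * @param cacheMissRate     fraction of records whose page is not cached
     * @param randomReadMillis  average random page read on the data disk
     */
    public static double estimateSeconds(double logMegabytes,
                                         double seqReadMBperSec,
                                         long numLogRecords,
                                         double cacheMissRate,
                                         double randomReadMillis) {
        // Factor 1: straight sequential scan of the log.
        double logScan = logMegabytes / seqReadMBperSec;
        // Factor 2: one data-disk read per log record that misses cache.
        double dataIO = numLogRecords * cacheMissRate * randomReadMillis / 1000.0;
        return logScan + dataIO;
    }

    public static void main(String[] args) {
        // e.g. 100 MB of log at 50 MB/s, 500k records, 20% miss, 8 ms/read
        System.out.println(estimateSeconds(100, 50, 500_000, 0.2, 8.0));
    }
}
```

The example numbers make the point in the mail concrete: the data-disk term (hundreds of seconds) dwarfs the 2-second log scan, so the cache miss rate is the estimate's weakest link.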
MM  Another option, rather than doing any sort of testing, is to come
MM  up with an initial default time based on the size of the log file,
MM  and then on each subsequent recovery event dynamically change the
MM  estimate based on how long recovery on that db took.  This way
MM  each estimate will be based on the actual work generated by the
MM  application, and over time should become better and better.
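That "dynamically change the estimate" scheme could be as simple as an exponentially weighted moving average over observed recoveries. The sketch below uses hypothetical names (this is not Derby's API): start from a default seconds-per-MB figure and blend in each actual recovery.

```java
// Hypothetical adaptive estimator for the second option above: keep a
// seconds-per-MB rate and update it after every real recovery event.
public class RecoveryRateEstimator {
    private double secondsPerMB;   // current estimate of recovery cost
    private final double alpha;    // weight given to the newest sample

    public RecoveryRateEstimator(double defaultSecondsPerMB, double alpha) {
        this.secondsPerMB = defaultSecondsPerMB;
        this.alpha = alpha;
    }

    /** Record one actual recovery: how long it took for how much log. */
    public void observe(double recoverySeconds, double logMegabytes) {
        double sample = recoverySeconds / logMegabytes;
        // Exponentially weighted moving average: old estimate decays
        // as real observations accumulate.
        secondsPerMB = alpha * sample + (1 - alpha) * secondsPerMB;
    }

    /** Map a recovery-time target (seconds) into X megabytes of log. */
    public double targetLogMegabytes(double targetSeconds) {
        return targetSeconds / secondsPerMB;
    }
}
```

For example, with a default of 0.1 s/MB and a 30-second target, the checkpoint threshold starts at 300 MB of log; if an actual recovery then takes 60 s for 200 MB (0.3 s/MB), the threshold drops toward 150 MB. The estimate would need to be persisted with the database so it survives the very crash it is estimating for.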

I agree with you that it is better to estimate from the actual work
generated by the application. But, as far as I know, Derby only performs
recovery when it boots (am I right here?). If Derby runs stably, the
"subsequent recovery event" you mentioned will not happen, and we can't
get the information we need.
RR  Raymond

