db-derby-commits mailing list archives

From j..@apache.org
Subject svn commit: r151280 - in incubator/derby/site/trunk: build/site/papers/logformats.html src/documentation/content/xdocs/papers/logformats.xml
Date Fri, 04 Feb 2005 01:01:00 GMT
Author: jta
Date: Thu Feb  3 17:00:58 2005
New Revision: 151280

URL: http://svn.apache.org/viewcvs?view=rev&rev=151280
Log:
Committed changes to papers/logformats by Dibyendu Majumdar <dibyendu@mazumdar.demon.co.uk>.

Modified:
    incubator/derby/site/trunk/build/site/papers/logformats.html
    incubator/derby/site/trunk/src/documentation/content/xdocs/papers/logformats.xml

Modified: incubator/derby/site/trunk/build/site/papers/logformats.html
URL: http://svn.apache.org/viewcvs/incubator/derby/site/trunk/build/site/papers/logformats.html?view=diff&r1=151279&r2=151280
==============================================================================
--- incubator/derby/site/trunk/build/site/papers/logformats.html (original)
+++ incubator/derby/site/trunk/build/site/papers/logformats.html Thu Feb  3 17:00:58 2005
@@ -187,19 +187,29 @@
 	          &nbsp;<input value="+a" class="biggerfont" title="Enlarge text" onclick="ndeSetTextSize('incr'); return false;" type="button">
 </div>
 <h1>Derby Write Ahead Log Format</h1>
-<div class="abstract">This document describes the storage format of Derby Write Ahead Log. This is a work-in-progress derived from Javadoc comments 
-    and from explanations Mike Matrigali posted to the Derby lists. 
-    Please post questions, comments, and corrections to derby-dev@db.apache.org.
-    </div>
+<div class="abstract">This document describes the storage format of Derby Write Ahead 
+        Log. This is a work-in-progress derived from Javadoc comments and from 
+        explanations Mike Matrigali and others posted to the Derby lists. Please 
+        post questions, comments, and corrections to derby-dev@db.apache.org. 
+      </div>
 <div id="minitoc-area">
 <ul class="minitoc">
 <li>
 <a href="#introduction"> Introduction </a>
 </li>
 <li>
-<a href="#%0A%09%09%09Format+of+Write+Ahead+Log%0A%09%09">
-			Format of Write Ahead Log
-		</a>
+<a href="#+References+"> References </a>
+</li>
+<li>
+<a href="#Derby+implementation+of+the+Write+Ahead+Log">Derby implementation of the Write Ahead Log</a>
+<ul class="minitoc">
+<li>
+<a href="#LogCounter">LogCounter</a>
+</li>
+</ul>
+</li>
+<li>
+<a href="#+Format+of+Write+Ahead+Log+"> Format of Write Ahead Log </a>
 <ul class="minitoc">
 <li>
 <a href="#Format+of+Log+Control+File">Format of Log Control File</a>
@@ -215,415 +225,607 @@
 </li>
 </ul>
 </li>
+<li>
+<a href="#Pointers+to+relevant+classes">Pointers to relevant classes</a>
+</li>
 </ul>
-</div>
-    
+</div> 
+      
 <a name="N1000F"></a><a name="introduction"></a>
 <h2 class="boxed"> Introduction </h2>
 <div class="section">
-<p>
-        Derby implements the Write Ahead Log using a non-circular file system file.
-        At present, there is no support for incremental log backup or media recovery. 
-        Only crash recovery is supported.  
-		</p>
-<p>
-        The 'log' is a stream of log records.  The 'log' is implemented as
-        a series of numbered log files.  These numbered log files are logically
-        continuous so a transaction can have log records that span multiple log files.
-        A single log record cannot span more then one log file.  The log file number
-        is monotonically increasing.
-		</p>
-<p>
-        The log belongs to a log factory of a RawStore.  In the current implementation,
-        each RawStore only has one log factory, so each RawStore only has one log
-        (which composed of multiple log files).
-        At any given time, a log factory only writes new log records to one log file,
-        this log file is called the 'current log file'.
-		</p>
-<p>
-        A log file is named log<em>logNumber</em>.dat
-		</p>
-<p>
-        Everytime a checkpoint is taken, a new log file is created and all subsequent
-        log records will go to the new log file.  After a checkpoint is taken, old
-        and useless log files will be deleted.
-		</p>
-<p>
-        RawStore exposes a checkpoint method which clients can call, or a checkpoint is
-        taken automatically by the RawStore when:
-		</p>
-<ol>
-	      
-<li> The log file grows beyond a certain size (configurable, default 100K bytes)</li>
+<p> Derby uses a Write Ahead Log to record all changes to the database. 
+          The Write Ahead Log (WAL) protocol requires the following rules to be 
+          followed: </p>
+<ol> 
           
-<li> RawStore is shutdown and a checkpoint hasn't been done "for a while"</li>
+<li>A page must be latched exclusively before it can be updated.</li>
+          
+<li>While the latch is held, the update must be logged, and the page must 
+            be tagged with the identity of the log record (often known as the Log 
+            Sequence Number or LSN).</li>
           
-<li> RawStore is recovered and a checkpoint hasn't been done "for a while"</li>
+<li>When the page is about to be written to persistent storage, all 
+            log records up to and including the page's LSN must be forced to 
+            disk.</li>
           
+<li>Once the log records have been forced to disk, the cached page may 
+            be written to persistent storage, overwriting the previous version 
+            of the page.</li>
+        
 </ol>
+<p>The WAL protocol ensures that in the event of a system crash, database 
+          pages can be restored to a consistent state using the information contained 
+          in the log records. How this is done will be the subject of another 
+          paper.</p>
+</div>
+      
+<a name="N1002B"></a><a name="+References+"></a>
+<h2 class="boxed"> References </h2>
+<div class="section">
+<p> A good description of Write Ahead Logging, and how a log is typically 
+          implemented, can be found in 
+          <em> 
+            <a class="external" href="http://portal.acm.org/citation.cfm?id=573304">Transaction 
+              Processing: Concepts and Techniques</a>
+            , by Jim Gray and Andreas Reuter, 1993, Morgan Kaufmann Publishers</em>
+          .</p>
 </div>
-	
-<a name="N10037"></a><a name="%0A%09%09%09Format+of+Write+Ahead+Log%0A%09%09"></a>
-<h2 class="boxed">
-			Format of Write Ahead Log
-		</h2>
+      
+<a name="N1003C"></a><a name="Derby+implementation+of+the+Write+Ahead+Log"></a>
+<h2 class="boxed">Derby implementation of the Write Ahead Log</h2>
 <div class="section">
-<p>
-	      An implementation of file based log is <span class="codefrag">org.apache.derby.impl.store.raw.log.LogToFile</span>.
-		This LogFactory is responsible for the formats of 2 kinds of file: the log
-		file and the log control file.  And it is responsible for the format of the
-		log record wrapper.
-		</p>
-<a name="N10043"></a><a name="Format+of+Log+Control+File"></a>
+<p> Derby implements the Write Ahead Log using a non-circular file system 
+          file. Here are some comments about the current implementation of recovery:</p>
+<p class="quote">
+          
+<em>Suresh Thalamati</em>
+<br>
+          Derby supports simple media recovery. It has support for full backup/restore 
+          and very basic form of rollforward recovery (replay of logs using backup 
+          and archived log files). </p>
+<p class="quote">
+          
+<em>Mike Matrigali</em>
+<br>
+            1. Derby fully supports crash recovery, it uses java to correctly 
+              sync the log file to support this.<br>
+            2. I would say derby supports media recovery. One can make a backup 
+              of the data and store it off line. Logs can be stored on a separate 
+              disk from the data, and if you lose your data disk then you can 
+              use rollforward recovery on the existing logs and the copy of the 
+              backup to bring your database up to the current point in time.<br>
+            3. Derby does not support "point in time recovery". Someone may want 
+              to look at this in the future. Technically I don't think it would 
+              be very hard as the logging system has the stuff to solve the hard 
+              problems. It does not have an idea about "time" - it just knows 
+              log sequence numbers, so need to figure out what kind of interface 
+              a user really wants. A very user unfriendly interface would not 
+              be very hard to implement which would be recover to a specific log 
+              sequence number. Anyone interested in this feature should add it 
+              to jira - I'll be happy to add technical comments on what needs 
+              to be done.<br>
+            4. A reasonable next step in derby recovery progress would be to 
+              add a way to automatically move/copy log files offline as they are 
+              not needed by crash recovery and only needed for media recovery. 
+              Some sort of java stored procedure callout would seem most appropriate.
+        </p>
+<p> The 'log' is a stream of log records. The 'log' is implemented as 
+          a series of numbered log files. These numbered log files are logically 
+          continuous so a transaction can have log records that span multiple 
+          log files. A single log record cannot span more than one log file. The 
+          log file number is monotonically increasing. </p>
+<p> The log belongs to a log factory of a RawStore. In the current implementation, 
+          each RawStore has only one log factory, so each RawStore has only one 
+          log (which is composed of multiple log files). At any given time, a 
+          log factory writes new log records to only one log file; this log file 
+          is called the 'current log file'. </p>
+<p> A log file is named log<em>logNumber</em>.dat </p>
+<p>With the default values, a new log file is created (this is known as 
+          log switch) when a log file grows beyond 1MB and a checkpoint happens 
+          when the amount of log written is 10MB or more from the last checkpoint.</p>
+<p> RawStore exposes a checkpoint method which clients can call, or a 
+          checkpoint is taken automatically by the RawStore when: </p>
+<ol> 
+          
+<li> The log file grows beyond a certain size (configurable, default 
+            1MB)</li>
+          
+<li> RawStore is shut down and a checkpoint hasn't been done "for a while"</li>
+          
+<li> RawStore is recovered and a checkpoint hasn't been done "for a 
+            while"</li>
+        
+</ol>
+<a name="N1007B"></a><a name="LogCounter"></a>
+<h3 class="boxed">LogCounter</h3>
+<p>Log records are identified using LogCounter, which is an implementation 
+            of LogInstant, a Derby term for LSN. The LogCounter is made up of 
+            the log file number, and the byte offset of the log record within 
+            the log file. Within the stored log record a log counter is represented 
+            as a long. Outside the LogFactory the instant is passed around as 
+            a LogCounter (through its LogInstant interface).</p>
+<p> The long is encoded so that the ordinary comparison operators (&lt;, ==, &gt;) 
+            correctly tell whether one log instant is less than, equal to, or greater than another.</p>
+</div>
+      
+<a name="N10089"></a><a name="+Format+of+Write+Ahead+Log+"></a>
+<h2 class="boxed"> Format of Write Ahead Log </h2>
+<div class="section">
+<p> An implementation of file based log is in 
+          <span class="codefrag">org.apache.derby.impl.store.raw.log.LogToFile</span>. 
+          This LogFactory is responsible for the formats of 2 kinds of file: 
+          the log file and the log control file. And it is responsible for the 
+          format of the log record wrapper. </p>
+<a name="N10095"></a><a name="Format+of+Log+Control+File"></a>
 <h3 class="boxed">Format of Log Control File</h3>
-<p>The log control file contains information about which log files
-			   are present and where the last checkpoint log record is located.</p>
-<table class="ForrestTable" cellspacing="1" cellpadding="4">
-				
-<tr>
-					
+<p>The log control file contains information about which log files are 
+            present and where the last checkpoint log record is located.</p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4"> 
+            
+<tr> 
+              
 <th colspan="1" rowspan="1">Type</th>
-					<th colspan="1" rowspan="1">Desciption</th>
-				
+              <th colspan="1" rowspan="1">Description</th>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">format id set to FILE_STREAM_LOG_FILE</td>
-				
+              <td colspan="1" rowspan="1">format id set to FILE_STREAM_LOG_FILE</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">obsolete log file version</td>
-				
+              <td colspan="1" rowspan="1">obsolete log file version</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">the log instant (LogCounter) of the last completed checkpoint</td>
-				
+              <td colspan="1" rowspan="1">the log instant (LogCounter) of the last completed checkpoint</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">JBMS (older name for Cloudscape/Derby) version</td>
-				
+              <td colspan="1" rowspan="1">JBMS (older name for Cloudscape/Derby) version</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">checkpoint interval</td>
-				
+              <td colspan="1" rowspan="1">checkpoint interval</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">spare (value set to 0)</td>
-				
+              <td colspan="1" rowspan="1">spare (value set to 0)</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">spare (value set to 0)</td>
-				
+              <td colspan="1" rowspan="1">spare (value set to 0)</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">spare (value set to 0)</td>
-				
+              <td colspan="1" rowspan="1">spare (value set to 0)</td>
+            
 </tr>
-			
+          
 </table>
-<a name="N100C5"></a><a name="Format+of+the+log+file"></a>
+<a name="N10117"></a><a name="Format+of+the+log+file"></a>
 <h3 class="boxed">Format of the log file</h3>
-<p>The log file contains log records which record all the changes
-		    	to the database.  The complete transaction log is composed of a series of
-		    	log files.</p>
-<table class="ForrestTable" cellspacing="1" cellpadding="4">
-				
-<tr>
-					
+<p>The log file contains log records which record all the changes to 
+            the database. The complete transaction log is composed of a series 
+            of log files.</p>
+<table class="ForrestTable" cellspacing="1" cellpadding="4"> 
+            
+<tr> 
+              
 <th colspan="1" rowspan="1">Type</th>
-					<th colspan="1" rowspan="1">Description</th>
-				
+              <th colspan="1" rowspan="1">Description</th>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">Format id of this log file, set to FILE_STREAM_LOG_FILE.</td>
-				
+              <td colspan="1" rowspan="1">Format id of this log file, set to FILE_STREAM_LOG_FILE.</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">Obsolete log file version - not used</td>
-				
+              <td colspan="1" rowspan="1">Obsolete log file version - not used</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">Log file number - this number orders the log files in a
-						series to form the complete transaction log
-					</td>
-				
-</tr>		 
-				
-<tr>
-					
+              <td colspan="1" rowspan="1">Log file number - this number orders the log files in a series 
+                to form the complete transaction log </td>
+            
+</tr>
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">PrevLogRecord - log instant of the previous log record, in the
-	    				previous log file.</td>
-				
-</tr>
-				
-<tr>
-					
+              <td colspan="1" rowspan="1">PrevLogRecord - log instant of the previous log record, in the 
+                previous log file.</td>
+            
+</tr>
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">[log record wrapper]*</td>
-					<td colspan="1" rowspan="1">one or more log records with wrapper</td>
-				
+              <td colspan="1" rowspan="1">one or more log records with wrapper</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">EndMarker - value of zero.  The beginning of a log record wrapper
-						is the length of the log record, therefore it is never zero
-					</td>
-				
-</tr>
-				
-<tr>
-					
+              <td colspan="1" rowspan="1">EndMarker - value of zero. The beginning of a log record wrapper 
+                is the length of the log record, therefore it is never zero </td>
+            
+</tr>
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">[int fuzzy end]*</td>
-					<td colspan="1" rowspan="1">zero or more int's of value 0, in case this log file
-					has been recovered and any incomplete log record set to zero.
-					</td>
-				
+              <td colspan="1" rowspan="1">zero or more int's of value 0, in case this log file has been 
+                recovered and any incomplete log record set to zero. </td>
+            
 </tr>
-			
+          
 </table>
-<a name="N1013A"></a><a name="Format+of+the+log+record+wrapper"></a>
+<a name="N1018C"></a><a name="Format+of+the+log+record+wrapper"></a>
 <h3 class="boxed">Format of the log record wrapper</h3>
 <p>The log record wrapper provides information for the log scan.</p>
-<table class="ForrestTable" cellspacing="1" cellpadding="4">
-				
-<tr>
-					
+<table class="ForrestTable" cellspacing="1" cellpadding="4"> 
+            
+<tr> 
+              
 <th colspan="1" rowspan="1">Type</th>
-					<th colspan="1" rowspan="1">Description</th>
-				
+              <th colspan="1" rowspan="1">Description</th>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">length - length of the log record (for forward scan)</td>
-				
+              <td colspan="1" rowspan="1">length - length of the log record (for forward scan)</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">long</td>
-					<td colspan="1" rowspan="1">instant - LogInstant of the log record</td>
-				
+              <td colspan="1" rowspan="1">instant - LogInstant of the log record</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">byte[length]</td>
-					<td colspan="1" rowspan="1">logRecord - byte array that is written by the FileLogger</td>
-				
+              <td colspan="1" rowspan="1">logRecord - byte array that is written by the FileLogger</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">length - length of the log record (for backward scan)</td>
-				
+              <td colspan="1" rowspan="1">length - length of the log record (for backward scan)</td>
+            
 </tr>
-			
+          
 </table>
-<a name="N10188"></a><a name="The+format+of+a+log+record"></a>
+<a name="N101DA"></a><a name="The+format+of+a+log+record"></a>
 <h3 class="boxed">The format of a log record</h3>
 <p>The log record describes every change to the persistent store.</p>
-<table class="ForrestTable" cellspacing="1" cellpadding="4">
-				
-<tr>
-					
+<table class="ForrestTable" cellspacing="1" cellpadding="4"> 
+            
+<tr> 
+              
 <th colspan="1" rowspan="1">Type</th>
-					<th colspan="1" rowspan="1">Description</th>
-				
+              <th colspan="1" rowspan="1">Description</th>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">int</td>
-					<td colspan="1" rowspan="1">format_id, set to LOG_RECORD. The formatId is written by FormatIdOutputStream 
-                                  when this object is	written out by writeObject
-					</td>
-				
-</tr>
-				
-<tr>
-					
+              <td colspan="1" rowspan="1">format_id, set to LOG_RECORD. The formatId is written by FormatIdOutputStream 
+                when this object is written out by writeObject </td>
+            
+</tr>
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">CompressedInt</td>
-					<td colspan="1" rowspan="1">
-<p>loggable group - the loggable's group value.</p>
-						
-<p>	
-						Each loggable belongs to one or more groups of similar functionality.
-						</p>
-						
-<p>
-						Grouping is a way to quickly sort out log records that are interesting
-						to different modules or different implementations.
-						</p>
-						
-<p>
-						When a module makes loggable and sent it to the log file, it must mark
-						this loggable with one or more of the following group. 
-						If none fit, or if the loggable encompasses functionality that is not
-						described in existing groups, then a new group should be introduced.  
-						</p>
-						
-<p>
-						Grouping has no effect on how the record is logged or how it is treated
-						in rollback or recovery.
-						</p>
-						
-<p>
-						The following groups are defined. This list serves as the registry of
-						all loggable groups.
-						</p>
-						
-<table class="ForrestTable" cellspacing="1" cellpadding="4">
-							
+              <td colspan="1" rowspan="1"> 
+<p>loggable group - the loggable's group value.</p> 
+<p> Each 
+                  loggable belongs to one or more groups of similar functionality. 
+                </p> 
+<p> Grouping is a way to quickly sort out log records that 
+                  are interesting to different modules or different implementations. 
+                </p> 
+<p> When a module creates a loggable and sends it to the log file, 
+                  it must mark this loggable with one or more of the following 
+                  groups. If none fit, or if the loggable encompasses functionality 
+                  that is not described in existing groups, then a new group should 
+                  be introduced. </p> 
+<p> Grouping has no effect on how the record 
+                  is logged or how it is treated in rollback or recovery. </p> 
+                
+<p> The following groups are defined. This list serves as the 
+                  registry of all loggable groups. </p> 
+<table class="ForrestTable" cellspacing="1" cellpadding="4"> 
+                  
 <caption>Loggable Groups</caption>
-							
-<tr>
-								
+                  
+<tr> 
+                    
 <th colspan="1" rowspan="1">Name</th>
-								<th colspan="1" rowspan="1">Value</th>
-								<th colspan="1" rowspan="1">Description</th>
-							
-</tr>
-							
-<tr>
-								
+                    <th colspan="1" rowspan="1">Value</th>
+                    <th colspan="1" rowspan="1">Description</th>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">FIRST</td>
-								<td colspan="1" rowspan="1">0x1</td>
-								<td colspan="1" rowspan="1">The first operation of a transaction.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x1</td>
+                    <td colspan="1" rowspan="1">The first operation of a transaction.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">LAST</td>
-								<td colspan="1" rowspan="1">0x2</td>
-								<td colspan="1" rowspan="1">The last operation of a transaction.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x2</td>
+                    <td colspan="1" rowspan="1">The last operation of a transaction.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">COMPENSATION</td>
-								<td colspan="1" rowspan="1">0x4</td>
-								<td colspan="1" rowspan="1">A compensation log record.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x4</td>
+                    <td colspan="1" rowspan="1">A compensation log record.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">BI_LOG</td>
-								<td colspan="1" rowspan="1">0x8</td>
-								<td colspan="1" rowspan="1">A BeforeImage log record.</td>
-							
-</tr>	
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x8</td>
+                    <td colspan="1" rowspan="1">A BeforeImage log record.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">COMMIT</td>
-								<td colspan="1" rowspan="1">0x10</td>
-								<td colspan="1" rowspan="1">The transaction committed.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x10</td>
+                    <td colspan="1" rowspan="1">The transaction committed.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">ABORT</td>
-								<td colspan="1" rowspan="1">0x20</td>
-								<td colspan="1" rowspan="1">The transaction aborted.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x20</td>
+                    <td colspan="1" rowspan="1">The transaction aborted.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">PREPARE</td>
-								<td colspan="1" rowspan="1">0x40</td>
-								<td colspan="1" rowspan="1">The transaction prepared.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x40</td>
+                    <td colspan="1" rowspan="1">The transaction prepared.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">XA_NEEDLOCK</td>
-								<td colspan="1" rowspan="1">0x80</td>
-								<td colspan="1" rowspan="1">Need to reclaim locks associated with theis log record during XA prepared xact recovery.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x80</td>
+                    <td colspan="1" rowspan="1">Need to reclaim locks associated with this log record 
+                      during XA prepared xact recovery.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">RAWSTORE</td>
-								<td colspan="1" rowspan="1">0x100</td>
-								<td colspan="1" rowspan="1">A log record generated by the raw store.</td>
-							
-</tr>
-							
-<tr>
-								
+                    <td colspan="1" rowspan="1">0x100</td>
+                    <td colspan="1" rowspan="1">A log record generated by the raw store.</td>
+                  
+</tr>
+                  
+<tr> 
+                    
 <td colspan="1" rowspan="1">FILE_RESOURCE</td>
-								<td colspan="1" rowspan="1">0x400</td>
-								<td colspan="1" rowspan="1">related to "non-transactional" files.</td>	
-							
+                    <td colspan="1" rowspan="1">0x400</td>
+                    <td colspan="1" rowspan="1">related to "non-transactional" files.</td>
+                  
 </tr>
-						
-</table>
-					
+                
+</table> 
 </td>
-				
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">TransactionId</td>
-					<td colspan="1" rowspan="1">xactId - The Transaction this log belongs to.</td>
-				
+              <td colspan="1" rowspan="1">xactId - The Transaction this log belongs to.</td>
+            
 </tr>
-				
-<tr>
-					
+            
+<tr> 
+              
 <td colspan="1" rowspan="1">Loggable</td>
-					<td colspan="1" rowspan="1">op - the log operation</td>
-				
+              <td colspan="1" rowspan="1">op - the log operation</td>
+            
 </tr>
-			
+          
 </table>
 </div>
-  
+      
+<a name="N10307"></a><a name="Pointers+to+relevant+classes"></a>
+<h2 class="boxed">Pointers to relevant classes</h2>
+<div class="section">
+<div class="frame fixme">
+<div class="label">Fixme (DM)</div>
+<div class="content">This section should link to appropriate Javadoc documentation</div>
+</div>
+<table class="ForrestTable" cellspacing="1" cellpadding="4">
+         
+<tr>
+             
+<th colspan="1" rowspan="1">Package</th>
+             <th colspan="1" rowspan="1">Class</th>
+             <th colspan="1" rowspan="1">Description</th>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1">org.apache.derby.iapi.store.raw.log</td>
+             <td colspan="1" rowspan="1">LogFactory.java</td>
+             <td colspan="1" rowspan="1">The java interface for logging system module.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1">org.apache.derby.impl.store.raw.log</td>
+             <td colspan="1" rowspan="1">LogToFile.java</td>
+             <td colspan="1" rowspan="1">The implementation of LogFactory, which also implements Module;
+                 this is the class that contains the recovery code.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">CheckpointOperation.java</td>
+             <td colspan="1" rowspan="1">A Log Operation that represents a checkpoint.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">FileLogger.java</td>
+             <td colspan="1" rowspan="1">Deals with putting log records to disk. Writes log records to a log file as a stream
+                (i.e., log records are added to the end of the file, with no concept of pages).</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">FlushedScan.java</td>
+             <td colspan="1" rowspan="1">Deals with scanning the log file. Scans the log, which is implemented as a series of log files.
+                 This log scan knows how to move across log files if it is positioned at
+                 the boundary of a log file and needs to getNextRecord.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">FlushedScanHandle.java</td>
+             <td colspan="1" rowspan="1">More stuff dealing with scanning the log file.</td>
+         
+</tr>
+         
+<tr>
+              
+<td colspan="1" rowspan="1"></td>
+              <td colspan="1" rowspan="1">Scan.java</td>
+              <td colspan="1" rowspan="1">More log file scanning code. Scans the log, which is implemented as a series of log files.
+                This log scan knows how to move across log files if it is positioned at
+                the boundary of a log file and needs to getNextRecord.</td>
+         
+</tr>
+         
+<tr>
+              
+<td colspan="1" rowspan="1"></td>
+              <td colspan="1" rowspan="1">StreamLogScan.java</td>
+              <td colspan="1" rowspan="1">More scan log file stuff. LogScan provides methods to read a log record and get its LogInstant
+                  in an already defined scan.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">LogAccessFile.java</td>
+             <td colspan="1" rowspan="1">Lowest level of putting log records to disk. Wraps a RandomAccessFile to provide buffering
+                 on log writes.</td>
+         
+</tr>
+         
+<tr>
+             
+<td colspan="1" rowspan="1"></td>
+             <td colspan="1" rowspan="1">LogAccessFileBuffer.java</td>
+             <td colspan="1" rowspan="1">Utility for LogAccessFile. A single buffer of data.</td>
+         
+</tr>
+         
+<tr>
+              
+<td colspan="1" rowspan="1"></td>
+              <td colspan="1" rowspan="1">LogCounter.java</td>
+              <td colspan="1" rowspan="1">Log sequence number (LSN) implementation </td>
+         
+</tr>
+         
+<tr>
+              
+<td colspan="1" rowspan="1"></td>
+              <td colspan="1" rowspan="1">LogRecord.java</td>
+              <td colspan="1" rowspan="1">The log record written out to disk.</td>
+         
+</tr>
+         
+<tr>
+              
+<td colspan="1" rowspan="1"></td>
+              <td colspan="1" rowspan="1">ReadOnly.java</td>
+              <td colspan="1" rowspan="1">An alternate read-only implementation of LogFactory.</td>
+         
+</tr>
+         
+</table>
+</div>
+    
 </div>
 <!--+
     |end content

Modified: incubator/derby/site/trunk/src/documentation/content/xdocs/papers/logformats.xml
URL: http://svn.apache.org/viewcvs/incubator/derby/site/trunk/src/documentation/content/xdocs/papers/logformats.xml?view=diff&r1=151279&r2=151280
==============================================================================
--- incubator/derby/site/trunk/src/documentation/content/xdocs/papers/logformats.xml (original)
+++ incubator/derby/site/trunk/src/documentation/content/xdocs/papers/logformats.xml Thu Feb  3 17:00:58 2005
@@ -1,293 +1,425 @@
 <?xml version="1.0"?>
-<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
-<document> 
-  <header> 
-    <title>Derby Write Ahead Log Format</title>
-    <abstract>This document describes the storage format of Derby Write Ahead Log. This is a work-in-progress derived from Javadoc comments 
-    and from explanations Mike Matrigali posted to the Derby lists. 
-    Please post questions, comments, and corrections to derby-dev@db.apache.org.
-    </abstract>
-  </header>
-  <body>
-    <section id="introduction"> 
-      <title> Introduction </title>
-	    <p>
-        Derby implements the Write Ahead Log using a non-circular file system file.
-        At present, there is no support for incremental log backup or media recovery. 
-        Only crash recovery is supported.  
-		</p>
-        <p>
-        The 'log' is a stream of log records.  The 'log' is implemented as
-        a series of numbered log files.  These numbered log files are logically
-        continuous so a transaction can have log records that span multiple log files.
-        A single log record cannot span more then one log file.  The log file number
-        is monotonically increasing.
-		</p>
-        <p>
-        The log belongs to a log factory of a RawStore.  In the current implementation,
-        each RawStore only has one log factory, so each RawStore only has one log
-        (which composed of multiple log files).
-        At any given time, a log factory only writes new log records to one log file,
-        this log file is called the 'current log file'.
-		</p>
-		<p>
-        A log file is named log<em>logNumber</em>.dat
-		</p>
-        <p>
-        Everytime a checkpoint is taken, a new log file is created and all subsequent
-        log records will go to the new log file.  After a checkpoint is taken, old
-        and useless log files will be deleted.
-		</p>
-        <p>
-        RawStore exposes a checkpoint method which clients can call, or a checkpoint is
-        taken automatically by the RawStore when:
-		</p>
-	      <ol>
-	      <li> The log file grows beyond a certain size (configurable, default 100K bytes)</li>
+  <!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V2.0//EN" "http://forrest.apache.org/dtd/document-v20.dtd">
+  <document> 
+    <header> 
+      <title>Derby Write Ahead Log Format</title>
+      <abstract>This document describes the storage format of Derby Write Ahead 
+        Log. This is a work-in-progress derived from Javadoc comments and from 
+        explanations Mike Matrigali and others posted to the Derby lists. Please 
+        post questions, comments, and corrections to derby-dev@db.apache.org. 
+      </abstract>
+    </header>
+    <body> 
+      <section id="introduction"> 
+        <title> Introduction </title>
+        <p> Derby uses a Write Ahead Log to record all changes to the database. 
+          The Write Ahead Log (WAL) protocol requires the following rules to be 
+          followed: </p>
+        <ol> 
+          <li>A page must be latched exclusively before it can be updated.</li>
+          <li>While the latch is held, the update must be logged, and the page 
+            must be tagged with the identity of the log record (often known as 
+            the Log Sequence Number or LSN).</li>
+          <li>When the page is about to be written to persistent storage, all 
+            log records up to and including the page's LSN must be forced to 
+            disk.</li>
+          <li>Once the log records have been forced to disk, the cached page may 
+            be written to persistent storage, overwriting the previous version 
+            of the page.</li>
+        </ol>
+        <p>The WAL protocol ensures that in the event of a system crash, database 
+          pages can be restored to a consistent state using the information contained 
+          in the log records. How this is done will be the subject of another 
+          paper.</p>
+      </section>
+      <section> 
+        <title> References </title>
+        <p> A good description of Write Ahead Logging, and how a log is typically 
+          implemented, can be found in 
+          <em> 
+            <a href="http://portal.acm.org/citation.cfm?id=573304">Transaction 
+              Processing: Concepts and Techniques</a>
+            , by Jim Gray and Andreas Reuter, 1993, Morgan Kaufmann Publishers</em>
+          .</p>
+      </section>
+      <section> 
+        <title>Derby implementation of the Write Ahead Log</title>
+        <p> Derby implements the Write Ahead Log using a non-circular file system 
+          file. Here are some comments about the current implementation of recovery:</p>
+        <p class="quote">
+          <em>Suresh Thalamati</em><br/>
+          Derby supports simple media recovery. It has support for full backup/restore 
+          and very basic form of rollforward recovery (replay of logs using backup 
+          and archived log files). </p>
+        <p class="quote">
+          <em>Mike Matrigali</em><br/>
+            1. Derby fully supports crash recovery, it uses java to correctly 
+              sync the log file to support this.<br/>
+            2. I would say derby supports media recovery. One can make a backup 
+              of the data and store it off line. Logs can be stored on a separate 
+              disk from the data, and if you lose your data disk then you can 
+              use rollforward recovery on the existing logs and the copy of the 
+              backup to bring your database up to the current point in time.<br/>
+            3. Derby does not support "point in time recovery". Someone may want 
+              to look at this in the future. Technically I don't think it would 
+              be very hard as the logging system has the stuff to solve the hard 
+              problems. It does not have an idea about "time" - it just knows 
+              log sequence numbers, so need to figure out what kind of interface 
+              a user really wants. A very user unfriendly interface would not 
+              be very hard to implement which would be recover to a specific log 
+              sequence number. Anyone interested in this feature should add it 
+              to jira - I'll be happy to add technical comments on what needs 
+              to be done.<br/>
+            4. A reasonable next step in derby recovery progress would be to 
+              add a way to automatically move/copy log files offline as they are 
+              not needed by crash recovery and only needed for media recovery. 
+              Some sort of java stored procedure callout would seem most appropriate.
+        </p>
+        <p> The 'log' is a stream of log records. The 'log' is implemented as 
+          a series of numbered log files. These numbered log files are logically 
+          continuous so a transaction can have log records that span multiple 
+          log files. A single log record cannot span more than one log file. The 
+          log file number is monotonically increasing. </p>
+        <p> The log belongs to a log factory of a RawStore. In the current implementation, 
+          each RawStore only has one log factory, so each RawStore only has one 
+          log (which is composed of multiple log files). At any given time, a 
+          log factory only writes new log records to one log file; this log file 
+          is called the 'current log file'. </p>
+        <p> A log file is named log 
+          <em>logNumber</em>
+          .dat </p>
+        <!--
+        <p> Everytime a checkpoint is taken, a new log file is created and all 
+          subsequent log records will go to the new log file. After a checkpoint 
+          is taken, old and useless log files will be deleted. </p>
+-->
+        <p>With the default values, a new log file is created (this is known as 
+          a log switch) when a log file grows beyond 1MB, and a checkpoint happens 
+          when 10MB or more of log has been written since the last checkpoint.</p>
+        <p> RawStore exposes a checkpoint method which clients can call, or a 
+          checkpoint is taken automatically by the RawStore when: </p>
+        <ol> 
+          <li> The log file grows beyond a certain size (configurable, default 
+            1MB)</li>
           <li> RawStore is shutdown and a checkpoint hasn't been done "for a while"</li>
-          <li> RawStore is recovered and a checkpoint hasn't been done "for a while"</li>
-          </ol>
-    </section>
-	<section>
-		<title>
-			Format of Write Ahead Log
-		</title>
-		<p>
-	      An implementation of file based log is <code>org.apache.derby.impl.store.raw.log.LogToFile</code>.
-		This LogFactory is responsible for the formats of 2 kinds of file: the log
-		file and the log control file.  And it is responsible for the format of the
-		log record wrapper.
-		</p>
-		<section>
-			<title>Format of Log Control File</title>
-			<p>The log control file contains information about which log files
-			   are present and where the last checkpoint log record is located.</p>
-			<table>
-				<tr>
-					<th>Type</th>
-					<th>Desciption</th>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>format id set to FILE_STREAM_LOG_FILE</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>obsolete log file version</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>the log instant (LogCounter) of the last completed checkpoint</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>JBMS (older name for Cloudscape/Derby) version</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>checkpoint interval</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>spare (value set to 0)</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>spare (value set to 0)</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>spare (value set to 0)</td>
-				</tr>
-			</table>
-		</section>
-		<section>
-			<title>Format of the log file</title>
-			<p>The log file contains log records which record all the changes
-		    	to the database.  The complete transaction log is composed of a series of
-		    	log files.</p>
-			<table>
-				<tr>
-					<th>Type</th>
-					<th>Description</th>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>Format id of this log file, set to FILE_STREAM_LOG_FILE.</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>Obsolete log file version - not used</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>Log file number - this number orders the log files in a
-						series to form the complete transaction log
-					</td>
-				</tr>		 
-				<tr>
-					<td>long</td>
-					<td>PrevLogRecord - log instant of the previous log record, in the
-	    				previous log file.</td>
-				</tr>
-				<tr>
-					<td>[log record wrapper]*</td>
-					<td>one or more log records with wrapper</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>EndMarker - value of zero.  The beginning of a log record wrapper
-						is the length of the log record, therefore it is never zero
-					</td>
-				</tr>
-				<tr>
-					<td>[int fuzzy end]*</td>
-					<td>zero or more int's of value 0, in case this log file
-					has been recovered and any incomplete log record set to zero.
-					</td>
-				</tr>
-			</table>
-		</section>
-		<section>
-			<title>Format of the log record wrapper</title>
-			<p>The log record wrapper provides information for the log scan.</p>
-			<table>
-				<tr>
-					<th>Type</th>
-					<th>Description</th>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>length - length of the log record (for forward scan)</td>
-				</tr>
-				<tr>
-					<td>long</td>
-					<td>instant - LogInstant of the log record</td>
-				</tr>
-				<tr>
-					<td>byte[length]</td>
-					<td>logRecord - byte array that is written by the FileLogger</td>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>length - length of the log record (for backward scan)</td>
-				</tr>
-			</table>
-		</section>
-		<section>
-			<title>The format of a log record</title>
-			<p>The log record described every change to the persistent store</p>
-			<table>
-				<tr>
-					<th>Type</th>
-					<th>Description</th>
-				</tr>
-				<tr>
-					<td>int</td>
-					<td>format_id, set to LOG_RECORD. The formatId is written by FormatIdOutputStream 
-                                  when this object is	written out by writeObject
-					</td>
-				</tr>
-				<tr>
-					<td>CompressedInt</td>
-					<td><p>loggable group - the loggable's group value.</p>
-						<p>	
-						Each loggable belongs to one or more groups of similar functionality.
-						</p>
-						<p>
-						Grouping is a way to quickly sort out log records that are interesting
-						to different modules or different implementations.
-						</p>
-						<p>
-						When a module makes loggable and sent it to the log file, it must mark
-						this loggable with one or more of the following group. 
-						If none fit, or if the loggable encompasses functionality that is not
-						described in existing groups, then a new group should be introduced.  
-						</p>
-						<p>
-						Grouping has no effect on how the record is logged or how it is treated
-						in rollback or recovery.
-						</p>
-						<p>
-						The following groups are defined. This list serves as the registry of
-						all loggable groups.
-						</p>
-						<table>
-							<caption>Loggable Groups</caption>
-							<tr>
-								<th>Name</th>
-								<th>Value</th>
-								<th>Description</th>
-							</tr>
-							<tr>
-								<td>FIRST</td>
-								<td>0x1</td>
-								<td>The first operation of a transaction.</td>
-							</tr>
-							<tr>
-								<td>LAST</td>
-								<td>0x2</td>
-								<td>The last operation of a transaction.</td>
-							</tr>
-							<tr>
-								<td>COMPENSATION</td>
-								<td>0x4</td>
-								<td>A compensation log record.</td>
-							</tr>
-							<tr>
-								<td>BI_LOG</td>
-								<td>0x8</td>
-								<td>A BeforeImage log record.</td>
-							</tr>	
-							<tr>
-								<td>COMMIT</td>
-								<td>0x10</td>
-								<td>The transaction committed.</td>
-							</tr>
-							<tr>
-								<td>ABORT</td>
-								<td>0x20</td>
-								<td>The transaction aborted.</td>
-							</tr>
-							<tr>
-								<td>PREPARE</td>
-								<td>0x40</td>
-								<td>The transaction prepared.</td>
-							</tr>
-							<tr>
-								<td>XA_NEEDLOCK</td>
-								<td>0x80</td>
-								<td>Need to reclaim locks associated with theis log record during XA prepared xact recovery.</td>
-							</tr>
-							<tr>
-								<td>RAWSTORE</td>
-								<td>0x100</td>
-								<td>A log record generated by the raw store.</td>
-							</tr>
-							<tr>
-								<td>FILE_RESOURCE</td>
-								<td>0x400</td>
-								<td>related to "non-transactional" files.</td>	
-							</tr>
-						</table>
-					</td>
-				</tr>
-				<tr>
-					<td>TransactionId</td>
-					<td>xactId - The Transaction this log belongs to.</td>
-				</tr>
-				<tr>
-					<td>Loggable</td>
-					<td>op - the log operation</td>
-				</tr>
-			</table>
-		</section>
-	</section>
-  </body>
-  <footer> 
-	<legal>
-	</legal>
-  </footer>
-</document>
-
-
+          <li> RawStore is recovered and a checkpoint hasn't been done "for a 
+            while"</li>
+        </ol>
+        <section> 
+          <title>LogCounter</title>
+          <p>Log records are identified using LogCounter, which is an implementation 
+            of LogInstant, a Derby term for LSN. The LogCounter is made up of 
+            the log file number, and the byte offset of the log record within 
+            the log file. Within the stored log record a log counter is represented 
+            as a long. Outside the LogFactory the instant is passed around as 
+            a LogCounter (through its LogInstant interface).</p>
+          <p> The long is encoded so that the &lt;, == and &gt; operators correctly 
+            tell whether one log instant is less than, equal to, or greater than another.</p>
+        </section>
+      </section>
+      <section> 
+        <title> Format of Write Ahead Log </title>
+        <p> An implementation of a file-based log is in 
+          <code>org.apache.derby.impl.store.raw.log.LogToFile</code>. 
+          This LogFactory is responsible for the formats of two kinds of file: 
+          the log file and the log control file. It is also responsible for the 
+          format of the log record wrapper. </p>
+        <section> 
+          <title>Format of Log Control File</title>
+          <p>The log control file contains information about which log files are 
+            present and where the last checkpoint log record is located.</p>
+          <table> 
+            <tr> 
+              <th>Type</th>
+              <th>Description</th>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>format id set to FILE_STREAM_LOG_FILE</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>obsolete log file version</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>the log instant (LogCounter) of the last completed checkpoint</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>JBMS (older name for Cloudscape/Derby) version</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>checkpoint interval</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>spare (value set to 0)</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>spare (value set to 0)</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>spare (value set to 0)</td>
+            </tr>
+          </table>
+        </section>
+        <section> 
+          <title>Format of the log file</title>
+          <p>The log file contains log records which record all the changes to 
+            the database. The complete transaction log is composed of a series 
+            of log files.</p>
+          <table> 
+            <tr> 
+              <th>Type</th>
+              <th>Description</th>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>Format id of this log file, set to FILE_STREAM_LOG_FILE.</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>Obsolete log file version - not used</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>Log file number - this number orders the log files in a series 
+                to form the complete transaction log </td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>PrevLogRecord - log instant of the previous log record, in the 
+                previous log file.</td>
+            </tr>
+            <tr> 
+              <td>[log record wrapper]*</td>
+              <td>one or more log records with wrapper</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>EndMarker - value of zero. The beginning of a log record wrapper 
+                is the length of the log record, therefore it is never zero </td>
+            </tr>
+            <tr> 
+              <td>[int fuzzy end]*</td>
+              <td>zero or more ints of value 0, present if this log file has been 
+                recovered and any incomplete log record was set to zero. </td>
+            </tr>
+          </table>
+        </section>
+        <section> 
+          <title>Format of the log record wrapper</title>
+          <p>The log record wrapper provides information for the log scan.</p>
+          <table> 
+            <tr> 
+              <th>Type</th>
+              <th>Description</th>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>length - length of the log record (for forward scan)</td>
+            </tr>
+            <tr> 
+              <td>long</td>
+              <td>instant - LogInstant of the log record</td>
+            </tr>
+            <tr> 
+              <td>byte[length]</td>
+              <td>logRecord - byte array that is written by the FileLogger</td>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>length - length of the log record (for backward scan)</td>
+            </tr>
+          </table>
+        </section>
+        <section> 
+          <title>The format of a log record</title>
+          <p>The log record describes every change to the persistent store.</p>
+          <table> 
+            <tr> 
+              <th>Type</th>
+              <th>Description</th>
+            </tr>
+            <tr> 
+              <td>int</td>
+              <td>format_id, set to LOG_RECORD. The formatId is written by FormatIdOutputStream 
+                when this object is written out by writeObject </td>
+            </tr>
+            <tr> 
+              <td>CompressedInt</td>
+              <td> <p>loggable group - the loggable's group value.</p> <p> Each 
+                  loggable belongs to one or more groups of similar functionality. 
+                </p> <p> Grouping is a way to quickly sort out log records that 
+                  are interesting to different modules or different implementations. 
+                </p> <p> When a module creates a loggable and sends it to the log file, 
+                  it must mark this loggable with one or more of the following 
+                  groups. If none fit, or if the loggable encompasses functionality 
+                  that is not described in existing groups, then a new group should 
+                  be introduced. </p> <p> Grouping has no effect on how the record 
+                  is logged or how it is treated in rollback or recovery. </p> 
+                <p> The following groups are defined. This list serves as the 
+                  registry of all loggable groups. </p> <table> 
+                  <caption>Loggable Groups</caption>
+                  <tr> 
+                    <th>Name</th>
+                    <th>Value</th>
+                    <th>Description</th>
+                  </tr>
+                  <tr> 
+                    <td>FIRST</td>
+                    <td>0x1</td>
+                    <td>The first operation of a transaction.</td>
+                  </tr>
+                  <tr> 
+                    <td>LAST</td>
+                    <td>0x2</td>
+                    <td>The last operation of a transaction.</td>
+                  </tr>
+                  <tr> 
+                    <td>COMPENSATION</td>
+                    <td>0x4</td>
+                    <td>A compensation log record.</td>
+                  </tr>
+                  <tr> 
+                    <td>BI_LOG</td>
+                    <td>0x8</td>
+                    <td>A BeforeImage log record.</td>
+                  </tr>
+                  <tr> 
+                    <td>COMMIT</td>
+                    <td>0x10</td>
+                    <td>The transaction committed.</td>
+                  </tr>
+                  <tr> 
+                    <td>ABORT</td>
+                    <td>0x20</td>
+                    <td>The transaction aborted.</td>
+                  </tr>
+                  <tr> 
+                    <td>PREPARE</td>
+                    <td>0x40</td>
+                    <td>The transaction prepared.</td>
+                  </tr>
+                  <tr> 
+                    <td>XA_NEEDLOCK</td>
+                    <td>0x80</td>
+                    <td>Need to reclaim locks associated with this log record 
+                      during XA prepared xact recovery.</td>
+                  </tr>
+                  <tr> 
+                    <td>RAWSTORE</td>
+                    <td>0x100</td>
+                    <td>A log record generated by the raw store.</td>
+                  </tr>
+                  <tr> 
+                    <td>FILE_RESOURCE</td>
+                    <td>0x400</td>
+                    <td>related to "non-transactional" files.</td>
+                  </tr>
+                </table> </td>
+            </tr>
+            <tr> 
+              <td>TransactionId</td>
+              <td>xactId - The Transaction this log belongs to.</td>
+            </tr>
+            <tr> 
+              <td>Loggable</td>
+              <td>op - the log operation</td>
+            </tr>
+          </table>
+        </section>
+      </section>
+      <section>
+         <title>Pointers to relevant classes</title>
+         <fixme author="DM">This section should link to appropriate Javadoc documentation</fixme>
+         <table>
+         <tr>
+             <th>Package</th>
+             <th>Class</th>
+             <th>Description</th>
+         </tr>
+         <tr>
+             <td>org.apache.derby.iapi.store.raw.log</td>
+             <td>LogFactory.java</td>
+             <td>The java interface for logging system module.</td>
+         </tr>
+         <tr>
+             <td>org.apache.derby.impl.store.raw.log</td>
+             <td>LogToFile.java</td>
+             <td>The implementation of LogFactory.java, which also implements Module;
+                 this is the one with the recovery code.</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>CheckpointOperation.java</td>
+             <td>A Log Operation that represents a checkpoint.</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>FileLogger.java</td>
+             <td>Deals with putting log records to disk. Writes log records to a log file as a stream
+                (i.e. log records are appended to the end of the file; there is no concept of pages).</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>FlushedScan.java</td>
+             <td>Deals with scanning the log file. Scans the log, which is implemented by a series of log files.
+                 This log scan knows how to move across log files if it is positioned at
+                 the boundary of a log file and needs to getNextRecord.</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>FlushedScanHandle.java</td>
+             <td>More stuff dealing with scanning the log file.</td>
+         </tr>
+         <tr>
+              <td></td>
+              <td>Scan.java</td>
+              <td>More scan log file stuff. Scans the log, which is implemented by a series of log files.
+                This log scan knows how to move across log files if it is positioned at
+                the boundary of a log file and needs to getNextRecord.</td>
+         </tr>
+         <tr>
+              <td></td>
+              <td>StreamLogScan.java</td>
+              <td>More scan log file stuff. LogScan provides methods to read a log record and get its LogInstant
+                  in an already defined scan.</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>LogAccessFile.java</td>
+             <td>Lowest level of putting log records to disk. Wraps a RandomAccessFile to provide buffering
+                 on log writes.</td>
+         </tr>
+         <tr>
+             <td></td>
+             <td>LogAccessFileBuffer.java</td>
+             <td>Utility for LogAccessFile. A single buffer of data.</td>
+         </tr>
+         <tr>
+              <td></td>
+              <td>LogCounter.java</td>
+              <td>Log sequence number (LSN) implementation </td>
+         </tr>
+         <tr>
+              <td></td>
+              <td>LogRecord.java</td>
+              <td>The log record written out to disk.</td>
+         </tr>
+         <tr>
+              <td></td>
+              <td>ReadOnly.java</td>
+              <td>An alternate read-only implementation of LogFactory.</td>
+         </tr>
+         </table>
+      </section>
+    </body>
+    <footer> 
+      <legal></legal>
+    </footer>
+  </document>

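Note on the LogCounter section in the diff above: it says a log instant combines the log file number with the byte offset of the record, and is stored as a long whose natural ordering matches log order. The Java sketch below shows one way such an encoding can work. It is only an illustration: the class name LogInstantSketch, the 32-bit split and the method names are assumptions made for this example and are not taken from Derby's LogCounter source.

```java
public final class LogInstantSketch {
    // Assumption for illustration only: the low 32 bits hold the byte offset
    // and the bits above them hold the log file number.
    private static final int FILE_NUMBER_SHIFT = 32;
    private static final long OFFSET_MASK = 0xFFFFFFFFL;

    private LogInstantSketch() {}

    /** Packs a file number and a byte offset into a single comparable long. */
    public static long pack(long fileNumber, long offset) {
        if (fileNumber < 0 || fileNumber > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("file number out of range: " + fileNumber);
        }
        if (offset < 0 || offset > OFFSET_MASK) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        // The file number occupies the high bits, so it dominates comparisons.
        return (fileNumber << FILE_NUMBER_SHIFT) | offset;
    }

    /** Extracts the log file number from a packed instant. */
    public static long fileNumber(long instant) {
        return instant >>> FILE_NUMBER_SHIFT;
    }

    /** Extracts the byte offset within the log file from a packed instant. */
    public static long offset(long instant) {
        return instant & OFFSET_MASK;
    }

    public static void main(String[] args) {
        long earlier = pack(3, 4096);
        long later = pack(4, 16);
        // A record in a later file compares greater even though its offset is smaller.
        System.out.println(earlier < later);
        System.out.println(fileNumber(earlier) + " " + offset(earlier));
    }
}
```

Because the file number sits in the high bits, comparing two packed values with the ordinary long operators orders records first by file and then by position within the file, which is the property the paper describes.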

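Note on the "Format of the log record wrapper" table in the diff above: the wrapper stores the record length both before and after the record so that a scan can step in either direction, and a length of zero marks the end of the log file. The Java sketch below illustrates that layout. The class and method names are hypothetical, and the use of big-endian ByteBuffer encoding (the same byte order as java.io.DataOutput) is an assumption; this is not Derby's own scan code.

```java
import java.nio.ByteBuffer;

public final class LogRecordWrapperSketch {
    // int length + long instant + int trailing length, as listed in the table.
    static final int OVERHEAD = 4 + 8 + 4;

    private LogRecordWrapperSketch() {}

    /** Builds one wrapper: length, instant, record bytes, length again. */
    public static byte[] wrap(long instant, byte[] logRecord) {
        ByteBuffer buf = ByteBuffer.allocate(OVERHEAD + logRecord.length);
        buf.putInt(logRecord.length);   // leading length, used by forward scans
        buf.putLong(instant);           // log instant of this record
        buf.put(logRecord);             // the record bytes themselves
        buf.putInt(logRecord.length);   // trailing length, used by backward scans
        return buf.array();
    }

    /** Forward scan step: start of the next wrapper, or -1 at the end marker. */
    public static int nextWrapperStart(ByteBuffer log, int start) {
        int length = log.getInt(start);
        return length == 0 ? -1 : start + OVERHEAD + length;
    }

    /** Backward scan step: start of the wrapper that ends at position end. */
    public static int previousWrapperStart(ByteBuffer log, int end) {
        int length = log.getInt(end - 4);
        return end - OVERHEAD - length;
    }

    /** Copies the record bytes out of the wrapper that begins at start. */
    public static byte[] payload(ByteBuffer log, int start) {
        byte[] record = new byte[log.getInt(start)];
        ByteBuffer view = log.duplicate();
        view.position(start + 4 + 8);
        view.get(record);
        return record;
    }

    public static void main(String[] args) {
        byte[] first = wrap(1L, new byte[] {10, 20, 30});
        byte[] second = wrap(2L, new byte[] {40});
        ByteBuffer log = ByteBuffer.allocate(first.length + second.length + 4);
        log.put(first).put(second).putInt(0); // the trailing zero int is the end marker

        int secondStart = nextWrapperStart(log, 0);
        System.out.println("second wrapper starts at " + secondStart);
        System.out.println("payload length " + payload(log, secondStart).length);
    }
}
```

The trailing copy of the length is what lets a backward scan find the start of a record without reading the log from the beginning; the cost is four extra bytes per record.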
