subversion-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From stef...@apache.org
Subject svn commit: r1628093 - in /subversion/trunk/subversion/libsvn_fs_fs: index.c structure-indexes
Date Sun, 28 Sep 2014 17:56:01 GMT
Author: stefan2
Date: Sun Sep 28 17:56:01 2014
New Revision: 1628093

URL: http://svn.apache.org/r1628093
Log:
Support FSFS format 7 commits in load balanced mixed-architecture clusters.

At least theoretically, machines with different endianess or off_t sizes
might access the same repository on e.g. an iSCSI SAN.  Then the machine
performing a commit may have a different architecture from the one building
up the transaction.  To allow that, even the intermediate (proto-) index
format within those transactions must be platform-independent.

Instead of writing the records as blobs, we store each member individually
as an unsigned 64 bit integer in little endian order.  Limiting all values
to non-negative (with a special handling of SVN_INVALID_REVNUM) makes
conversions from and to the in-memory types well-defined.  We only need
to check for overflows - e.g. in case one machine's apr_off_t is too short.

All functions reading from or writing to the proto index files must use
the new converting reader and writer functions instead of reading plain
blobs themselves.

Finally, update the FSFS structure documentation.

* subversion/libsvn_fs_fs/index.c
  (off_t_max): Define this new APR_OFF_T_MAX surrogate.
  (P2L_PROTO_INDEX_ENTRY_SIZE): Define this record size constant to allow
                                for random file access. Only needed for the
                                P2L proto-index.

  (write_uint64_to_proto_index,
   read_uint64_from_proto_index): New code to read and write individual
                                  numbers to/from the proto-index file in
                                  a platform neutral fixed length format.
  (read_uint32_from_proto_index,
   read_off_t_from_proto_index): Convenience wrappers around the previous.

  (write_entry_to_proto_index): Rename to ...
  (write_l2p_entry_to_proto_index): ... this for clarity.  Use the new
                                    function to write proto index file.
  (read_l2p_entry_from_proto_index): New function to read the proto index.
  (svn_fs_fs__l2p_proto_index_add_revision,
   svn_fs_fs__l2p_proto_index_add_entry): Update callers after rename.
  (svn_fs_fs__l2p_index_append,
   l2p_proto_index_lookup): Call the new reader function instead of reading
                            a plain blob directly from the proto index.

  (svn_fs_fs__p2l_proto_index_add_entry): Write all struct elements
                                          individually, check before
                                          converting them to uint64.
  (read_p2l_entry_from_proto_index): New inverse function to the above.
  (svn_fs_fs__p2l_proto_index_next_offset): Call the new reader function
                                        and update the offset calculation.
  (svn_fs_fs__p2l_index_append): Call the new reader function.

* subversion/libsvn_fs_fs/structure-indexes
  (Design): Add a section about the general proto-index formatting.
  (Log-to-phys index,
   Phys-to-log index): Update the proto-index file format descriptions.

Modified:
    subversion/trunk/subversion/libsvn_fs_fs/index.c
    subversion/trunk/subversion/libsvn_fs_fs/structure-indexes

Modified: subversion/trunk/subversion/libsvn_fs_fs/index.c
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/index.c?rev=1628093&r1=1628092&r2=1628093&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_fs_fs/index.c (original)
+++ subversion/trunk/subversion/libsvn_fs_fs/index.c Sun Sep 28 17:56:01 2014
@@ -43,6 +43,22 @@
 /* maximum length of a uint64 in an 7/8b encoding */
 #define ENCODED_INT_LENGTH 10
 
+/* APR is missing an APR_OFF_T_MAX.  So, define one.  We will use it to
+ * limit file offsets stored in the indexes.
+ *
+ * We assume that everything shorter than 64 bits, it is at least 32 bits.
+ * We also assume that the type is always signed meaning we only have an
+ * effective positive range of 63 or 31 bits, respectively.
+ */
+const apr_uint64_t off_t_max = (sizeof(apr_off_t) == sizeof(apr_int64_t))
+                             ? APR_INT64_MAX
+                             : APR_INT32_MAX;
+
+/* We store P2L proto-index entries as 6 values, 64 bits each on disk.
+ * See also svn_fs_fs__p2l_proto_index_add_entry().
+ */
+#define P2L_PROTO_INDEX_ENTRY_SIZE (6 * sizeof(apr_uint64_t))
+
 /* Page tables in the log-to-phys index file exclusively contain entries
  * of this type to describe position and size of a given page.
  */
@@ -447,22 +463,165 @@ decode_int(apr_uint64_t value)
   return (apr_int64_t)(value % 2 ? -1 - value / 2 : value / 2);
 }
 
+/* Write VALUE to the PROTO_INDEX file, using SCRATCH_POOL for temporary
+ * allocations.
+ *
+ * The point of this function is to ensure an architecture-independent
+ * proto-index file format.  All data is written as unsigned 64 bits ints
+ * in little endian byte order.  64 bits is the largest portable integer
+ * we have and unsigned values have well-defined conversions in C.
+ */
+static svn_error_t *
+write_uint64_to_proto_index(apr_file_t *proto_index,
+                            apr_uint64_t value,
+                            apr_pool_t *scratch_pool)
+{
+  apr_byte_t buffer[sizeof(value)];
+  int i;
+  apr_size_t written;
+
+  /* Split VALUE into 8 bytes using LE ordering. */
+  for (i = 0; i < sizeof(buffer); ++i)
+    {
+      /* Unsigned conversions are well-defined ... */
+      buffer[i] = (apr_byte_t)value;
+      value >>= CHAR_BIT;
+    }
+
+  /* Write it all to disk. */
+  SVN_ERR(svn_io_file_write_full(proto_index, buffer, sizeof(buffer),
+                                 &written, scratch_pool));
+  SVN_ERR_ASSERT(written == sizeof(buffer));
+
+  return SVN_NO_ERROR;
+}
+
+/* Read one unsigned 64 bit value from PROTO_INDEX file and return it in
+ * *VALUE_P.  If EOF is NULL, error out when trying to read beyond EOF.
+ * Use SCRATCH_POOL for temporary allocations.
+ *
+ * This function is the inverse to write_uint64_to_proto_index (see there),
+ * reading the external LE byte order and convert it into host byte order.
+ */
+static svn_error_t *
+read_uint64_from_proto_index(apr_file_t *proto_index,
+                             apr_uint64_t *value_p,
+                             svn_boolean_t *eof,
+                             apr_pool_t *scratch_pool)
+{
+  apr_byte_t buffer[sizeof(*value_p)];
+  apr_size_t read;
+
+  /* Read the full 8 bytes or our 64 bit value, unless we hit EOF.
+   * Assert that we never read partial values. */
+  SVN_ERR(svn_io_file_read_full2(proto_index, buffer, sizeof(buffer),
+                                 &read, eof, scratch_pool));
+  SVN_ERR_ASSERT((eof && *eof) || read == sizeof(buffer));
+
+  /* If we did not hit EOF, reconstruct the uint64 value and return it. */
+  if (!eof || !*eof)
+    {
+      int i;
+      apr_uint64_t value;
+
+      /* This could only overflow if CHAR_BIT had a value that is not
+       * a divisor of 64. */
+      value = 0;
+      for (i = sizeof(buffer) - 1; i >= 0; --i)
+        value = (value << CHAR_BIT) + buffer[i];
+
+      *value_p = value;
+    }
+
+  return SVN_NO_ERROR;
+}
+
+/* Convenience function similar to read_uint64_from_proto_index, but returns
+ * an uint32 value in VALUE_P.  Return an error if the value does not fit.
+ */
+static svn_error_t *
+read_uint32_from_proto_index(apr_file_t *proto_index,
+                             apr_uint32_t *value_p,
+                             svn_boolean_t *eof,
+                             apr_pool_t *scratch_pool)
+{
+  apr_uint64_t value;
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &value, eof,
+                                       scratch_pool));
+  if (value > APR_UINT32_MAX)
+    return svn_error_createf(SVN_ERR_FS_ITEM_INDEX_OVERFLOW, NULL,
+                             _("UINT32 0x%" APR_UINT64_T_HEX_FMT
+                               " too large, max = 0x%" APR_UINT64_T_HEX_FMT),
+                             value, (apr_uint64_t)APR_UINT32_MAX);
+
+  /* This conversion is not lossy because the value can be represented in
+   * the target type. */
+  *value_p = (apr_uint32_t)value;
+
+  return SVN_NO_ERROR;
+}
+
+/* Convenience function similar to read_uint64_from_proto_index, but returns
+ * an off_t value in VALUE_P.  Return an error if the value does not fit.
+ */
+static svn_error_t *
+read_off_t_from_proto_index(apr_file_t *proto_index,
+                            apr_off_t *value_p,
+                            svn_boolean_t *eof,
+                            apr_pool_t *scratch_pool)
+{
+  apr_uint64_t value;
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &value, eof,
+                                       scratch_pool));
+  if (value > off_t_max)
+    return svn_error_createf(SVN_ERR_FS_ITEM_INDEX_OVERFLOW, NULL,
+                             _("File offset 0x%" APR_UINT64_T_HEX_FMT
+                               " too large, max = 0x%" APR_UINT64_T_HEX_FMT),
+                             value, off_t_max);
+
+  /* Shortening conversion from unsigned to signed int is well-defined and
+   * not lossy in C because the value can be represented in the target type.
+   */
+  *value_p = (apr_off_t)value;
+
+  return SVN_NO_ERROR;
+}
+
 /*
  * log-to-phys index
  */
 
-/* Write ENTRY to log-to-phys PROTO_INDEX file and verify the results.
+/* Append ENTRY to log-to-phys PROTO_INDEX file.
  * Use SCRATCH_POOL for temporary allocations.
  */
 static svn_error_t *
-write_entry_to_proto_index(apr_file_t *proto_index,
-                           l2p_proto_entry_t entry,
-                           apr_pool_t *scratch_pool)
+write_l2p_entry_to_proto_index(apr_file_t *proto_index,
+                               l2p_proto_entry_t entry,
+                               apr_pool_t *scratch_pool)
 {
-  apr_size_t written = sizeof(entry);
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry.offset,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry.item_index,
+                                      scratch_pool));
 
-  SVN_ERR(svn_io_file_write(proto_index, &entry, &written, scratch_pool));
-  SVN_ERR_ASSERT(written == sizeof(entry));
+  return SVN_NO_ERROR;
+}
+
+/* Read *ENTRY from log-to-phys PROTO_INDEX file and indicate end-of-file
+ * in *EOF, or error out in that case if EOF is NULL.  *ENTRY is in an
+ * undefined state if an end-of-file occurred.
+ * Use SCRATCH_POOL for temporary allocations.
+ */
+static svn_error_t *
+read_l2p_entry_from_proto_index(apr_file_t *proto_index,
+                                l2p_proto_entry_t *entry,
+                                svn_boolean_t *eof,
+                                apr_pool_t *scratch_pool)
+{
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &entry->offset, eof,
+                                       scratch_pool));
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &entry->item_index, eof,
+                                       scratch_pool));
 
   return SVN_NO_ERROR;
 }
@@ -516,8 +675,8 @@ svn_fs_fs__l2p_proto_index_add_revision(
   entry.offset = 0;
   entry.item_index = 0;
 
-  return svn_error_trace(write_entry_to_proto_index(proto_index, entry,
-                                                    scratch_pool));
+  return svn_error_trace(write_l2p_entry_to_proto_index(proto_index, entry,
+                                                        scratch_pool));
 }
 
 svn_error_t *
@@ -539,8 +698,8 @@ svn_fs_fs__l2p_proto_index_add_entry(apr
   SVN_ERR_ASSERT(item_index < UINT_MAX / 2);
   entry.item_index = item_index;
 
-  return svn_error_trace(write_entry_to_proto_index(proto_index, entry,
-                                                    scratch_pool));
+  return svn_error_trace(write_l2p_entry_to_proto_index(proto_index, entry,
+                                                        scratch_pool));
 }
 
 svn_error_t *
@@ -597,13 +756,10 @@ svn_fs_fs__l2p_index_append(svn_fs_t *fs
   for (entry = 0; !eof; ++entry)
     {
       l2p_proto_entry_t proto_entry;
-      apr_size_t read = 0;
 
       /* (attempt to) read the next entry from the source */
-      SVN_ERR(svn_io_file_read_full2(proto_index,
-                                     &proto_entry, sizeof(proto_entry),
-                                     &read, &eof, local_pool));
-      SVN_ERR_ASSERT(eof || read == sizeof(proto_entry));
+      SVN_ERR(read_l2p_entry_from_proto_index(proto_index, &proto_entry,
+                                              &eof, local_pool));
 
       /* handle new revision */
       if ((entry > 0 && proto_entry.offset == 0) || eof)
@@ -1420,12 +1576,10 @@ l2p_proto_index_lookup(apr_off_t *offset
   while (!eof)
     {
       l2p_proto_entry_t entry;
-      apr_size_t read = 0;
 
       /* (attempt to) read the next entry from the source */
-      SVN_ERR(svn_io_file_read_full2(file, &entry, sizeof(entry),
-                                     &read, &eof, scratch_pool));
-      SVN_ERR_ASSERT(eof || read == sizeof(entry));
+      SVN_ERR(read_l2p_entry_from_proto_index(file, &entry, &eof,
+                                              scratch_pool));
 
       /* handle new revision */
       if (!eof && entry.item_index == item_index)
@@ -1603,11 +1757,91 @@ svn_fs_fs__p2l_proto_index_add_entry(apr
                                      svn_fs_fs__p2l_entry_t *entry,
                                      apr_pool_t *scratch_pool)
 {
-  apr_size_t written = sizeof(*entry);
+  apr_uint64_t revision;
 
-  SVN_ERR(svn_io_file_write_full(proto_index, entry, sizeof(*entry),
-                                 &written, scratch_pool));
-  SVN_ERR_ASSERT(written == sizeof(*entry));
+  /* Make sure all signed elements of ENTRY have non-negative values.
+   *
+   * For file offsets and sizes, this is a given as we use them to describe
+   * absolute positions and sizes.  For revisions, SVN_INVALID_REVNUM is
+   * valid, hence we have to shift it by 1.
+   */
+  SVN_ERR_ASSERT(entry->offset >= 0);
+  SVN_ERR_ASSERT(entry->size >= 0);
+  SVN_ERR_ASSERT(   entry->item.revision >= 0
+                 || entry->item.revision == SVN_INVALID_REVNUM);
+
+  revision = entry->item.revision == SVN_INVALID_REVNUM
+           ? 0
+           : ((apr_uint64_t)entry->item.revision + 1);
+
+  /* Now, all values will nicely convert to uint64. */
+  /* Make sure to keep P2L_PROTO_INDEX_ENTRY_SIZE consistent with this: */
+
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry->offset,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry->size,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry->type,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry->fnv1_checksum,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, revision,
+                                      scratch_pool));
+  SVN_ERR(write_uint64_to_proto_index(proto_index, entry->item.number,
+                                      scratch_pool));
+
+  return SVN_NO_ERROR;
+}
+
+/* Read *ENTRY from log-to-phys PROTO_INDEX file and indicate end-of-file
+ * in *EOF, or error out in that case if EOF is NULL.  *ENTRY is in an
+ * undefined state if an end-of-file occurred.
+ * Use SCRATCH_POOL for temporary allocations.
+ */
+static svn_error_t *
+read_p2l_entry_from_proto_index(apr_file_t *proto_index,
+                                svn_fs_fs__p2l_entry_t *entry,
+                                svn_boolean_t *eof,
+                                apr_pool_t *scratch_pool)
+{
+  apr_uint64_t revision;
+
+  SVN_ERR(read_off_t_from_proto_index(proto_index, &entry->offset,
+                                      eof, scratch_pool));
+  SVN_ERR(read_off_t_from_proto_index(proto_index, &entry->size,
+                                      eof, scratch_pool));
+  SVN_ERR(read_uint32_from_proto_index(proto_index, &entry->type,
+                                       eof, scratch_pool));
+  SVN_ERR(read_uint32_from_proto_index(proto_index, &entry->fnv1_checksum,
+                                       eof, scratch_pool));
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &revision,
+                                       eof, scratch_pool));
+  SVN_ERR(read_uint64_from_proto_index(proto_index, &entry->item.number,
+                                       eof, scratch_pool));
+
+  /* Do the inverse REVSION number conversion (see
+   * svn_fs_fs__p2l_proto_index_add_entry), if we actually read a complete
+   * record.
+   */
+  if (!eof || !*eof)
+    {
+      /* Be careful with the arithmetics here (overflows and wrap-around): */
+      if (revision > 0 && revision - 1 > LONG_MAX)
+        return svn_error_createf(SVN_ERR_FS_ITEM_INDEX_OVERFLOW, NULL,
+                                _("Revision 0x%" APR_UINT64_T_HEX_FMT
+                                " too large, max = 0x%" APR_UINT64_T_HEX_FMT),
+                                revision, (apr_uint64_t)LONG_MAX);
+
+      /* Shortening conversion from unsigned to signed int is well-defined
+       * and not lossy in C because the value can be represented in the
+       * target type.  Also, cast to 'long' instead of 'svn_revnum_t' here
+       * to provoke a compiler warning if those types should differ and we
+       * would need to change the overflow checking logic.
+       */
+      entry->item.revision = revision == 0
+                           ? SVN_INVALID_REVNUM
+                           : (long)(revision - 1);
+    }
 
   return SVN_NO_ERROR;
 }
@@ -1629,11 +1863,11 @@ svn_fs_fs__p2l_proto_index_next_offset(a
     {
       /* At least one entry.  Read last entry. */
       svn_fs_fs__p2l_entry_t entry;
-      offset -= sizeof(entry);
+      offset -= P2L_PROTO_INDEX_ENTRY_SIZE;
 
       SVN_ERR(svn_io_file_seek(proto_index, APR_SET, &offset, scratch_pool));
-      SVN_ERR(svn_io_file_read_full2(proto_index, &entry, sizeof(entry),
-                                    NULL, NULL, scratch_pool));
+      SVN_ERR(read_p2l_entry_from_proto_index(proto_index, &entry,
+                                              NULL, scratch_pool));
 
       /* Return next offset. */
       *next_offset = entry.offset + entry.size;
@@ -1686,7 +1920,6 @@ svn_fs_fs__p2l_index_append(svn_fs_t *fs
   while (!eof)
     {
       svn_fs_fs__p2l_entry_t entry;
-      apr_size_t read = 0;
       apr_uint64_t entry_end;
       svn_boolean_t new_page = svn_spillbuf__get_size(buffer) == 0;
       apr_uint64_t compound;
@@ -1695,9 +1928,8 @@ svn_fs_fs__p2l_index_append(svn_fs_t *fs
       svn_pool_clear(iterpool);
 
       /* (attempt to) read the next entry from the source */
-      SVN_ERR(svn_io_file_read_full2(proto_index, &entry, sizeof(entry),
-                                     &read, &eof, iterpool));
-      SVN_ERR_ASSERT(eof || read == sizeof(entry));
+      SVN_ERR(read_p2l_entry_from_proto_index(proto_index, &entry,
+                                              &eof, iterpool));
 
       /* "unused" (and usually non-existent) section to cover the offsets
          at the end the of the last page. */

Modified: subversion/trunk/subversion/libsvn_fs_fs/structure-indexes
URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_fs_fs/structure-indexes?rev=1628093&r1=1628092&r2=1628093&view=diff
==============================================================================
--- subversion/trunk/subversion/libsvn_fs_fs/structure-indexes (original)
+++ subversion/trunk/subversion/libsvn_fs_fs/structure-indexes Sun Sep 28 17:56:01 2014
@@ -80,6 +80,14 @@ Most data is unsigned by nature but will
 signed integers.
 
 
+Encoding in proto-index files
+-----------------------------
+
+These have a much simpler encoding.  Throughout the files, all records have
+the same length (but different between L2P and P2L).  All records contain
+unsigned 64 bit integers only, stored in little endian byte order.
+
+
 Log-to-phys index
 =================
 
@@ -190,7 +198,7 @@ at the beginning of the file is optional
   ...
   <eof>         /* end of file. */
 
-All entries are pairs of 64 bit unsigned integers in machine endianess.
+All entries are pairs of 64 bit unsigned integers in little endian order.
 
 
 Phys-to-log index
@@ -293,7 +301,18 @@ Proto index file format
 -----------------------
 
 The index will be created from a proto index file containing simple
-instances of the in-memory representation of svn_fs_fs__p2l_entry_t.
+instances of svn_fs_fs__p2l_entry_t with the following element order:
+
+  item offset               as uint64
+  item size                 as uint64
+  item type                 as uint64
+  modified FNV1a checksum   as uint64
+  revision                  as uint64, with SVN_INVALID_REVNUM mapped to 0
+                                       and revisions >= 0 stored as rev+1
+  item index                as uint64
+
+All values are stored in little endian order.
+
 Page table and header information, except start revision and page size,
 can easily be derived from that information.
 



Mime
View raw message