subversion-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Shahaf <danie...@elego.de>
Subject Re: svn commit: r1241718 - in /subversion/trunk/subversion/libsvn_fs_fs: caching.c fs.h fs_fs.c
Date Sun, 12 Feb 2012 02:42:08 GMT
Stefan Fuhrmann wrote on Sun, Feb 12, 2012 at 03:06:31 +0100:
> On 09.02.2012 16:05, Daniel Shahaf wrote:
> >stefan2@apache.org wrote on Wed, Feb 08, 2012 at 00:44:26 -0000:
> >>Author: stefan2
> >>Date: Wed Feb  8 00:44:26 2012
> >>New Revision: 1241718
> >>
> >>URL: http://svn.apache.org/viewvc?rev=1241718&view=rev
> >>Log:
> >>Major improvement in delta window handling: Cache intermediate
> >>combined delta windows such that changes "close by" won't need
> >>to discover and read the whole chain again.
> >>
> >>For algorithms that traverse history linearly, this optimization
> >>gives delta combination an amortized constant runtime.
> >>
> >>For now, we only cache representations<  100kB. Support for larger
> >>reps can be added later.
> >>
> >>* subversion/libsvn_fs_fs/fs.h
> >>   (fs_fs_data_t): add cache for combined windows
> >>* subversion/libsvn_fs_fs/caching.c
> >>   (svn_fs_fs__initialize_caches): initialize that cache
> >>
> >>* subversion/libsvn_fs_fs/fs_fs.c
> >>   (rep_state): add reference to new cache
> >>   (create_rep_state_body): init that reference
> >>   (rep_read_baton): add reference to cached base window
> >>   (get_cached_combined_window, set_cached_combined_window):
> >>    new utility functions
> >>   (build_rep_list): terminate delta chain early if cached
> >>    base window is available
> >>   (rep_read_get_baton): adapt caller
> >>   (get_combined_window): re-implement
> >>   (get_contents): handle new special case; adapt to
> >>    get_combined_window() signature changes
> >>
> >>Modified:
> >>     subversion/trunk/subversion/libsvn_fs_fs/caching.c
> >>     subversion/trunk/subversion/libsvn_fs_fs/fs.h
> >>     subversion/trunk/subversion/libsvn_fs_fs/fs_fs.c
> >>
> >I haven't reviewed this, but a question:
> >
> >>+/* Read the WINDOW_P for the rep state RS from the current FSFS session's
> >>+ * cache. This will be a no-op and IS_CACHED will be set to FALSE if no
> >>+ * cache has been given. If a cache is available IS_CACHED will inform
> >>+ * the caller about the success of the lookup. Allocations (of the window
> >>+ * in particualar) will be made from POOL.
> >>+ */
> >>+static svn_error_t *
> >>+get_cached_combined_window(svn_stringbuf_t **window_p,
> >>+                           struct rep_state *rs,
> >>+                           svn_boolean_t *is_cached,
> >>+                           apr_pool_t *pool)
> >>+{
> >>+  if (! rs->combined_cache)
> >>+    {
> >>+      /* txdelta window has not been enabled */
> >>+      *is_cached = FALSE;
> >>+    }
> >>+  else
> >>+    {
> >>+      /* ask the cache for the desired txdelta window */
> >>+      return svn_cache__get((void **)window_p,
> >>+                            is_cached,
> >>+                            rs->combined_cache,
> >>+                            get_window_key(rs, rs->start, pool),
> >How does the cache key identify the particular combined window being
> >cached?
> 
> Undeltified windows use the same key as their deltified
> representation; basically revision file + offset. The distinction
> between deltified and un-deltified is made by the cache
> instance prefix.

What revision file and what offset, and how do they related to the
window object contained in the cache?

(I'm going to guess that the key is the rev-file/offset of a rep that
generates the same fulltext as the cached window; but I shouldn't have
to guess.)

> >get_window_key() may return "".  If it returns "" when called as an
> >argument to svn_cache__set() and then also here, won't the cache return
> >a wrong result?
> >
> There is a comment in get_window_key() for this case.
> "" will only be returned if we can't get the name of the
> open APR file. This is virtually impossible. If it happens
> anyways, it will hit the deltified window cache first, we will
> report a repository corruption.
> 

In plain English: there is an unlikely, but not impossible, scenario
where the only thing between your new code and silent corruption
(specifically: incorrect retrieval of a fulltext) is the order of
lookups in two different caches.

That sounds awfully brittle to me, and the sensitivity of the lookup
order is not documented anywhere.

> But maybe we should change the cache API definition
> to support and reject NULL keys. get_window_key() could
> then return simply NULL and could do with fewer assumptions.
> 

What does "support and reject" mean?  That trying to get(key=NULL)
always returns "not found" and trying to set(key=NULL) doesn't change
the cache's state?

> -- Stefan^2.
> 

Mime
View raw message