Return-Path: Delivered-To: apmail-subversion-dev-archive@minotaur.apache.org Received: (qmail 8998 invoked from network); 7 Jan 2011 14:51:17 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 7 Jan 2011 14:51:17 -0000 Received: (qmail 14186 invoked by uid 500); 7 Jan 2011 14:51:16 -0000 Delivered-To: apmail-subversion-dev-archive@subversion.apache.org Received: (qmail 14103 invoked by uid 500); 7 Jan 2011 14:51:16 -0000 Mailing-List: contact dev-help@subversion.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list dev@subversion.apache.org Received: (qmail 14095 invoked by uid 99); 7 Jan 2011 14:51:16 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 07 Jan 2011 14:51:16 +0000 X-ASF-Spam-Status: No, hits=-0.0 required=10.0 tests=RCVD_IN_DNSWL_LOW,SPF_NEUTRAL X-Spam-Check-By: apache.org Received-SPF: neutral (athena.apache.org: local policy) Received: from [74.125.82.47] (HELO mail-ww0-f47.google.com) (74.125.82.47) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 07 Jan 2011 14:51:08 +0000 Received: by wwb39 with SMTP id 39so20474988wwb.16 for ; Fri, 07 Jan 2011 06:50:44 -0800 (PST) Received: by 10.227.143.79 with SMTP id t15mr5165585wbu.95.1294411844793; Fri, 07 Jan 2011 06:50:44 -0800 (PST) Received: from [192.168.1.3] (120.20.169.217.in-addr.arpa [217.169.20.120]) by mx.google.com with ESMTPS id t5sm12456298wes.33.2011.01.07.06.50.43 (version=SSLv3 cipher=RC4-MD5); Fri, 07 Jan 2011 06:50:44 -0800 (PST) Subject: [PATCH in progress] Ref-counting for pristine texts From: Julian Foad To: dev@subversion.apache.org Content-Type: multipart/mixed; boundary="=-NvzB4fyMg6xuWYSU/Sw7" Date: Fri, 07 Jan 2011 14:50:42 +0000 Message-ID: <1294411842.18270.644.camel@edith> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 --=-NvzB4fyMg6xuWYSU/Sw7 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit I'm implementing ref-counting for pristine texts, with the aim of deleting any text whose ref-count reaches zero. The current situation without this work is that many pristine texts are not deleted when they become unreferenced, and they accumulate in the pristine store until the user runs "svn cleanup". I think that is not good enough even for an initial release. Having discarded other approaches, my plan now is to: * Maintain the ref count at the granularity of SQL statements: each time we add or remove a reference in the DB (which is usually in the NODES table), update the corresponding ref count in the PRISTINE table. This could be changed later to once per SQLite transaction if we find cases where that would make a significant saving. * Act on the ref count (deleting texts whose count is zero) at some higher level: when closing a "wcroot" object or after running the Work Queue or just before returning from a libsvn_wc public API. For the time being I'm doing it when closing a "wcroot" object. The best way I have found to maintain the ref counts is using SQL triggers to update the count when a reference is added or deleted. CREATE TRIGGER nodes_insert_trigger AFTER INSERT ON nodes BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; END; CREATE TRIGGER nodes_delete_trigger AFTER DELETE ON nodes BEGIN UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; CREATE TRIGGER nodes_update_checksum_trigger AFTER UPDATE OF checksum ON nodes BEGIN UPDATE pristine SET refcount = refcount + 1 WHERE checksum = NEW.checksum; UPDATE pristine SET refcount = refcount - 1 WHERE checksum = OLD.checksum; END; (A detail: This correctly handles NULL in the checksum column: such a value won't match any row in the 'pristine' table.) The only case that SQLite doesn't handle automatically is the replacement part of "INSERT OR REPLACE INTO ...": it doesn't fire the "delete" trigger in that case. To overcome this limitation, we must instead use an explicit "DELETE FROM ..." statement followed by an unconditional "INSERT INTO ..." statement. I'm puzzled about why some of our INSERT statements should ever need to overwrite an existing row, but they currently do, and I am marking such places with '###' comments. In order to check whether this is all working, I have inserted in close_wcroot() a call to svn_wc__db_pristine_cleanup_wcroot() and made the latter function count (in debug mode only) the number of actual references versus the recorded refcount of each pristine, and complain about any difference. Using this a few tests in the test suite fail, mostly copy/revert/upgrade tests, so I can see where I have a few more cases to complete. I expect all of these to be cases where I need to put an explicit "DELETE" statement instead of "INSERT OR REPLACE". This checking code prints a report such as the following when there is a mismatch: DBG: wc_db.c:2692: unused/miscounted pristines in '/home/julianfoad/src/subversion-n-gamma/subversion/tests/cmdline/svn-test-work/working_copies/copy_tests-30' DBG: wc_db.c:2697: 56388a031dffbf9df7c32e1f299b1d5d7ef60881: refcount 2 != actual_refs 1 DBG: wc_db.c:2717: pristine ref: 56388a031dffbf9df7c32e1f299b1d5d7ef60881, op_depth 0, 'A/D/G/rho' Attached are two patches, including log messages. The one called "pristine-refcount-triggers-1.patch" implements the trigger approach, as described above. The one called "pristine-refcount-manual-1.patch" is an earlier patch, in which I started adding explicit SQL queries to increment and decrement the ref count whenever we update the NODES table. Both of them include the same checking/debugging/cleanup code. Any thoughts on the direction of the whole thing, or the details? - Julian --=-NvzB4fyMg6xuWYSU/Sw7 Content-Disposition: attachment; filename="pristine-refcount-manual-1.patch" Content-Type: text/x-patch; name="pristine-refcount-manual-1.patch"; charset="UTF-8" Content-Transfer-Encoding: 7bit ### This patch is experimental and incomplete. Implement reference counting for pristine texts. This part of the patch only causes the reference count to be incremented and decremented. * subversion/libsvn_wc/wc_db.c (pristine_refcount_increment, pristine_refcount_decrement, pristine_refcount_decrement_by_nodes_row): New functions. ### pristine_refcount_decrement() is not used. (insert_base_node, insert_incomplete_children, insert_working_node, commit_node, set_new_dir_to_incomplete_txn): Decrement the ref count associated with any pristine text in a row that is about to be overwritten, and increment the ref count associated with the new row afterwards. * subversion/libsvn_wc/wc-queries.sql (STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW, STMT_PRISTINE_REFCOUNT_INCREMENT): New queries. (STMT_INSERT_PRISTINE): Initialize the ref count to 0 instead of 1. When closing a "wcroot" object, this patch also checks the ref counts of all pristine texts, and removes all unreferenced pristine texts. This part of the patch is included only for testing purposes. * subversion/libsvn_wc/wc_db.c (count_pristine_references): New function. (pristine_cleanup_wcroot): Rename to svn_wc__db_pristine_cleanup_wcroot(). Check that the ref count column of each pristine text matches the actual number of references. Show debug messages when deleting unused pristine texts. (svn_wc__db_pristine_cleanup): Adjust for the renaming. * subversion/libsvn_wc/wc_db_pdh.c (close_wcroot): Clean up unreferenced pristines. * subversion/libsvn_wc/wc_db_private.h (svn_wc__db_pristine_cleanup_wcroot): New function. * subversion/libsvn_wc/wc-queries.sql (STMT_SELECT_ALL_PRISTINES, STMT_SELECT_ALL_PRISTINE_REFERENCES): New queries. --This line, and those below, will be ignored-- Index: subversion/libsvn_wc/wc_db.c =================================================================== --- subversion/libsvn_wc/wc_db.c (revision 1056311) +++ subversion/libsvn_wc/wc_db.c (working copy) @@ -816,6 +816,48 @@ retract_parent_delete(svn_sqlite__db_t * } +/* Increment the ref-count of the pristine text identified by SHA1_CHECKSUM. */ +static svn_error_t * +pristine_refcount_increment(svn_sqlite__db_t *sdb, + const svn_checksum_t *sha1_checksum, + apr_pool_t *scratch_pool) +{ + svn_sqlite__stmt_t *stmt; + + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, + STMT_PRISTINE_REFCOUNT_INCREMENT)); + SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool)); + SVN_ERR(svn_sqlite__update(NULL, stmt)); + + return SVN_NO_ERROR; +} + +/* Decrement the ref-count of the pristine text identified by SHA1_CHECKSUM. */ +static svn_error_t * +pristine_refcount_decrement(const svn_checksum_t *sha1_checksum, + apr_pool_t *scratch_pool) +{ + return SVN_NO_ERROR; +} + +/* Decrement the ref-count of the pristine text whose checksum is given in + * the NODES table row indexed by (WC_ID, LOCAL_RELPATH, OP_DEPTH). If + * there is no such row in NODES, or its 'checksum' is null, do nothing. */ +static svn_error_t * +pristine_refcount_decrement_by_nodes_row(svn_sqlite__db_t *sdb, + apr_int64_t wc_id, + const char *local_relpath, + apr_int64_t op_depth) +{ + svn_sqlite__stmt_t *stmt; + + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, + STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW)); + SVN_ERR(svn_sqlite__bindf(stmt, "isi", wc_id, local_relpath, op_depth)); + SVN_ERR(svn_sqlite__update(NULL, stmt)); + + return SVN_NO_ERROR; +} /* */ static svn_error_t * @@ -835,6 +899,10 @@ insert_base_node(void *baton, svn_sqlite /* ### we can't handle this right now */ SVN_ERR_ASSERT(pibb->conflict == NULL); + SVN_ERR(pristine_refcount_decrement_by_nodes_row( + sdb, pibb->wc_id, pibb->local_relpath, + (apr_int64_t)0 /* op_depth for base */)); + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE)); SVN_ERR(svn_sqlite__bindf(stmt, "isisisr" "tstr" /* 8 - 11 */ @@ -856,11 +930,18 @@ insert_base_node(void *baton, svn_sqlite (pibb->kind == svn_wc__db_kind_symlink) ? pibb->target : NULL)); /* 19 */ - if (pibb->kind == svn_wc__db_kind_file) { - SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool)); - if (pibb->translated_size != SVN_INVALID_FILESIZE) - SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size)); - } + if (pibb->kind == svn_wc__db_kind_file) + { + if (pibb->checksum) + { + SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, + scratch_pool)); + SVN_ERR(pristine_refcount_increment(sdb, pibb->checksum, + scratch_pool)); + } + if (pibb->translated_size != SVN_INVALID_FILESIZE) + SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size)); + } SVN_ERR(svn_sqlite__bind_properties(stmt, 15, pibb->props, scratch_pool)); @@ -973,6 +1064,9 @@ insert_working_node(void *baton, SVN_ERR_ASSERT(*piwb->local_relpath != '\0'); parent_relpath = svn_relpath_dirname(piwb->local_relpath, scratch_pool); + SVN_ERR(pristine_refcount_decrement_by_nodes_row( + sdb, piwb->wc_id, piwb->local_relpath, piwb->op_depth)); + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE)); SVN_ERR(svn_sqlite__bindf(stmt, "isisnnntstrisn" "nnnn" /* properties translated_size last_mod_time dav_cache */ @@ -993,8 +1093,13 @@ insert_working_node(void *baton, if (piwb->kind == svn_wc__db_kind_file) { - SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, piwb->checksum, - scratch_pool)); + if (piwb->checksum) + { + SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, piwb->checksum, + scratch_pool)); + SVN_ERR(pristine_refcount_increment(sdb, piwb->checksum, + scratch_pool)); + } } else if (piwb->kind == svn_wc__db_kind_symlink) { @@ -2584,12 +2689,105 @@ svn_wc__db_pristine_remove(svn_wc__db_t } +#ifdef SVN_DEBUG +/* Set *ACTUAL_REFS to the number of references to SHA1_CHECKSUM. */ static svn_error_t * -pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, - apr_pool_t *scratch_pool) +count_pristine_references(int *actual_refs, + svn_sqlite__db_t *sdb, + const svn_checksum_t *sha1_checksum, + apr_pool_t *scratch_pool) { svn_sqlite__stmt_t *stmt; + *actual_refs = 0; + + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, + STMT_SELECT_ALL_PRISTINE_REFERENCES)); + SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool)); + while (1) + { + svn_boolean_t have_row; + + SVN_ERR(svn_sqlite__step(&have_row, stmt)); + if (! have_row) + break; + + (*actual_refs)++; + } + SVN_ERR(svn_sqlite__reset(stmt)); + return SVN_NO_ERROR; +} +#endif + +svn_error_t * +svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, + apr_pool_t *scratch_pool) +{ + svn_sqlite__stmt_t *stmt; + svn_boolean_t done_one = FALSE; + +#ifdef SVN_DEBUG + /* For each pristine text: check its refcount matches its actual references. */ + SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb, + STMT_SELECT_ALL_PRISTINES)); + while (1) + { + svn_boolean_t have_row; + const svn_checksum_t *sha1_checksum; + int refcount, actual_refs; + svn_sqlite__stmt_t *stmt2; + + SVN_ERR(svn_sqlite__step(&have_row, stmt)); + if (! have_row) + break; + + SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0, + scratch_pool)); + refcount = svn_sqlite__column_int(stmt, 1); + + /* Count the actual references to this checksum. */ + SVN_ERR(count_pristine_references(&actual_refs, wcroot->sdb, + sha1_checksum, scratch_pool)); + + if (refcount == actual_refs) + continue; + + if (! done_one) + { + SVN_DBG(("unused/miscounted pristines in '%s'\n", wcroot->abspath)); + done_one = TRUE; + } + SVN_DBG(("%s: refcount %d != actual_refs %d\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool), + refcount, actual_refs)); + + /* Show the actual references to this checksum. */ + SVN_ERR(svn_sqlite__get_statement(&stmt2, wcroot->sdb, + STMT_SELECT_ALL_PRISTINE_REFERENCES)); + SVN_ERR(svn_sqlite__bind_checksum(stmt2, 1, sha1_checksum, scratch_pool)); + while (1) + { + svn_boolean_t have_row2; + const char *local_relpath; + int op_depth; + + SVN_ERR(svn_sqlite__step(&have_row2, stmt2)); + if (! have_row2) + break; + + local_relpath = svn_sqlite__column_text(stmt2, 1, scratch_pool); + op_depth = svn_sqlite__column_int(stmt2, 2); + SVN_DBG(("pristine ref: %s, op_depth %d, '%s'\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool), + op_depth, local_relpath)); + } + SVN_ERR(svn_sqlite__reset(stmt2)); + + SVN_ERR_ASSERT(!"pristine refcount mismatch"); + } + SVN_ERR(svn_sqlite__reset(stmt)); +#endif + /* Find each unreferenced pristine in the DB and remove it. */ SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb, STMT_SELECT_UNREFERENCED_PRISTINES)); @@ -2604,6 +2802,13 @@ pristine_cleanup_wcroot(svn_wc__db_wcroo SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0, scratch_pool)); + if (! done_one) + { + SVN_DBG(("removing unused pristines in '%s'\n", wcroot->abspath)); + done_one = TRUE; + } + SVN_DBG(("removing unused pristine %s\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool))); SVN_ERR(pristine_remove(wcroot, sha1_checksum, scratch_pool)); } SVN_ERR(svn_sqlite__reset(stmt)); @@ -2627,7 +2832,7 @@ svn_wc__db_pristine_cleanup(svn_wc__db_t scratch_pool, scratch_pool)); VERIFY_USABLE_PDH(pdh); - SVN_ERR(pristine_cleanup_wcroot(pdh->wcroot, scratch_pool)); + SVN_ERR(svn_wc__db_pristine_cleanup_wcroot(pdh->wcroot, scratch_pool)); return SVN_NO_ERROR; } @@ -6007,6 +6212,10 @@ commit_node(void *baton, svn_sqlite__db_ /* ### other presences? or reserve that for separate functions? */ new_presence = svn_wc__db_status_normal; + SVN_ERR(pristine_refcount_decrement_by_nodes_row( + cb->pdh->wcroot->sdb, cb->pdh->wcroot->wc_id, cb->local_relpath, + (apr_int64_t)0 /* op_depth for base */)); + SVN_ERR(svn_sqlite__get_statement(&stmt, cb->pdh->wcroot->sdb, STMT_APPLY_CHANGES_TO_BASE_NODE)); /* symlink_target not yet used */ @@ -6026,6 +6243,9 @@ commit_node(void *baton, svn_sqlite__db_ SVN_ERR(svn_sqlite__bind_checksum(stmt, 13, cb->new_checksum, scratch_pool)); + if (cb->new_checksum) + SVN_ERR(pristine_refcount_increment(cb->pdh->wcroot->sdb, cb->new_checksum, + scratch_pool)); SVN_ERR(svn_sqlite__bind_properties(stmt, 15, cb->new_dav_cache, scratch_pool)); Index: subversion/libsvn_wc/wc_db_pdh.c =================================================================== --- subversion/libsvn_wc/wc_db_pdh.c (revision 1056311) +++ subversion/libsvn_wc/wc_db_pdh.c (working copy) @@ -26,6 +26,7 @@ #include #include "svn_dirent_uri.h" +#include "svn_pools.h" #include "wc.h" #include "adm_files.h" @@ -114,6 +115,14 @@ close_wcroot(void *data) SVN_ERR_ASSERT_NO_RETURN(wcroot->sdb != NULL); +#ifdef SVN_WC__PRISTINE_CLEANUP + { + apr_pool_t *scratch_pool = svn_pool_create(NULL); + svn_error_clear(svn_wc__db_pristine_cleanup_wcroot(wcroot, scratch_pool)); + svn_pool_destroy(scratch_pool); + } +#endif + err = svn_sqlite__close(wcroot->sdb); wcroot->sdb = NULL; if (err) Index: subversion/libsvn_wc/wc_db_private.h =================================================================== --- subversion/libsvn_wc/wc_db_private.h (revision 1056311) +++ subversion/libsvn_wc/wc_db_private.h (working copy) @@ -194,5 +194,10 @@ svn_wc__db_util_open_db(svn_sqlite__db_t apr_pool_t *result_pool, apr_pool_t *scratch_pool); +/* Remove all unreferenced pristines in WCROOT. */ +svn_error_t * +svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, + apr_pool_t *scratch_pool); + #endif /* WC_DB_PDH_H */ Index: subversion/libsvn_wc/wc-queries.sql =================================================================== --- subversion/libsvn_wc/wc-queries.sql (revision 1056311) +++ subversion/libsvn_wc/wc-queries.sql (working copy) @@ -132,4 +132,16 @@ SELECT id FROM repository WHERE root = ? -- STMT_INSERT_REPOSITORY INSERT INTO repository (root, uuid) VALUES (?1, ?2); +-- STMT_PRISTINE_REFCOUNT_DECREMENT_BY_NODES_ROW +/* Decrement the ref-count of the pristine text whose checksum is given in + * the NODES table row indexed by (wc_id, local_relpath, op_depth). If + * there is no such row in NODES, or its 'checksum' is null, do nothing. */ +UPDATE pristine SET refcount = refcount - 1 +WHERE checksum = (SELECT checksum FROM nodes + WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = ?3); + +-- STMT_PRISTINE_REFCOUNT_INCREMENT +UPDATE pristine SET refcount = refcount + 1 +WHERE checksum = ?1; + -- STMT_INSERT_NODE @@ -411,7 +427,24 @@ DELETE FROM work_queue WHERE id = ?1; -- STMT_INSERT_PRISTINE INSERT OR IGNORE INTO pristine (checksum, md5_checksum, size, refcount) -VALUES (?1, ?2, ?3, 1); +VALUES (?1, ?2, ?3, 0); + +-- STMT_SELECT_ALL_PRISTINES +SELECT checksum, refcount +FROM pristine; + +-- STMT_SELECT_ALL_PRISTINE_REFERENCES +SELECT checksum, local_relpath, op_depth FROM nodes + WHERE checksum = ?1 /*OR checksum = ?2*/ +UNION ALL +SELECT older_checksum, local_relpath, -1 FROM actual_node + WHERE older_checksum = ?1 /*OR older_checksum = ?2*/ +UNION ALL +SELECT left_checksum, local_relpath, -2 FROM actual_node + WHERE left_checksum = ?1 /*OR left_checksum = ?2*/ +UNION ALL +SELECT right_checksum, local_relpath, -3 FROM actual_node + WHERE right_checksum = ?1 /*OR right_checksum = ?2*/ -- STMT_SELECT_PRISTINE_MD5_CHECKSUM SELECT md5_checksum --=-NvzB4fyMg6xuWYSU/Sw7 Content-Disposition: attachment; filename="pristine-refcount-triggers-1.patch" Content-Type: text/x-patch; name="pristine-refcount-triggers-1.patch"; charset="UTF-8" Content-Transfer-Encoding: 7bit ### This patch is experimental and incomplete. Implement reference counting for pristine texts, using SQLite triggers. This part of the patch only causes the reference count to be incremented and decremented. * subversion/libsvn_wc/wc_db.c (delete_nodes_row): New function. (pristine_refcount_increment, pristine_refcount_decrement, pristine_refcount_decrement_by_nodes_row): New functions, inactive. For experimenting with non-automatic reference counting, as an alternative to the SQLite triggers method that is currently active in this patch. (insert_base_node, insert_incomplete_children, insert_working_node, commit_node, set_new_dir_to_incomplete_txn): Delete any existing row, thus decrementing the pristine refcount if applicable, before inserting the new one. * subversion/libsvn_wc/wc-metadata.sql (nodes_insert_trigger, nodes_delete_trigger, nodes_update_checksum_trigger): New triggers. * subversion/libsvn_wc/wc-queries.sql (STMT_INSERT_PRISTINE): Initialize the ref count to 0 instead of 1. (STMT_DELETE_NODES_ROW): New query. (STMT_INSERT_NODE, STMT_APPLY_CHANGES_TO_BASE_NODE): Use just "INSERT" instead of "INSERT OR REPLACE". When closing a "wcroot" object, this patch also checks the ref counts of all pristine texts, and removes all unreferenced pristine texts. This part of the patch is included only for testing purposes. * subversion/libsvn_wc/wc_db.c (count_pristine_references): New function. (pristine_cleanup_wcroot): Rename to svn_wc__db_pristine_cleanup_wcroot(). Check that the ref count column of each pristine text matches the actual number of references. Show debug messages when deleting unused pristine texts. (svn_wc__db_pristine_cleanup): Adjust for the renaming. * subversion/libsvn_wc/wc_db_pdh.c (close_wcroot): Clean up unreferenced pristines. * subversion/libsvn_wc/wc_db_private.h (svn_wc__db_pristine_cleanup_wcroot): New function. * subversion/libsvn_wc/wc-queries.sql (STMT_SELECT_ALL_PRISTINES, STMT_SELECT_ALL_PRISTINE_REFERENCES): New queries. --This line, and those below, will be ignored-- Index: subversion/libsvn_wc/wc_db.c =================================================================== --- subversion/libsvn_wc/wc_db.c (revision 1056311) +++ subversion/libsvn_wc/wc_db.c (working copy) @@ -816,6 +816,23 @@ retract_parent_delete(svn_sqlite__db_t * } +/* Delete the row indexed by (WC_ID, LOCAL_RELPATH, OP_DEPTH) from the NODES + * table in SDB. If there is no such row in NODES, do nothing. */ +static svn_error_t * +delete_nodes_row(svn_sqlite__db_t *sdb, + apr_int64_t wc_id, + const char *local_relpath, + apr_int64_t op_depth) +{ + svn_sqlite__stmt_t *stmt; + + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_DELETE_NODES_ROW)); + SVN_ERR(svn_sqlite__bindf(stmt, "isi", wc_id, local_relpath, op_depth)); + SVN_ERR(svn_sqlite__step_done(stmt)); + + return SVN_NO_ERROR; +} + /* */ static svn_error_t * @@ -835,6 +852,12 @@ insert_base_node(void *baton, svn_sqlite /* ### we can't handle this right now */ SVN_ERR_ASSERT(pibb->conflict == NULL); + /* First, delete any existing row. We do this explicitly because if we + * were to use an "INSERT OR REPLACE" statement SQLite would not fire the + * triggers for the deletion part of the replacement. */ + SVN_ERR(delete_nodes_row(sdb, pibb->wc_id, pibb->local_relpath, + (apr_int64_t)0 /* op_depth for base */)); + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE)); SVN_ERR(svn_sqlite__bindf(stmt, "isisisr" "tstr" /* 8 - 11 */ @@ -856,11 +879,12 @@ insert_base_node(void *baton, svn_sqlite (pibb->kind == svn_wc__db_kind_symlink) ? pibb->target : NULL)); /* 19 */ - if (pibb->kind == svn_wc__db_kind_file) { - SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool)); - if (pibb->translated_size != SVN_INVALID_FILESIZE) - SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size)); - } + if (pibb->kind == svn_wc__db_kind_file) + { + SVN_ERR(svn_sqlite__bind_checksum(stmt, 14, pibb->checksum, scratch_pool)); + if (pibb->translated_size != SVN_INVALID_FILESIZE) + SVN_ERR(svn_sqlite__bind_int64(stmt, 16, pibb->translated_size)); + } SVN_ERR(svn_sqlite__bind_properties(stmt, 15, pibb->props, scratch_pool)); @@ -933,6 +957,16 @@ insert_incomplete_children(svn_sqlite__d { const char *name = APR_ARRAY_IDX(children, i, const char *); + /* First, delete any existing row. We do this explicitly because if we + * were to use an "INSERT OR REPLACE" statement SQLite would not fire the + * triggers for the deletion part of the replacement. */ + /* ### Why should we ever need to overwrite an existing row here? + * This happens in op-depth-test 14,15 for example. */ + SVN_ERR(delete_nodes_row(sdb, wc_id, + svn_relpath_join(local_relpath, name, + scratch_pool), + op_depth)); + SVN_ERR(svn_sqlite__bindf(stmt, "isisnnrsns", wc_id, svn_relpath_join(local_relpath, name, @@ -973,6 +1007,12 @@ insert_working_node(void *baton, SVN_ERR_ASSERT(*piwb->local_relpath != '\0'); parent_relpath = svn_relpath_dirname(piwb->local_relpath, scratch_pool); + /* First, delete any existing row. We do this explicitly because if we + * were to use an "INSERT OR REPLACE" statement SQLite would not fire the + * triggers for the deletion part of the replacement. */ + SVN_ERR(delete_nodes_row(sdb, piwb->wc_id, piwb->local_relpath, + piwb->op_depth)); + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, STMT_INSERT_NODE)); SVN_ERR(svn_sqlite__bindf(stmt, "isisnnntstrisn" "nnnn" /* properties translated_size last_mod_time dav_cache */ @@ -2584,11 +2624,104 @@ svn_wc__db_pristine_remove(svn_wc__db_t } +#ifdef SVN_DEBUG +/* Set *ACTUAL_REFS to the number of references to SHA1_CHECKSUM. */ static svn_error_t * -pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, - apr_pool_t *scratch_pool) +count_pristine_references(int *actual_refs, + svn_sqlite__db_t *sdb, + const svn_checksum_t *sha1_checksum, + apr_pool_t *scratch_pool) +{ + svn_sqlite__stmt_t *stmt; + + *actual_refs = 0; + + SVN_ERR(svn_sqlite__get_statement(&stmt, sdb, + STMT_SELECT_ALL_PRISTINE_REFERENCES)); + SVN_ERR(svn_sqlite__bind_checksum(stmt, 1, sha1_checksum, scratch_pool)); + while (1) + { + svn_boolean_t have_row; + + SVN_ERR(svn_sqlite__step(&have_row, stmt)); + if (! have_row) + break; + + (*actual_refs)++; + } + SVN_ERR(svn_sqlite__reset(stmt)); + return SVN_NO_ERROR; +} +#endif + +svn_error_t * +svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, + apr_pool_t *scratch_pool) { svn_sqlite__stmt_t *stmt; + svn_boolean_t done_one = FALSE; + +#ifdef SVN_DEBUG + /* For each pristine text: check its refcount matches its actual references. */ + SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb, + STMT_SELECT_ALL_PRISTINES)); + while (1) + { + svn_boolean_t have_row; + const svn_checksum_t *sha1_checksum; + int refcount, actual_refs; + svn_sqlite__stmt_t *stmt2; + + SVN_ERR(svn_sqlite__step(&have_row, stmt)); + if (! have_row) + break; + + SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0, + scratch_pool)); + refcount = svn_sqlite__column_int(stmt, 1); + + /* Count the actual references to this checksum. */ + SVN_ERR(count_pristine_references(&actual_refs, wcroot->sdb, + sha1_checksum, scratch_pool)); + + if (refcount == actual_refs) + continue; + + if (! done_one) + { + SVN_DBG(("unused/miscounted pristines in '%s'\n", wcroot->abspath)); + done_one = TRUE; + } + SVN_DBG(("%s: refcount %d != actual_refs %d\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool), + refcount, actual_refs)); + + /* Show the actual references to this checksum. */ + SVN_ERR(svn_sqlite__get_statement(&stmt2, wcroot->sdb, + STMT_SELECT_ALL_PRISTINE_REFERENCES)); + SVN_ERR(svn_sqlite__bind_checksum(stmt2, 1, sha1_checksum, scratch_pool)); + while (1) + { + svn_boolean_t have_row2; + const char *local_relpath; + int op_depth; + + SVN_ERR(svn_sqlite__step(&have_row2, stmt2)); + if (! have_row2) + break; + + local_relpath = svn_sqlite__column_text(stmt2, 1, scratch_pool); + op_depth = svn_sqlite__column_int(stmt2, 2); + SVN_DBG(("pristine ref: %s, op_depth %d, '%s'\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool), + op_depth, local_relpath)); + } + SVN_ERR(svn_sqlite__reset(stmt2)); + + SVN_ERR_ASSERT(!"pristine refcount mismatch"); + } + SVN_ERR(svn_sqlite__reset(stmt)); +#endif /* Find each unreferenced pristine in the DB and remove it. */ SVN_ERR(svn_sqlite__get_statement(&stmt, wcroot->sdb, @@ -2604,6 +2737,13 @@ pristine_cleanup_wcroot(svn_wc__db_wcroo SVN_ERR(svn_sqlite__column_checksum(&sha1_checksum, stmt, 0, scratch_pool)); + if (! done_one) + { + SVN_DBG(("removing unused pristines in '%s'\n", wcroot->abspath)); + done_one = TRUE; + } + SVN_DBG(("removing unused pristine %s\n", + svn_checksum_to_cstring_display(sha1_checksum, scratch_pool))); SVN_ERR(pristine_remove(wcroot, sha1_checksum, scratch_pool)); } SVN_ERR(svn_sqlite__reset(stmt)); @@ -2627,7 +2767,7 @@ svn_wc__db_pristine_cleanup(svn_wc__db_t scratch_pool, scratch_pool)); VERIFY_USABLE_PDH(pdh); - SVN_ERR(pristine_cleanup_wcroot(pdh->wcroot, scratch_pool)); + SVN_ERR(svn_wc__db_pristine_cleanup_wcroot(pdh->wcroot, scratch_pool)); return SVN_NO_ERROR; } @@ -6007,6 +6147,14 @@ commit_node(void *baton, svn_sqlite__db_ /* ### other presences? or reserve that for separate functions? */ new_presence = svn_wc__db_status_normal; + /* First, delete any existing row. We do this explicitly because if we + * were to use an "INSERT OR REPLACE" statement SQLite would not fire the + * triggers for the deletion part of the replacement. */ + SVN_ERR(delete_nodes_row(cb->pdh->wcroot->sdb, + cb->pdh->wcroot->wc_id, cb->local_relpath, + (apr_int64_t)0 /* op_depth for BASE */)); + + /* ### Change stmt to fail if row exists? On failure: delete row, dec refcount, insert new row. */ SVN_ERR(svn_sqlite__get_statement(&stmt, cb->pdh->wcroot->sdb, STMT_APPLY_CHANGES_TO_BASE_NODE)); /* symlink_target not yet used */ @@ -9067,6 +9215,12 @@ set_new_dir_to_incomplete_txn(void *bato SVN_ERR(create_repos_id(&repos_id, dtb->repos_root_url, dtb->repos_uuid, sdb, scratch_pool)); + /* ### Why does this function need to overwrite an existing base row? It + * should be putting a base row underneath a locally-added row. */ + SVN_ERR(delete_nodes_row(dtb->pdh->wcroot->sdb, + dtb->pdh->wcroot->wc_id, dtb->local_relpath, + (apr_int64_t)0 /* op_depth for BASE */)); + SVN_ERR(svn_sqlite__get_statement(&stmt, dtb->pdh->wcroot->sdb, STMT_INSERT_NODE)); Index: subversion/libsvn_wc/wc_db_pdh.c =================================================================== --- subversion/libsvn_wc/wc_db_pdh.c (revision 1056311) +++ subversion/libsvn_wc/wc_db_pdh.c (working copy) @@ -26,6 +26,7 @@ #include #include "svn_dirent_uri.h" +#include "svn_pools.h" #include "wc.h" #include "adm_files.h" @@ -114,6 +115,14 @@ close_wcroot(void *data) SVN_ERR_ASSERT_NO_RETURN(wcroot->sdb != NULL); +#ifdef SVN_WC__PRISTINE_CLEANUP + { + apr_pool_t *scratch_pool = svn_pool_create(NULL); + svn_error_clear(svn_wc__db_pristine_cleanup_wcroot(wcroot, scratch_pool)); + svn_pool_destroy(scratch_pool); + } +#endif + err = svn_sqlite__close(wcroot->sdb); wcroot->sdb = NULL; if (err) Index: subversion/libsvn_wc/wc_db_private.h =================================================================== --- subversion/libsvn_wc/wc_db_private.h (revision 1056311) +++ subversion/libsvn_wc/wc_db_private.h (working copy) @@ -194,5 +194,10 @@ svn_wc__db_util_open_db(svn_sqlite__db_t apr_pool_t *result_pool, apr_pool_t *scratch_pool); +/* Remove all unreferenced pristines in WCROOT. */ +svn_error_t * +svn_wc__db_pristine_cleanup_wcroot(svn_wc__db_wcroot_t *wcroot, + apr_pool_t *scratch_pool); + #endif /* WC_DB_PDH_H */ Index: subversion/libsvn_wc/wc-metadata.sql =================================================================== --- subversion/libsvn_wc/wc-metadata.sql (revision 1056311) +++ subversion/libsvn_wc/wc-metadata.sql (working copy) @@ -479,6 +479,32 @@ CREATE TABLE NODES ( CREATE INDEX I_NODES_PARENT ON NODES (wc_id, parent_relpath, op_depth); +CREATE TRIGGER nodes_insert_trigger +AFTER INSERT ON nodes +/* WHEN NEW.checksum IS NOT NULL */ +BEGIN + UPDATE pristine SET refcount = refcount + 1 + WHERE checksum = NEW.checksum; +END; + +CREATE TRIGGER nodes_delete_trigger +AFTER DELETE ON nodes +/* WHEN OLD.checksum IS NOT NULL */ +BEGIN + UPDATE pristine SET refcount = refcount - 1 + WHERE checksum = OLD.checksum; +END; + +CREATE TRIGGER nodes_update_checksum_trigger +AFTER UPDATE OF checksum ON nodes +/* WHEN NEW.checksum IS NOT NULL OR OLD.checksum IS NOT NULL */ +BEGIN + UPDATE pristine SET refcount = refcount + 1 + WHERE checksum = NEW.checksum; + UPDATE pristine SET refcount = refcount - 1 + WHERE checksum = OLD.checksum; +END; + /* Format 20 introduces NODES and removes BASE_NODE and WORKING_NODE */ Index: subversion/libsvn_wc/wc-queries.sql =================================================================== --- subversion/libsvn_wc/wc-queries.sql (revision 1056311) +++ subversion/libsvn_wc/wc-queries.sql (working copy) @@ -133,7 +133,7 @@ SELECT id FROM repository WHERE root = ? INSERT INTO repository (root, uuid) VALUES (?1, ?2); -- STMT_INSERT_NODE -INSERT OR REPLACE INTO nodes ( +INSERT INTO nodes ( wc_id, local_relpath, op_depth, parent_relpath, repos_id, repos_path, revision, presence, depth, kind, changed_revision, changed_date, changed_author, checksum, properties, translated_size, last_mod_time, @@ -296,6 +296,10 @@ WHERE wc_id = ?1 AND local_relpath = ?2 AND right_checksum IS NULL AND left_checksum IS NULL; +-- STMT_DELETE_NODES_ROW +DELETE FROM nodes +WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = ?3; + -- STMT_DELETE_BASE_NODE DELETE FROM nodes WHERE wc_id = ?1 AND local_relpath = ?2 AND op_depth = 0; @@ -411,7 +415,24 @@ DELETE FROM work_queue WHERE id = ?1; -- STMT_INSERT_PRISTINE INSERT OR IGNORE INTO pristine (checksum, md5_checksum, size, refcount) -VALUES (?1, ?2, ?3, 1); +VALUES (?1, ?2, ?3, 0); + +-- STMT_SELECT_ALL_PRISTINES +SELECT checksum, refcount +FROM pristine; + +-- STMT_SELECT_ALL_PRISTINE_REFERENCES +SELECT checksum, local_relpath, op_depth FROM nodes + WHERE checksum = ?1 /*OR checksum = ?2*/ +UNION ALL +SELECT older_checksum, local_relpath, -1 FROM actual_node + WHERE older_checksum = ?1 /*OR older_checksum = ?2*/ +UNION ALL +SELECT left_checksum, local_relpath, -2 FROM actual_node + WHERE left_checksum = ?1 /*OR left_checksum = ?2*/ +UNION ALL +SELECT right_checksum, local_relpath, -3 FROM actual_node + WHERE right_checksum = ?1 /*OR right_checksum = ?2*/ -- STMT_SELECT_PRISTINE_MD5_CHECKSUM SELECT md5_checksum @@ -504,7 +525,7 @@ WHERE wc_id = ?1 AND local_dir_relpath L /* translated_size and last_mod_time are not mentioned here because they will be tweaked after the working-file is installed. ### what to do about file_external? */ -INSERT OR REPLACE INTO nodes ( +INSERT INTO nodes ( wc_id, local_relpath, op_depth, parent_relpath, repos_id, repos_path, revision, presence, depth, kind, changed_revision, changed_date, changed_author, checksum, properties, dav_cache, symlink_target ) --=-NvzB4fyMg6xuWYSU/Sw7--