Return-Path: Delivered-To: apmail-subversion-dev-archive@minotaur.apache.org Received: (qmail 58424 invoked from network); 2 Jan 2011 15:58:45 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 2 Jan 2011 15:58:45 -0000 Received: (qmail 69494 invoked by uid 500); 2 Jan 2011 15:58:45 -0000 Delivered-To: apmail-subversion-dev-archive@subversion.apache.org Received: (qmail 69411 invoked by uid 500); 2 Jan 2011 15:58:43 -0000 Mailing-List: contact dev-help@subversion.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list dev@subversion.apache.org Received: (qmail 69398 invoked by uid 99); 2 Jan 2011 15:58:43 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 02 Jan 2011 15:58:43 +0000 X-ASF-Spam-Status: No, hits=-0.7 required=10.0 tests=FREEMAIL_FROM,RCVD_IN_DNSWL_LOW,RFC_ABUSE_POST,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of jcorvel@gmail.com designates 209.85.214.171 as permitted sender) Received: from [209.85.214.171] (HELO mail-iw0-f171.google.com) (209.85.214.171) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 02 Jan 2011 15:58:35 +0000 Received: by iwn2 with SMTP id 2so12914615iwn.16 for ; Sun, 02 Jan 2011 07:58:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:received:in-reply-to :references:from:date:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=bHCkw1b1W2C3D/+axHTQikP5WwMrCCFac0gnkwb0oxY=; b=VAGIfvuNPFVTFtfuxayyF8mUGHC7l/SPe5k85umOUkFnBhzJ9SOUZbnEEpz8hlDxpu y3xL1qjXcMQbDKuepo6S0zK6a1k2sj0aeLs1zTZgE20GH88oShTV47AiDFQ43WfwkWR2 ZluA89ONAVHhQYKEsN9W7763gXNQH9ml0QrME= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; b=AA0XINsaHoZsQPWlT8Bia1oBWSjMAZzdyxEv4UZJ8TtADPyCHROSKHpKZKId6XwLA9 8sHOdSD+3TP5Qd8C4thPkr80y1Qu7KXtFhAbKrvqagqUVGo+Uf/49YBhr4HrMemxtQM1 tAJ5vqLvjV3EV1ctA8NPPlEIFrZRtwI6wLGaY= Received: by 10.42.170.199 with SMTP id g7mr20244266icz.149.1293983892655; Sun, 02 Jan 2011 07:58:12 -0800 (PST) MIME-Version: 1.0 Received: by 10.231.14.75 with HTTP; Sun, 2 Jan 2011 07:57:52 -0800 (PST) In-Reply-To: <20110101203323.11D772388978@eris.apache.org> References: <20110101203323.11D772388978@eris.apache.org> From: Johan Corveleyn Date: Sun, 2 Jan 2011 16:57:52 +0100 Message-ID: Subject: Re: svn commit: r1054286 - /subversion/trunk/subversion/libsvn_delta/xdelta.c To: dev@subversion.apache.org Cc: Stefan Fuhrman Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Virus-Checked: Checked by ClamAV on apache.org On Sat, Jan 1, 2011 at 9:33 PM, wrote: > Author: stefan2 > Date: Sat Jan =A01 20:33:22 2011 > New Revision: 1054286 > > URL: http://svn.apache.org/viewvc?rev=3D1054286&view=3Drev > Log: > XDelta calculation is major part of svn_txdelta_send_txstream. > Therefore, speed up string matching code and the relatively > expensive hash-code (adler32) calculations. > > * subversion/libsvn_delta/xdelta.c > =A0(init_adler32): init adler checksum with 0 instead of 1; > =A0 use svn_adler32 for performance > =A0(adler32_out): "-1" can / must be ommitted now that we > =A0 start at 0 instead of 1 for s1. > =A0(adler32_replace): special, faster implementation replacing > =A0 a adler32_out(), adler32_in() sequence > > =A0(match_length): new utility function > =A0(find_match): speed up the main loop by calling match_length() > =A0(compute_delta): optimize adler32 calculations > > Modified: > =A0 =A0subversion/trunk/subversion/libsvn_delta/xdelta.c > > Modified: subversion/trunk/subversion/libsvn_delta/xdelta.c > URL: http://svn.apache.org/viewvc/subversion/trunk/subversion/libsvn_delt= a/xdelta.c?rev=3D1054286&r1=3D1054285&r2=3D1054286&view=3Ddiff > =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D > --- subversion/trunk/subversion/libsvn_delta/xdelta.c (original) > +++ subversion/trunk/subversion/libsvn_delta/xdelta.c Sat Jan =A01 20:33:= 22 2011 > @@ -29,6 +29,8 @@ > > =A0#include "svn_delta.h" > =A0#include "delta.h" > + > +#include "private/svn_adler32.h" > > =A0/* This is pseudo-adler32. It is adler32 without the prime modulus. > =A0 =A0The idea is borrowed from monotone, and is a translation of the C+= + > @@ -39,6 +41,10 @@ > =A0#define ADLER32_MASK =A0 =A0 =A00x0000ffff > =A0#define ADLER32_CHAR_MASK 0x000000ff > > +/* Size of the blocks we compute checksums for. This was chosen out of > + =A0 thin air. =A0Monotone used 64, xdelta1 used 64, rsync uses 128. =A0= */ > +#define MATCH_BLOCKSIZE 64 > + > =A0/* Structure to store the state of our adler32 checksum. =A0*/ > =A0struct adler32 > =A0{ > @@ -66,11 +72,32 @@ adler32_out(struct adler32 *ad, const ch > =A0{ > =A0 ad->s1 -=3D ((apr_uint32_t) (c)) & ADLER32_CHAR_MASK; > =A0 ad->s1 &=3D ADLER32_MASK; > - =A0ad->s2 -=3D (ad->len * (((apr_uint32_t) c) & ADLER32_CHAR_MASK)) + 1= ; > + =A0ad->s2 -=3D (ad->len * (((apr_uint32_t) c) & ADLER32_CHAR_MASK)); > =A0 ad->s2 &=3D ADLER32_MASK; > =A0 --ad->len; > =A0} > > +/* Feed C_IN into the adler32 checksum and remove C_OUT at the same time= . > + * This function may (and will) only be called for > + * ad->len =3D=3D MATCH_BLOCKSIZE. > + */ > +static APR_INLINE void > +adler32_replace(struct adler32 *ad, const char c_out, const char c_in) > +{ > + =A0apr_uint32_t s1 =3D ad->s1; > + =A0apr_uint32_t s2 =3D ad->s2; > + > + =A0s2 -=3D (MATCH_BLOCKSIZE * (((apr_uint32_t) c_out) & ADLER32_CHAR_MA= SK)); > + > + =A0s1 -=3D ((apr_uint32_t) (c_out)) & ADLER32_CHAR_MASK; > + =A0s1 +=3D ((apr_uint32_t) (c_in)) & ADLER32_CHAR_MASK; > + > + =A0s2 +=3D s1; > + > + =A0ad->s1 =3D s1 & ADLER32_MASK; > + =A0ad->s2 =3D s2 & ADLER32_MASK; > +} > + > =A0/* Return the current adler32 checksum in the adler32 structure. =A0*/ > > =A0static APR_INLINE apr_uint32_t > @@ -85,18 +112,15 @@ adler32_sum(const struct adler32 *ad) > =A0static APR_INLINE struct adler32 * > =A0init_adler32(struct adler32 *ad, const char *data, apr_uint32_t datale= n) > =A0{ > - =A0ad->s1 =3D 1; > - =A0ad->s2 =3D 0; > - =A0ad->len =3D 0; > - =A0while (datalen--) > - =A0 =A0adler32_in(ad, *(data++)); > + =A0apr_uint32_t adler32 =3D svn__adler32(0, data, datalen); > + > + =A0ad->s1 =3D adler32 & ADLER32_MASK; > + =A0ad->s2 =3D (adler32 >> 16) & ADLER32_MASK; > + =A0ad->len =3D datalen; > + > =A0 return ad; > =A0} > > -/* Size of the blocks we compute checksums for. This was chosen out of > - =A0 thin air. =A0Monotone used 64, xdelta1 used 64, rsync uses 128. =A0= */ > -#define MATCH_BLOCKSIZE 64 > - > =A0/* Information for a block of the delta source. =A0The length of the > =A0 =A0block is the smaller of MATCH_BLOCKSIZE and the difference between > =A0 =A0the size of the source data and the position of this block. */ > @@ -201,6 +225,35 @@ init_blocks_table(const char *data, > =A0 =A0 } > =A0} > > +/* Return the lowest position at which A and B differ. If no difference > + * can be found in the first MAX_LEN characters, MAX_LEN will be returne= d. > + */ > +static apr_size_t > +match_length(const char *a, const char *b, apr_size_t max_len) > +{ > + =A0apr_size_t pos =3D 0; > + > +#if SVN_UNALIGNED_ACCESS_IS_OK > + > + =A0/* Chunky operation is so much faster ... > + =A0 * > + =A0 * We can't make this work on architectures that require aligned acc= ess > + =A0 * because A and B will probably have different alignment. So, skipp= ing > + =A0 * the first few chars until alignment is reached is not an option. > + =A0 */ > + =A0for (; pos + sizeof(apr_size_t) <=3D max_len; pos +=3D sizeof(apr_si= ze_t)) > + =A0 =A0if (*(const apr_size_t*)(a + pos) !=3D *(const apr_size_t*)(b + = pos)) > + =A0 =A0 =A0break; > + > +#endif > + > + =A0for (; pos < max_len; ++pos) > + =A0 =A0if (a[pos] !=3D b[pos]) > + =A0 =A0 =A0break; > + > + =A0return pos; > +} > + > =A0/* Try to find a match for the target data B in BLOCKS, and then > =A0 =A0extend the match as long as data in A and B at the match position > =A0 =A0continues to match. =A0We set the position in a we ended up in (in > @@ -229,6 +282,8 @@ find_match(const struct blocks *blocks, > =A0 apr_uint32_t sum =3D adler32_sum(rolling); > =A0 apr_size_t alen, badvance, apos; > =A0 apr_size_t tpos, tlen; > + =A0apr_size_t delta, max_delta; > + =A0const char *aptr, *bptr; ..\..\..\subversion\libsvn_delta\xdelta.c(286): warning C4101: 'aptr' : unreferenced local variable ..\..\..\subversion\libsvn_delta\xdelta.c(286): warning C4101: 'bptr' : unreferenced local variable --=20 Johan