Mailing-List: contact dev-help@apr.apache.org; run by ezmlm
Date: Fri, 27 Apr 2007 09:41:24 +0300
From: "Lucian Adrian Grijincu"
To: "Davi Arnaut"
Subject: Re: [PATCH] vformatter cleanups (related to PR 42250)
Cc: dev@apr.apache.org
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_Part_73971_32288158.1177656084222"
References: <46311BB5.3010507@haxent.com.br> <4d45da050704262014r26d16b62q371fb4ee8b8a6c7d@mail.gmail.com>

------=_Part_73971_32288158.1177656084222
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Would someone please try this? I don't have an x64 machine right now :(

static char *conv_3(char *p, uint32_t magnitude)
{
    do {
        *--p = (char) '0' + (magnitude % 10);
    } while ((magnitude /= 10) != 0);

    return p;
}

On 32-bit x86 I got this:

conv_1
cycles: 64
cycles: 47
cycles: 47
cycles: 18
cycles: 18
cycles: 18
cycles: 18
cycles: 18
conv_2
cycles: 62
cycles: 77
cycles: 62
cycles: 18
cycles: 18
cycles: 18
cycles: 18
cycles: 18
conv_3
cycles: 62
cycles: 40
cycles: 40
cycles: 18
cycles: 18
cycles: 18
cycles: 18
cycles: 18

BTW, my numbers are an order of magnitude lower than yours. Am I doing
something wrong?

gcc version 4.1.2

/proc/cpuinfo:
model name   : AMD Athlon(tm) 64 Processor 2800+
cpu MHz      : 1800.000
cache size   : 512 KB
flags        : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow up ts fid vid ttp
bogomips     : 3609.09
clflush size : 64

--
Lucian Adrian Grijincu

On 4/27/07, Davi Arnaut wrote:
> On 27/04/2007, at 00:14, Lucian Adrian Grijincu wrote:
>
> > in apr-conv10-faster.patch you added:
> >
> >     static const char digits[] = "0123456789";
> >     *--p = digits[magnitude % 10];
> >
> > Why is this faster than:
> >     *--p = (char) '0' + (magnitude % 10); ?
>
> You have to take into account the entire loop. The following:
>
>     do {
>         u_widest_int new_magnitude = magnitude / 10;
>         *--p = (char) (magnitude - new_magnitude * 10 + '0');
>         magnitude = new_magnitude;
>     } while (magnitude);
>
> against:
>
>     do {
>         *--p = digits[magnitude % 10];
>     } while ((magnitude /= 10) != 0);
>
> digits is easily cacheable, fewer assignments.
>
> >
> > For your "faster" version, under the hood, the C compiler adds
> > (magnitude % 10) to the address of digits and then copies the contents
> > of the memory location represented by the sum's result into *--p.
> >
> > My version just adds (magnitude % 10) to '0' and stores the result
> > in *--p.
>
> Talk is cheap, let's benchmark!
> To see the generated assembly:
>
> gcc -O2 -o bench bench.c -g
> objdump -S bench > bench-asm
>
> # Intel(R) Celeron(R) CPU 2.20GHz
> [davi@montefiori ~]$ gcc -o bench bench.c -O2 # uint32_t
> [davi@montefiori ~]$ ./bench $RANDOM
> conv_1
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> conv_2
> cycles: 236
> cycles: 220
> cycles: 224
> cycles: 220
> cycles: 224
> cycles: 224
> cycles: 224
> cycles: 224
> conv_1
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> cycles: 236
> conv_2
> cycles: 220
> cycles: 224
> cycles: 224
> cycles: 224
> cycles: 224
> cycles: 224
> cycles: 224
> cycles: 224
>
> [davi@montefiori ~]$ gcc -o bench bench.c -O2 # uint64_t
> [davi@montefiori ~]$ ./bench $RANDOM |more
> conv_1
> cycles: 508
> cycles: 532
> cycles: 540
> cycles: 468
> cycles: 468
> cycles: 468
> cycles: 468
> conv_2
> cycles: 1188
> cycles: 824
> cycles: 896
> cycles: 828
> cycles: 824
> cycles: 824
> cycles: 820
> conv_1
> cycles: 524
> cycles: 492
> cycles: 468
> cycles: 504
> cycles: 468
> cycles: 504
> cycles: 468
> conv_2
> cycles: 768
> cycles: 836
> cycles: 836
> cycles: 820
> cycles: 820
> cycles: 820
> cycles: 820
>
> >
> > Am I missing something here?
>
> Both versions, after compiler optimizations, yield similar results for
> uint32_t, but the digits[] version hurts the uint64_t (apr_uint64_t)
> case quite a bit. "Faster" was an overstatement; I withdraw
> apr-conv10-faster.patch.
>
> --
> Davi Arnaut
>

------=_Part_73971_32288158.1177656084222
Content-Type: text/x-csrc; name=bench.c; charset=ANSI_X3.4-1968
Content-Transfer-Encoding: 7bit
X-Attachment-Id: f_f109sjpq
Content-Disposition: attachment; filename="bench.c"

#include <signal.h>
#include <stdio.h>
#include <time.h>
#include <stdint.h>

#define rdtscll(val) __asm__ __volatile__("rdtsc" : "=A" (val))

static const char digits[] = "0123456789";

static char *conv_1(char *p, uint32_t magnitude)
{
    do {
        uint32_t new_magnitude = magnitude / 10;
        *--p = (char) (magnitude - new_magnitude * 10 + '0');
        magnitude = new_magnitude;
    } while (magnitude);

    return p;
}

static char *conv_2(char *p, uint32_t magnitude)
{
    do {
        *--p = digits[magnitude % 10];
    } while ((magnitude /= 10) != 0);

    return p;
}

static char *conv_3(char *p, uint32_t magnitude)
{
    do {
        *--p = (char) '0' + (magnitude % 10);
    } while ((magnitude /= 10) != 0);

    return p;
}

static void bench_1(unsigned int iter, uint32_t num)
{
    char buf[512], *p = buf+512;
    unsigned long long ts, te;

    puts("conv_1");

    while (iter--) {
        rdtscll(ts);
        conv_1(buf, num);
        rdtscll(te);
        printf("cycles: %12llu\n", te - ts);
    }
}

static void bench_2(unsigned int iter, uint32_t num)
{
    char buf[512], *p = buf+512;
    unsigned long long ts, te;

    puts("conv_2");

    while (iter--) {
        rdtscll(ts);
        conv_2(p, num);
        rdtscll(te);
        printf("cycles: %12llu\n", te - ts);
    }
}

static void bench_3(unsigned int iter, uint32_t num)
{
    char buf[512], *p = buf+512;
    unsigned long long ts, te;

    puts("conv_3");

    while (iter--) {
        rdtscll(ts);
        conv_3(p, num);
        rdtscll(te);
        printf("cycles: %12llu\n", te - ts);
    }
}

int main(int argc, char *argv[])
{
    uint32_t num = atoi(argv[1]);

    bench_1(8, num);
    bench_2(8, num);
    bench_3(8, num);
    bench_1(8, num);
    bench_2(8, num);
    bench_3(8, num);
}

------=_Part_73971_32288158.1177656084222--