harmony-dev mailing list archives

From Ben Laurie <...@algroup.co.uk>
Subject Re: optimization for speed in win32
Date Sat, 03 Dec 2005 17:44:16 GMT
Blast from the past here...

Enrico Migliore wrote:
> Hi,
> 
> I ran some tests to see which of __fastcall, __stdcall, __cdecl and
> __inline gives the fastest execution time. my_function() was called
> 300000000 times.
> 
> Processor = Intel Pentium 1.4 GHz
> OS = Windows 2000
> Compiler = Microsoft Visual C
> Executable type = release
> 
> ---------------------- code ------------------------------
> #include <stdio.h>
> #include <time.h>
> 
> /*
> *  qualifier: __fastcall, __stdcall, __cdecl, __inline
> */
> qualifier int my_function (int a, char *buf)
> {
> 
>    int  int_1 = 1987;
>    char char_1 = 'a';
>    long long_1 = 123456789L;
>    long *long_2 = &long_1;
> 
>    if (a > 0)
>    {
>        a++;
>    }
>    else
>    {
>        a--;
>    }
>    if (int_1 > 9)
>    {
>        char_1++;
>    }
>    else
>    {
>        char_1--;
>    }
>    *long_2++;
>    if (buf == NULL)
>    {
>        return -1;
>    }
>    buf++;
>    return a;
> }
> 
> 
> double test (void)
> {
> 
>    long i;
>    int result;
>    char buf[8];
>    clock_t start;
>    clock_t stop;
>    double  duration;
> 
>    buf[0] = 0;
> 
>    start = clock();
>    for (i = 0; i < 300000000; i++)
>    {
>         result = my_function(10,buf);
>    }
>    stop = clock();
>    duration = (double) (stop - start) / CLOCKS_PER_SEC;
> 
>    return duration;
> }
> 
> 
> Results:
> 
> ---------------------------------------------------------------
> qualifier     |  test duration (maximize for speed = OFF)
> ---------------------------------------------------------------
> __fastcall    |  14.3900 seconds
> __stdcall     |  12.9700 seconds
> __cdecl (**)  |  13.6200 seconds
> __inline      |  12.9600 seconds
> 
> ---------------------------------------------------------------
> qualifier     |  test duration (maximize for speed = ON)
> ---------------------------------------------------------------
> __fastcall    |   5.9400 seconds
> __stdcall     |   6.5300 seconds
> __cdecl (**)  |   6.4800 seconds
> __inline      |   0.0000 seconds (*)
> 
> (*) suspicious

Actually, this is not suspicious - if the function is inlined, the
compiler can see that it has no side effects and that its result is
never used, so it can optimise the whole loop away. This is a common
flaw in benchmark code - the benchmark is not as smart as the
compiler is :-)
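
One way to keep the loop alive, as a rough sketch rather than a
definitive fix: make the argument depend on the loop counter, feed
every result into a running sum, and store that sum somewhere the
compiler cannot ignore. The names run_loop and sink are made up for
this illustration, and my_function is cut down to the part whose
result the caller can observe. A sufficiently clever optimiser may
still simplify the loop, so treat the timings with care.

```c
#include <stdio.h>
#include <time.h>

/* Reduced version of the quoted my_function: only the code that
   affects the return value is kept. */
static int my_function(int a, char *buf)
{
    if (a > 0)
        a++;
    else
        a--;
    if (buf == NULL)
        return -1;
    return a;
}

/* Hypothetical sink: a volatile store must actually be performed,
   so the value written to it has to be computed. */
static volatile long sink;

/* Hypothetical helper: calls my_function `iterations` times, returns
   the sum of the results, and optionally reports the elapsed time. */
long run_loop(long iterations, double *seconds)
{
    char buf[8];
    long sum = 0;
    long i;
    clock_t start;
    clock_t stop;

    buf[0] = 0;

    start = clock();
    for (i = 0; i < iterations; i++)
    {
        /* The argument varies with i and the result is consumed,
           so the call is not dead code even when inlined. */
        sum += my_function((int) (i & 7), buf);
    }
    stop = clock();

    sink = sum;
    if (seconds != NULL)
    {
        *seconds = (double) (stop - start) / CLOCKS_PER_SEC;
    }
    return sum;
}
```

Calling run_loop(300000000, &duration) from the test harness would
then time the same number of calls as the original, with the caveat
that the extra addition per iteration is now part of what is measured.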

> (**) __cdecl is the default qualifier
> 
> 
> 
> Conclusion:
> 
> __inline and __fastcall give the best results
> when the compiler is instructed to generate code
> with the flag "optimize for speed" enabled.
> 
> I also noticed that parameter marshalling is not an issue
> because the compiler will link the appropriate static library according
> to the optimization selected. Therefore, in win32, we should
> be able to optimize our C code for speed without problems.
> 
> Enrico
> 
> 


-- 
http://www.apache-ssl.org/ben.html       http://www.thebunker.net/
**  ApacheCon - Dec 10-14th - San Diego - http://apachecon.com/ **
"There is no limit to what a man can do or how far he can go if he
doesn't mind who gets the credit." - Robert Woodruff
