Return-Path:
Delivered-To: apmail-incubator-stdcxx-dev-archive@www.apache.org
Received: (qmail 13582 invoked from network); 10 Feb 2006 19:16:54 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (209.237.227.199)
  by minotaur.apache.org with SMTP; 10 Feb 2006 19:16:54 -0000
Received: (qmail 6410 invoked by uid 500); 10 Feb 2006 19:16:54 -0000
Delivered-To: apmail-incubator-stdcxx-dev-archive@incubator.apache.org
Received: (qmail 6362 invoked by uid 500); 10 Feb 2006 19:16:53 -0000
Mailing-List: contact stdcxx-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: stdcxx-dev@incubator.apache.org
Delivered-To: mailing list stdcxx-dev@incubator.apache.org
Received: (qmail 6351 invoked by uid 99); 10 Feb 2006 19:16:53 -0000
Received: from asf.osuosl.org (HELO asf.osuosl.org) (140.211.166.49)
  by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 10 Feb 2006 11:16:53 -0800
X-ASF-Spam-Status: No, hits=0.0 required=10.0 tests=
X-Spam-Check-By: apache.org
Received-SPF: pass (asf.osuosl.org: local policy)
Received: from [12.17.213.84] (HELO bco-exchange.bco.roguewave.com) (12.17.213.84)
  by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 10 Feb 2006 11:16:52 -0800
Received: from [10.70.3.48] (10.70.3.48 [10.70.3.48])
  by bco-exchange.bco.roguewave.com with SMTP
  (Microsoft Exchange Internet Mail Service Version 5.5.2657.72)
  id ZGW28T00; Fri, 10 Feb 2006 12:16:39 -0700
Message-ID: <43ECE5A3.1070609@roguewave.com>
Date: Fri, 10 Feb 2006 12:12:35 -0700
From: Andrew Black
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050414
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: stdcxx-dev@incubator.apache.org
Subject: Benchmarking stdcxx
Content-Type: multipart/mixed; boundary="------------020307050701040902030101"
X-Virus-Checked: Checked by ClamAV on apache.org
X-Spam-Rating: minotaur.apache.org 1.6.2 0/1000/N

--------------020307050701040902030101
Content-Type: text/plain; charset=ISO-8859-1;
format=flowed
Content-Transfer-Encoding: 7bit

Greetings all.

I thought it might be interesting to do some benchmarking, comparing the
performance of stdcxx with other standard libraries. As there are a
number of attributes that can be compared when doing a benchmark, and an
even larger number of classes that can be looked at, there is a fair
amount of choice in what to measure. As a starting point, I chose to
measure the runtime performance of stringstream objects.

Measurements were taken on my Linux box (a 1.9 GHz P4), with a light
load (a number of running applications, but most were idle) and an 8d
(single threaded, release, shared) version of stdcxx. Each test was run
5 times in a row, with a count of 500000 iterations.

The following table lists the run times collected. All times are in
seconds.

+-------------------+---------------+----------------+
| test name         |   gcc 3.2.3   |  stdcxx 4.1.3  |
+-------------------+-------+-------+--------+-------+
|                   |  usr  |  sys  |  usr   |  sys  |
+-------------------+-------+-------+--------+-------+
| read_single       | 8.977 | 0.008 | 13.997 | 0.012 |
|                   | 7.856 | 0.008 | 13.913 | 0.016 |
|                   | 8.021 | 0.012 | 13.817 | 0.024 |
|                   | 7.736 | 0.020 | 28.634 | 0.016 |
|                   | 7.844 | 0.012 | 13.841 | 0.016 |
+-------------------+-------+-------+--------+-------+
| read_multi        | 0.608 | 0.744 |  0.864 | 0.756 |
|                   | 0.688 | 0.704 |  0.860 | 0.736 |
|                   | 0.660 | 0.728 |  0.856 | 0.712 |
|                   | 0.608 | 0.792 |  0.848 | 0.724 |
|                   | 0.552 | 0.796 |  0.796 | 0.780 |
+-------------------+-------+-------+--------+-------+
| write_single      | 1.976 | 0.000 | 30.450 | 0.048 |
|                   | 2.356 | 0.012 | 30.526 | 0.064 |
|                   | 1.984 | 0.000 | 30.354 | 0.032 |
|                   | 1.964 | 0.012 | 30.350 | 0.028 |
|                   | 1.936 | 0.000 | 30.286 | 0.036 |
+-------------------+-------+-------+--------+-------+
| write_multi       | 1.172 | 2.352 | 32.326 | 2.320 |
|                   | 1.092 | 2.444 | 31.102 | 2.216 |
|                   | 1.164 | 2.360 | 30.482 | 2.248 |
|                   | 1.148 | 2.380 | 31.930 | 2.180 |
|                   | 1.000 | 2.532 | 29.534 | 2.272 |
+-------------------+-------+-------+--------+-------+
| read_write_single | 7.684 | 0.000 | 13.649 | 0.016 |
|                   | 7.684 | 0.012 | 13.685 | 0.016 |
|                   | 7.664 | 0.012 | 14.193 | 0.016 |
|                   | 8.353 | 0.012 | 13.745 | 0.016 |
|                   | 7.700 | 0.012 | 13.677 | 0.004 |
+-------------------+-------+-------+--------+-------+
| read_write_cycle  | 0.056 | 0.000 |  0.412 | 0.000 |
|                   | 0.056 | 0.000 |  0.424 | 0.004 |
|                   | 0.056 | 0.000 |  0.428 | 0.004 |
|                   | 0.056 | 0.000 |  0.420 | 0.004 |
|                   | 0.056 | 0.000 |  0.412 | 0.004 |
+-------------------+-------+-------+--------+-------+
| read_write_multi  | 0.664 | 0.732 |  1.028 | 0.716 |
|                   | 0.676 | 0.712 |  0.988 | 0.744 |
|                   | 0.632 | 0.752 |  1.036 | 0.716 |
|                   | 0.688 | 0.704 |  1.080 | 0.732 |
|                   | 0.632 | 0.732 |  0.940 | 0.804 |
+-------------------+-------+-------+--------+-------+
| write_read_single | 7.868 | 0.016 | 43.407 | 0.044 |
|                   | 7.896 | 0.012 | 43.895 | 0.044 |
|                   | 7.888 | 0.008 | 43.307 | 0.076 |
|                   | 7.912 | 0.012 | 43.391 | 0.032 |
|                   | 8.337 | 0.016 | 43.375 | 0.044 |
+-------------------+-------+-------+--------+-------+
| write_read_cycle  | 0.056 | 0.000 |  0.412 | 0.004 |
|                   | 0.056 | 0.000 |  0.404 | 0.016 |
|                   | 0.056 | 0.000 |  0.412 | 0.000 |
|                   | 0.056 | 0.000 |  0.420 | 0.000 |
|                   | 0.052 | 0.004 |  0.416 | 0.004 |
+-------------------+-------+-------+--------+-------+
| write_read_multi  | 7.340 | 2.404 | 43.591 | 2.408 |
|                   | 7.420 | 2.400 | 42.347 | 2.196 |
|                   | 7.440 | 2.376 | 45.227 | 2.336 |
|                   | 7.232 | 2.476 | 43.679 | 2.316 |
|                   | 7.348 | 2.488 | 44.271 | 2.348 |
+-------------------+-------+-------+--------+-------+

Analysis:

Using the numbers above, I did some basic analysis. The system times for
a given test are roughly the same for both libraries, so I am ignoring
them for now and looking only at the usr times. To look at these
numbers, I see three statistical operations that could be of use. The
first operation is the arithmetic average ('average') of the numbers.
This is the 'classic' sum and divide average.
The second operation is the median value (middle number) in the set.
The final operation is what I term the 'middle average'. I calculate
this by throwing out the highest and lowest value, then calculating the
arithmetic average of the remaining numbers.

In the tables below, ratio indicates how much longer the stdcxx runs
take compared to the gcc runs, with 0% indicating they take the same
amount of time.

+-------------------+-------+--------+----------+
| read_single       |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 8.087 | 16.840 |  108.25% |
| middle average    | 7.907 | 13.917 |   76.01% |
| median            | 7.856 | 13.913 |   77.10% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| read_multi        |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 0.623 |  0.845 |   35.56% |
| middle average    | 0.625 |  0.855 |   36.67% |
| median            | 0.608 |  0.856 |   40.79% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| write_single      |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 2.043 | 30.393 | 1387.53% |
| middle average    | 1.975 | 30.385 | 1438.72% |
| median            | 1.976 | 30.354 | 1436.13% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| write_multi       |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 1.115 | 31.075 | 2686.48% |
| middle average    | 1.135 | 31.171 | 2647.18% |
| median            | 1.148 | 31.102 | 2609.23% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| read_write_single |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 7.817 | 13.790 |   76.41% |
| middle average    | 7.689 | 13.702 |   78.20% |
| median            | 7.684 | 13.685 |   78.10% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| read_write_cycle  |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 0.056 |  0.419 |  648.57% |
| middle average    | 0.056 |  0.419 |  647.62% |
| median            | 0.056 |  0.420 |  650.00% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| read_write_multi  |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 0.658 |  1.014 |   54.07% |
| middle average    | 0.657 |  1.017 |   54.77% |
| median            | 0.664 |  1.028 |   54.82% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| write_read_single |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 7.980 | 43.475 |  444.79% |
| middle average    | 7.899 | 43.391 |  449.35% |
| median            | 7.896 | 43.391 |  449.53% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| write_read_cycle  |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 0.055 |  0.413 |  647.83% |
| middle average    | 0.056 |  0.413 |  638.10% |
| median            | 0.056 |  0.412 |  635.71% |
+-------------------+-------+--------+----------+

+-------------------+-------+--------+----------+
| write_read_multi  |  gcc  | stdcxx |  ratio   |
+-------------------+-------+--------+----------+
| average           | 7.356 | 43.823 |  495.74% |
| middle average    | 7.369 | 43.847 |  494.99% |
| median            | 7.348 | 43.679 |  494.43% |
+-------------------+-------+--------+----------+

Conclusions:

Looking over the processed numbers from the runs, one thing that jumps
out at me is the write times, particularly the write_single and
write_multi benchmarks. Both of these benchmarks are more than an order
of magnitude slower than their GCC counterparts (roughly 15 and 27 times
the gcc run time, at least on this computer). The write_multi benchmark
in particular shows what happens if you stream large amounts of data
(~250 MB worth of data in this case) into a stringstream, without
streaming any out.

Future:

For those interested in trying to repeat these tests, I have attached
the source and makefile files I used to generate these benchmarks.

This particular benchmark is a work in progress. There are several
additional things that could be benchmarked regarding stringstreams.
These include allocation (default, string, copy), pseudo-random
read/writes (rather than pattern read/writes), reads and writes of
varying length strings, and reading/writing using something other than
the insertion and extraction operators.
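
The three summary statistics and the ratio described in the analysis can be sketched in a few lines of C++. This is only an illustration, not part of the attached benchmark: the function names average, median, middle_average and ratio_percent are made up here, and median assumes an odd number of samples, as in the 5-run tests above.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Arithmetic average: sum of all samples divided by their count.
static double average(const std::vector<double>& v) {
    assert(!v.empty());
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// Median: the middle element of the sorted samples
// (assumes an odd sample count).
static double median(std::vector<double> v) {
    assert(v.size() % 2 == 1);
    std::sort(v.begin(), v.end());
    return v[v.size() / 2];
}

// 'Middle average': drop the single lowest and single highest sample,
// then take the arithmetic average of what remains.
static double middle_average(std::vector<double> v) {
    assert(v.size() >= 3);
    std::sort(v.begin(), v.end());
    return std::accumulate(v.begin() + 1, v.end() - 1, 0.0) / (v.size() - 2);
}

// Ratio as used in the tables: how much longer 'other' takes than
// 'base', in percent; 0% means both take the same time.
static double ratio_percent(double base, double other) {
    return (other / base - 1.0) * 100.0;
}
```

For example, passing the five gcc usr times for read_single (8.977, 7.856, 8.021, 7.736, 7.844) to median() returns 7.856, the value shown in the read_single table.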

--Andrew Black

--------------020307050701040902030101
Content-Type: text/plain; name="stringstream_bm.cpp"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="stringstream_bm.cpp"

/***************************************************************************
 *
 * stringstream_bm.cpp - simple benchmarking program for std::stringstreams
 *
 * $Id$
 *
 ***************************************************************************
 *
 * Copyright (c) 2006 Quovadx, Inc., acting through its Rogue Wave
 * Software division. Licensed under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance with the
 * License. You may obtain a copy of the License at
 * http://www.apache.org/licenses/LICENSE-2.0. Unless required by
 * applicable law or agreed to in writing, software distributed under
 * the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR
 * CONDITIONS OF ANY KIND, either express or implied. See the License
 * for the specific language governing permissions and limitations under
 * the License.
 *
 **************************************************************************/

#include <sstream>    // std::stringstream
#include <stdio.h>    // printf, fprintf
#include <stdlib.h>   // malloc, free, atoi
#include <string.h>   // strcpy, strcmp

static const char sdata[] = "abc";
static const char mdata[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const char ldata[] = {
    // ...:....1....:....2....:
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
};

const unsigned sinkSize = 550;  // sizeof ldata is 521

/**************************************************************************
 * A note on naming conventions
 *
 * Each 'testlet' is dispatched to from the main() function, based on the
 * contents of the ftable, which basically associates the function name
 * with the function pointer.
 *
 * Function names are composed of a prefix and a suffix, each indicating
 * the behavior of the function.
 *
 * Prefixes:
 * read_        uses operator >> to read from a prefilled stringstream
 * write_       uses operator << to write to an empty stringstream
 * read_write_  first reads from a stringstream, then writes back to it
 * write_read_  first writes to a stringstream, then reads back from it
 *
 * Suffixes:
 * _single  versions operate on a stringstream with a lifespan of a single
 *          test iteration.
 * _cycle   versions operate on the same stringstream for the lifetime of
 *          the test. Operations alternate between read and write.
 * _multi   versions operate on the same stringstream for the lifetime of
 *          the test.
 *          Operations on the stringstream are batched with all
 *          reads and writes happening in a row
 *
 * Within each function, there are a few names that are commonly used
 * The name 'N' contains the number of test iterations to run
 * The name 'i' is the counter for the test iteration loop
 * The name 'hold' is used for stringstreams that are written to and read
 *     from.
 * The name 'sink' is used for the structure (char[] or stringstream) that
 *     is repeatedly written to without being read from
 * The name 'ldata' is a global constant that is used to fill data
 *     structures during the execution of the test.
 **************************************************************************/

static void read_single(int N){
    char sink[sinkSize];

    for (int i = 0; i < N; ++i) {
        std::stringstream hold(ldata);
        hold >> sink;
    }
}

static void read_multi(int N){
    char sink[sinkSize];
    const unsigned ssize=sizeof(ldata);
    const unsigned dsize=ssize+1; //for \n
    char pad[] ="\n";
    int i;

    // Each dsize-byte record is ldata's 520 characters, then '\n', then
    // the NUL byte written by the second strcpy.
    char* source=(char*)malloc(dsize*N);
    for (i = 0; i < N; ++i) {
        strcpy(source+dsize*i, ldata);
        strcpy(source+dsize*i+ssize-1, pad);
    }

    // hold is initialized from source as a C string, i.e. up to the
    // first NUL byte.
    std::stringstream hold(source);
    free(source);

    for (i = 0; i < N; ++i) {
        hold >> sink;
    }
}

static void write_single(int N){
    for (int i = 0; i < N; ++i) {
        std::stringstream sink;
        sink << ldata;
    }
}

static void write_multi(int N){
    std::stringstream sink;

    for (int i = 0; i < N; ++i) {
        sink << ldata;
    }
}

static void read_write_single(int N){
    char sink[sinkSize];

    for (int i = 0; i < N; ++i) {
        std::stringstream hold(ldata);
        hold >> sink;
        hold << ldata;
    }
}

static void read_write_cycle(int N){
    char sink[sinkSize];
    std::stringstream hold(ldata);

    for (int i = 0; i < N; ++i) {
        hold >> sink;
        hold << ldata;
    }
}

static void read_write_multi(int N){
    char sink[sinkSize];
    const unsigned ssize=sizeof(ldata);
    const unsigned dsize=ssize+1; //for \n
    char pad[] ="\n";
    int i;

    // Same record layout as in read_multi.
    char* source=(char*)malloc(dsize*N);
    for (i = 0; i < N; ++i) {
        strcpy(source+dsize*i, ldata);
        strcpy(source+dsize*i+ssize-1, pad);
    }

    std::stringstream hold(source);
    free(source);

    for (i = 0; i < N; ++i) {
        hold >> sink;
    }
    for (i = 0; i < N; ++i) {
        hold << ldata << ' ';
    }
}

static void write_read_single(int N){
    char sink[sinkSize];

    for (int i = 0; i < N; ++i) {
        std::stringstream hold;
        hold << ldata;
        hold >> sink;
    }
}

static void write_read_cycle(int N){
    char sink[sinkSize];
    std::stringstream hold;

    for (int i = 0; i < N; ++i) {
        hold << ldata;
        hold >> sink;
    }
}

static void write_read_multi(int N){
    char sink[sinkSize];
    int i;
    std::stringstream hold;

    for (i = 0; i < N; ++i) {
        hold << ldata << ' ';
    }
    for (i = 0; i < N; ++i) {
        hold >> sink;
    }
}

// Table mapping each test name to the function that implements it.
static const struct {
    const char *fname;
    void (*fun)(int);
} ftable [] = {
#define FENTRY(fun) { #fun, &fun }
    FENTRY (read_single),
    FENTRY (read_multi),
    FENTRY (write_single),
    FENTRY (write_multi),
    FENTRY (read_write_single),
    FENTRY (read_write_cycle),
    FENTRY (read_write_multi),
    FENTRY (write_read_single),
    FENTRY (write_read_cycle),
    FENTRY (write_read_multi),
};

void print_usage (const char *name)
{
    printf ("Usage: %s iterations TEST+\n", name);
    printf ("\tWhere TEST is one of:");
    for (size_t i = 0; i != sizeof ftable / sizeof *ftable; ++i) {
        printf(" %s",ftable[i].fname);
    }
    printf ("\n");
}

int main (int argc, char *argv[])
{
    if (argc < 2) {
        print_usage (argv [0]);
        return 0;
    }

    const int N = atoi (argv [1]);

    // Run each test named on the command line, in order.
    for (int i = 2; i < argc; ++i) {
        bool done = false;

        for (size_t j = 0; j != sizeof ftable / sizeof *ftable; ++j) {
            if (!strcmp (ftable [j].fname, argv [i])) {
                printf ("%s: %i iterations\n", ftable [j].fname, N);
                ftable [j].fun (N);
                done = true;
                break;
            }
        }

        if (!done) {
            fprintf (stderr, "unknown argument: %s\n", argv [i]);
            return 1;
        }
    }

    return 0;
}

--------------020307050701040902030101
Content-Type: text/plain; name="Makefile"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="Makefile"

CFLAGS = -O2 -W -Wall -Wcast-qual -Winline -Wshadow -Wwrite-strings -Wno-long-long -Wcast-align
LFLAGS =

FILES := $(wildcard *.cpp)
BDEPS := $(patsubst %.cpp,%.d,$(FILES))
BOBJS := $(patsubst %.cpp,%.o,$(FILES))
BTARGETS := $(patsubst %.o,%,$(BOBJS))

GDIR = glibc
GCFLAGS = $(CFLAGS)
GLFLAGS = $(LFLAGS)
GDEPS := $(patsubst %,$(GDIR)/%,$(BDEPS))
GOBJS := $(patsubst %,$(GDIR)/%,$(BOBJS))
GTARGETS := $(patsubst %,$(GDIR)/%,$(BTARGETS))

STD_BUILD =
STD_SRC =
ADIR = stdcxx
ACFLAGS = $(CFLAGS) -I$(STD_SRC)/include/ansi -D_RWSTD_USE_CONFIG -I$(STD_BUILD)/include -I$(STD_SRC)/include -nostdinc++
ALFLAGS = $(LFLAGS) -L$(STD_BUILD)/lib -lstd8d -lsupc++
ADEPS := $(patsubst %,$(ADIR)/%,$(BDEPS))
AOBJS := $(patsubst %,$(ADIR)/%,$(BOBJS))
ATARGETS := $(patsubst %,$(ADIR)/%,$(BTARGETS))

DEPS = $(ADEPS) $(GDEPS)
OBJS = $(AOBJS) $(GOBJS)
TARGETS = $(ATARGETS) $(GTARGETS)

-include $(DEPS)

#General rules
dependclean:
	rm $(DEPS)
	@echo "dependencies will be regenerated at the next invocation of make"

clean:
	rm $(OBJS)

veryclean: clean dependclean
	rm $(TARGETS)

all: $(TARGETS)

#Patterns
$(GDIR)/%.d: %.cpp
	g++ $(GCFLAGS) -M $< > $@

$(GDIR)/%.o: %.cpp
	g++ $(GCFLAGS) -o $@ -c $<

$(GDIR)/%: $(GDIR)/%.o
	g++ $(GLFLAGS) -o $@ $<

$(ADIR)/%.d: %.cpp
	g++ $(ACFLAGS) -M $< > $@

$(ADIR)/%.o: %.cpp
	g++ $(ACFLAGS) -o $@ -c $<

$(ADIR)/%: $(ADIR)/%.o
	g++ $(ALFLAGS) -o $@ $<

.PHONY: all dependclean clean veryclean
.SUFFIXES: .cpp .d .o

--------------020307050701040902030101--