Date: Tue, 06 May 2008 12:48:58 -0600
From: Martin Sebor
Organization: Rogue Wave Software, Inc.
To: dev@stdcxx.apache.org
Subject: Re: [jira] Updated: (STDCXX-536) allow thread safety tests to time out without failing

Travis Vitek wrote:
>
>
> Martin Sebor wrote:
>> Travis Vitek wrote:
>>> Are you talking about modifying the nightly build infrastructure so that
>>> it will invoke a test several times with different command line
>>> arguments to run the test differently, or are you thinking about having
>>> the test invoke itself with different arguments or something? I don't
>>> think any option other than the first will work.
>>
>> The former. I'm envisioning changing the exec utility (which runs
>> the tests) to look for some "command" file corresponding to each
>> test from which it would figure out how many times to run it and
>> with what options. If the file didn't exist, exec would run the
>> test the same way it does now.
>>
>
> This seems quite a bit more complicated than what was originally proposed.
> It involves changes to the test infrastructure in addition to modifications
> to the tests themselves, and probably the build result reporting code.

Definitely the first one. I was imagining a file similar to xfail.txt
with the name of the test(s) in the first column and the command line
options to add/remove/change in the rest. E.g.,

  22.locale.messages.mt.cpp CXXOPTS=-foo \
                            LDFLAGS+=-lbar \
                            RUNOPTS+=--nthreads="$((NCPUS*2))"

We'd have to arrange for the NCPUS environment variable to be defined.
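As a rough illustration of how exec might consume such a file, here is
a minimal sketch. The file layout, the TestCmd type and the
read_cmd_file() function are assumptions made up for illustration;
nothing like this exists in the exec utility today:

  // minimal sketch: read one entry per line of the form
  //   <test-name> <VAR=value> <VAR+=value> ...
  // (continuation lines ending in '\' assumed to be joined already)
  #include <fstream>
  #include <sstream>
  #include <string>
  #include <vector>

  struct TestCmd {
      std::string              target;   // e.g., "22.locale.messages.mt.cpp"
      std::vector<std::string> opts;     // e.g., "CXXOPTS=-foo"
  };

  static std::vector<TestCmd>
  read_cmd_file (const char *fname)
  {
      std::vector<TestCmd> cmds;
      std::ifstream        in (fname);
      std::string          line;

      while (std::getline (in, line)) {
          std::istringstream words (line);
          TestCmd            cmd;
          if (words >> cmd.target) {            // first column: test name
              for (std::string opt; words >> opt; )
                  cmd.opts.push_back (opt);     // remaining columns: options
              cmds.push_back (cmd);
          }
      }
      return cmds;   // empty if the file doesn't exist: run the test as today
  }

exec would then run each matching test once per entry, applying the
CXXOPTS, LDFLAGS, and RUNOPTS adjustments to the compile, link, and run
steps, respectively.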
>
>
> Martin Sebor wrote:
>> This approach could even be extended to compiling and linking
>> certain tests with special options, or to implement negative
>> testing, i.e., to exercise the ability of the library to reject
>> invalid programs.
>>
>
> I think that being able to do these types of testing might be useful.
> Unfortunately I see this as being a very complicated way to do what we claim
> we want.

It wouldn't be trivial...

>
>
> Martin Sebor wrote:
>> It would even let us deal with the increasing
>> disk space problem by compiling and linking each test on demand
>> just before running it and then immediately deleting it to free
>> up disk space.
>>
>
> It seems that we could already do this with the existing makefile. The
> run_all rule would just have to be modified to build the executable before
> it is run and clean it up afterward. That doesn't seem like it would be
> terribly difficult.

There also used to be (and maybe still is) the ability to create .sh
files containing the commands to compile and link each test (and
example) instead of compiling them first, and to have exec run the .sh
files instead of the actual programs.

>
> I've gone back to read through the previous messages in this thread and I
> don't understand why we cannot just use a single timeout per
> rw_thread_pool() call. If a test has 3 sections (as most of them do), the
> timeout would apply to each section independently. We could set the default
> soft limit to 30 seconds or so, and then such a test would finish in
> approximately 90 seconds.
>
> I think that is what you originally intended, but I got off track at some
> point and made everything all confusing.

I suppose we could do it that way. The only (small) problem I see with
this approach is that the more sections a test has, the less time it
can spend exercising each section, and parts of the library exercised
by tests with a large number of sections would be exercised less
thoroughly than those with fewer of them. But that's not necessarily
an objection to this approach but rather an observation.

Btw., WRT the efficiency of our thread safety tests, it seems that
specifying a timeout on the threads will have the opposite effect from
specifying the number of iterations for each of them to perform,
because the tests will be using progressively more and more CPU cycles
as processors get faster. I wonder if, in addition to the number of
CPUs on each server, the tests might need to take into account their
clock speed. Of course, that wouldn't be enough either, because of the
difference in the number of instructions the same piece of code
translates to on various architectures. It would be nice to be able to
have a single number describing the desired efficacy of each test,
independent of the platform, and have the test driver or harness
compute the timeout and/or the number of iterations based on each
machine's parameters...

Martin
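A rough sketch of the kind of per-machine scaling the driver or harness
could do for that last idea. The "work units" knob, the function names,
and the MHz-based formula below are purely assumptions made up for
illustration, not existing test driver code:

  #include <cstdlib>   // for getenv(), atol()

  // derive a per-thread iteration count from a platform-independent
  // "work units" knob and the machine's clock speed; the number of
  // threads would separately be derived from NCPUS, as in the
  // --nthreads example above
  static long
  scaled_iterations (long work_units, long cpu_mhz)
  {
      // reference machine assumed to run at 1000 MHz: a faster clock
      // gets proportionally more iterations, so the test spends roughly
      // the same real time per section regardless of the hardware
      return work_units * cpu_mhz / 1000;
  }

  int main ()
  {
      const char* const e  = std::getenv ("NCPUS");
      const long nthreads  = 2 * (e ? std::atol (e) : 1);
      const long cpu_mhz   = 2000;   // would come from the harness/config

      // a test asking for 10,000 work units per section
      const long iters = scaled_iterations (10000, cpu_mhz);

      // a real test would use nthreads and iters when setting up its
      // thread pool for each section
      return 0 < nthreads && 0 < iters ? 0 : 1;
  }

As the message notes, clock speed alone wouldn't account for
differences in instruction counts across architectures, so any such
formula would only be a first approximation.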