river-dev mailing list archives

From Gregg Wonderly <gregg...@gmail.com>
Subject Re: A new implementation of TaskManager
Date Wed, 07 Jul 2010 02:26:39 GMT

On Jul 6, 2010, at 6:03 PM, Patricia Shanahan wrote:

> Gregg Wonderly wrote:
>> Patricia Shanahan wrote:
> ...
>>> If one of the River developers has an intranet test environment it
>>> may be possible to simulate the effect of running over the Internet
>>> by a similar trick. Create some workload that keeps the network very
>>> busy, and run it in parallel with a quality assurance test.
>>> In some cases it may not matter which of two transactions is done
>>> first, but it is important to make sure there is a consistent order
>>> between them.
>> More recently, one of my most favorite test environments is to bring
>> up open solaris on an i7 processor based machine with some reasonable
>> amount of memory (8GB or more) and then put 8 or more instances of
>> linux on it all running the same build, and then test there with
>> appropriate loading.  You'll get latency injection because of machine
>> resource contention, but you'll also get 8, independent OS and Java VM
>> layers that will be readily able to provide just about any
>> unexplainable behavior you need to test with :-)
> Sounds nice and chaotic. When I have a new TaskManager and related
> changes working on my system, I'll ask you to take it for a spin.
> One problem I don't think that would reproduce is the ambiguity
> between a transaction taking a very long time because of load, and a
> transaction that is not going to complete because a server that was
> working on it has crashed. That issue always gives me headaches.

I deal with this issue as well.  It would be nice if there were a more "instant" way to
flag transaction participants as "lost", so that hung transactions could be cancelled sooner.

I have an application that has more than 20 participants on 6 servers and if one of those
doesn't want to play, it can take a while to discover just who the problem participant is
for debugging etc.
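
For illustration only, here is a minimal sketch of one way to narrow down which participant is not responding: probe all of them in parallel and give each probe the same deadline. Nothing here is River or Jini API. The class ParticipantProbe, the method findUnresponsive, and the idea that each participant exposes some cheap "are you alive" call wrapped in a Callable<Boolean> are all assumptions made for the example. Note that it does not resolve the ambiguity discussed above: a probe that times out may belong to a crashed server or merely to a heavily loaded one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of River: finds participants whose
// liveness probe fails or does not answer before a shared deadline.
final class ParticipantProbe {

    static List<String> findUnresponsive(Map<String, Callable<Boolean>> probes,
                                         long timeoutMillis)
            throws InterruptedException {
        List<String> suspects = new ArrayList<>();
        if (probes.isEmpty()) {
            return suspects;
        }

        // Keep names and tasks in matching order.
        List<String> names = new ArrayList<>();
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (Map.Entry<String, Callable<Boolean>> e : probes.entrySet()) {
            names.add(e.getKey());
            tasks.add(e.getValue());
        }

        // One thread per probe so a hung participant cannot delay the others.
        ExecutorService pool = Executors.newFixedThreadPool(tasks.size());
        try {
            // invokeAll returns when every probe has finished or the timeout
            // expires; probes still running at that point are cancelled.
            List<Future<Boolean>> results =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);

            for (int i = 0; i < results.size(); i++) {
                Future<Boolean> f = results.get(i);
                boolean alive;
                try {
                    alive = !f.isCancelled() && Boolean.TRUE.equals(f.get());
                } catch (ExecutionException e) {
                    alive = false; // the probe itself threw
                }
                if (!alive) {
                    suspects.add(names.get(i));
                }
            }
        } finally {
            pool.shutdownNow(); // interrupt any probe that is still blocked
        }
        return suspects;
    }
}
```

With 20 participants on 6 servers, the wall-clock cost of a sweep like this is bounded by the single timeout rather than by 20 timeouts in sequence, which is the main reason for probing in parallel.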

Gregg Wonderly