continuum-dev mailing list archives

From Wendy Smoak <wsm...@gmail.com>
Subject What should happen when a distributed agent dies?
Date Mon, 28 Sep 2009 17:03:47 GMT
I've been working with Distributed Builds lately, and I've found that
it works when everything is perfect, but when something goes wrong it
copes poorly and doesn't recover.

For example, it's a given that at some point, an agent is going to die
without being properly removed first.

Currently if this happens, the Queues page breaks (error/stack trace)
and you can't edit or delete the offending agent to disable or get rid
of it.

The agent is also still shown as 'enabled' on the Distributed Agents
page even though it's not responding.

What should happen in this case?

I'm all for having the system automatically disable any agent that is
not behaving properly.  At first, the admin may have to manually
re-enable it.  In the future we might come up with a way for it to
auto-recover.
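
To make the auto-disable idea concrete, here is a rough sketch. It is
written in Python only for brevity (Continuum itself is Java), and every
name in it (Agent, AgentRegistry, the ping callable) is hypothetical,
not existing Continuum code. It assumes the master has some way to ping
an agent that either returns True or fails.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical agent record; not Continuum's actual data model."""
    url: str
    enabled: bool = True
    failures: int = 0


class AgentRegistry:
    """Tracks agents and disables any that stop responding."""

    def __init__(self, ping: Callable[[str], bool], max_failures: int = 1):
        # `ping` is an assumed health check: True if the agent answers.
        self._ping = ping
        self._max_failures = max_failures
        self._agents: Dict[str, Agent] = {}

    def add(self, url: str) -> None:
        self._agents[url] = Agent(url)

    def is_enabled(self, url: str) -> bool:
        return self._agents[url].enabled

    def check_all(self) -> List[str]:
        """Ping each enabled agent; disable and return the ones that fail."""
        newly_disabled = []
        for agent in self._agents.values():
            if not agent.enabled:
                continue
            try:
                alive = self._ping(agent.url)
            except Exception:
                # A dead agent must not break the caller (e.g. a status page).
                alive = False
            if alive:
                agent.failures = 0
                continue
            agent.failures += 1
            if agent.failures >= self._max_failures:
                agent.enabled = False
                newly_disabled.append(agent.url)
        return newly_disabled

    def enable(self, url: str) -> None:
        """Manual re-enable by the admin; resets the failure count."""
        agent = self._agents[url]
        agent.enabled = True
        agent.failures = 0

    def remove(self, url: str) -> None:
        """Removal never contacts the agent, so it works for dead agents too."""
        self._agents.pop(url, None)
```

The two points that matter here are that a failed ping is caught rather
than propagated, so a page listing agents would not break, and that
enable/remove only touch local state, so a dead agent can still be
edited or deleted.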

Thoughts?

-- 
Wendy
