brooklyn-dev mailing list archives

From Alex Heneveld
Subject Re: brooklyn server maintenance mode
Date Wed, 30 Jul 2014 14:00:26 GMT


I'm not yet sure what the best way to support HA state fixes is.  In the 
first instance it would probably just display the failed items and offer 
options to "rebind ignoring failures" and "try rebind again now" (where 
fixes are made out-of-band, directly in the persistent store [1]).
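
As a rough illustration of those two options, here is a minimal sketch. 
It is not Brooklyn code: the class RebindSketch, the Mode enum, the 
Result holder and the rebind method are all hypothetical names, and 
"loading" a persisted item is reduced to a predicate over its id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch only; none of these names come from Brooklyn itself.
public class RebindSketch {

    // Assumed server modes: ACTIVE serves normally, MAINTENANCE waits for an operator.
    enum Mode { ACTIVE, MAINTENANCE }

    // Outcome of one rebind attempt: resulting mode, items loaded, items that failed.
    static final class Result {
        final Mode mode;
        final List<String> loaded;
        final Map<String, String> failed; // item id -> reason, in encounter order

        Result(Mode mode, List<String> loaded, Map<String, String> failed) {
            this.mode = mode;
            this.loaded = loaded;
            this.failed = failed;
        }
    }

    // Attempts to rebind every item. 'loads' stands in for reading an item
    // from the persistent store. With ignoreFailures=false, any failure
    // leaves the server in MAINTENANCE; with ignoreFailures=true, the
    // failures are still recorded but the server becomes ACTIVE anyway.
    static Result rebind(List<String> itemIds, Predicate<String> loads, boolean ignoreFailures) {
        List<String> loaded = new ArrayList<>();
        Map<String, String> failed = new LinkedHashMap<>();
        for (String id : itemIds) {
            if (loads.test(id)) {
                loaded.add(id);
            } else {
                failed.put(id, "could not be rebound");
            }
        }
        Mode mode = (failed.isEmpty() || ignoreFailures) ? Mode.ACTIVE : Mode.MAINTENANCE;
        return new Result(mode, loaded, failed);
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("app-1", "entity-2", "location-3");
        Set<String> broken = new HashSet<>(Arrays.asList("entity-2"));
        Predicate<String> loads = id -> !broken.contains(id);

        // Normal startup: the failure is displayed and the server stays in maintenance.
        Result first = rebind(items, loads, false);
        System.out.println(first.mode + " failed=" + first.failed.keySet());

        // Option 1: "rebind ignoring failures".
        Result ignoring = rebind(items, loads, true);
        System.out.println(ignoring.mode + " failed=" + ignoring.failed.keySet());

        // Option 2: fix the state out-of-band, then "try rebind again now".
        broken.remove("entity-2");
        Result retry = rebind(items, loads, false);
        System.out.println(retry.mode + " failed=" + retry.failed.keySet());
    }
}
```

The point of the sketch is only that both options reuse the same rebind 
pass: ignoring failures changes how the result is interpreted, while 
retrying reruns the pass after the store has been fixed.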


[1]  Cyberduck is a great tool for working with object stores.  It 
doesn't allow in-place viewing or editing, but it does make bulk transfer 
easy.  Note, however, that for SoftLayer there is a bug in the latest 
version, v4.5, so use v4.4.5.

On 30/07/2014 09:29, Aled Sage wrote:
> +1; all good suggestions.
> For "HA state could be edited...", are you thinking that the rebind 
> would pause at the error, allowing the state to be modified and rebind 
> continue? Or more that one could look at the task errors, then 
> download+fix the entity state, and then run rebind from the start again?
> Aled
> On 30/07/2014 05:45, Alex Heneveld wrote:
>> Hi folks-
>> As many of you know, when running Brooklyn, if rebind fails, the server 
>> responds safety-first by failing or declining to start. You then 
>> trawl through the logs, investigate the persisted state, resolve the 
>> issues, and restart it.  Ideally this would be visible and resolvable 
>> within the server itself.  To this end I'm thinking of:
>> * a new "maintenance mode" that the server would run in if there are 
>> problems in startup or failover
>> * when in maintenance mode (or even HA standby mode), you are 
>> presented with a warning in the GUI but you can set an http session 
>> flag to allow access (if you have the entitlement)
>> * a new "server" tab where server-level tasks are tracked (and other 
>> server operations such as shutdown, force failover, etc, could be 
>> sensibly re-housed)
>> * all startup activities and HA activities are run as server-level 
>> tasks and visible in the server tab (which would allow you to see the 
>> reason for HA failures)
>> * HA state could be edited in the server tab, when a server is in 
>> maintenance mode, to resolve problems and drive rebind
>> Is this a good evolution?
>> Best
>> Alex
