I was wondering how you guys handle a large cluster (50+ machines).

I mean, sometimes you need to change the configuration (cassandra.yaml) or send a command to one, some, or all nodes (cleanup, upgradesstables, setstreamthroughput, or whatever).

So far we have been using custom scripts for repairs and other routine maintenance, and cssh for specific, one-shot actions on the cluster. But I guess this doesn't really scale; I guess we could use pssh instead. For configuration changes we use Capistrano, which might scale properly.
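To make the question concrete, here is a rough sketch of the kind of script I mean: it fans a nodetool command out to a list of nodes over ssh, a few at a time. This is only an illustration, not what we actually run. The function names, the `max_parallel` default, and the host names in the usage comment are all made up, and it assumes key-based ssh access and nodetool on the remote PATH.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_command(host, nodetool_args):
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, "nodetool", *nodetool_args]


def run_one(host, nodetool_args):
    # Run nodetool on a single node and collect its exit code and output.
    proc = subprocess.run(
        build_command(host, nodetool_args),
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def run_on_hosts(hosts, nodetool_args, max_parallel=5, runner=run_one):
    # Limit concurrency so heavy operations (cleanup, upgradesstables)
    # do not hit every node in the cluster at the same time.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {h: pool.submit(runner, h, nodetool_args) for h in hosts}
        return {h: f.result() for h, f in futures.items()}


# Hypothetical usage (host names are placeholders):
# results = run_on_hosts(["cass-01", "cass-02"], ["cleanup"], max_parallel=2)
# failed = [h for h, (code, _out, _err) in results.items() if code != 0]
```

The `runner` parameter is there only so the fan-out logic can be exercised without real ssh access; pssh does essentially the same job out of the box.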

So I would like to know: what methods do operators use on large clusters out there? Have some of you built open source "cluster management" interfaces or scripts that make it easier to operate large Cassandra clusters?