hadoop-user mailing list archives

From Mark Bonetti <mark.bonetti.sc...@gmail.com>
Subject Which metrics to alert on (seeking expert advice)?
Date Mon, 09 Apr 2018 20:43:53 GMT
I'm building a monitoring system for Hadoop and want to set up default
alerts (threshold or anomaly) on 2-3 key metrics everyone who uses Hadoop
would typically want to alert on, but I don't yet have production-grade
experience with Hadoop.

The alert rules have to be generally useful, so they can't be based on
metrics whose values vary wildly with the size of the deployment.

In other words, which metrics would be most significant indicators that
something went wrong with your Hadoop cluster?

Thanks very much,
