ignite-dev mailing list archives

From Denis Magda <dma...@apache.org>
Subject Ignite logs adoption for enterprise grade monitoring tools
Date Wed, 10 Jan 2018 00:49:53 GMT

As a preface, Alexey Kukushkin laid out an insightful and profound explanation of what’s
wrong with Ignite logs from a DevOps perspective, how the community can easily close the
gaps, and how our efforts will pay off if we take his advice into consideration:

In short, Ignite log events (errors, warnings and non-severe messages) are not assigned unique
IDs.

Why does a mature project like Ignite need them?

First, to have a human-friendly glossary of error messages and warnings (see the MySQL [1] and
MongoDB [2] examples) that simplifies troubleshooting and debugging on the dev side. Actually,
we planned to do this back in 2016! [3]

Second, it turns out that popular DevOps monitoring tools such as DynaTrace [4] and Nagios
[5] can easily analyze the IDs of log events and help automate their processing or trigger
notifications. For instance, if the “node left” log message were labeled with an ID, then
DynaTrace could detect that event and, by looking at overall memory usage (JMX), decide what
to do next: just send an email to an admin or add a new node to the cluster.
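
To make the “node left” scenario concrete, here is a rough sketch of what the monitoring
side could do once log lines carry IDs. Everything specific in it is an assumption made for
illustration: the "IGN-<number>" tag format, the ID 1001 for “node left”, the 0.8 memory
threshold and the function names. None of this is an existing Ignite, DynaTrace or Nagios API,
and a real tool would read memory usage over JMX rather than take it as an argument.

```python
import re
from typing import Optional

# Hypothetical log format: an "IGN-<number>" tag somewhere in the line.
EVENT_ID_PATTERN = re.compile(r"\bIGN-(\d+)\b")

# Hypothetical ID for the "node left" event; the proposal does not assign one yet.
NODE_LEFT_ID = 1001


def extract_event_id(line: str) -> Optional[int]:
    """Return the numeric event ID found in a log line, or None if there is none."""
    match = EVENT_ID_PATTERN.search(line)
    return int(match.group(1)) if match else None


def decide_action(line: str, memory_usage: float, threshold: float = 0.8) -> str:
    """Pick a reaction to a log line.

    memory_usage is a fraction between 0 and 1 (assumed to come from JMX).
    The threshold is an arbitrary illustrative value.
    """
    if extract_event_id(line) != NODE_LEFT_ID:
        return "ignore"
    # Under memory pressure, losing a node is serious enough to scale out;
    # otherwise a notification to the admin is sufficient.
    return "add_node" if memory_usage >= threshold else "email_admin"
```

The point is that the rule keys on a stable ID rather than on the wording of the message,
so rephrasing the log text would not break the automation.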

My proposal is to start putting the glossary together, making Ignite ready for enterprise-grade
monitoring systems and DevOps!

As a first step, let’s define the subsystems of Ignite and distribute ID ranges among them:
- networking (discovery, communication) - 1000 - 3000
- memory and persistence - 4000 - 6000
- key-value, caching - 7000 - 9000
- SQL - 10000 - 11000
- etc.
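
The ranges above could be captured in a small lookup table, sketched here under a few
assumptions: both bounds of each range are treated as inclusive, IDs that fall between the
listed ranges are treated as unassigned, and the names SUBSYSTEM_RANGES and subsystem_for are
made up for this example.

```python
from typing import Optional

# ID ranges as listed in the proposal; inclusive bounds are an assumption.
SUBSYSTEM_RANGES = [
    ("networking", 1000, 3000),
    ("memory and persistence", 4000, 6000),
    ("key-value, caching", 7000, 9000),
    ("SQL", 10000, 11000),
]


def subsystem_for(event_id: int) -> Optional[str]:
    """Return the subsystem that owns event_id, or None if the ID is unassigned."""
    for name, low, high in SUBSYSTEM_RANGES:
        if low <= event_id <= high:
            return name
    return None
```

With such a table, a glossary generator or a monitoring rule can route an event to the right
subsystem from the ID alone.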

Is everyone OK with this format and the overall endeavor?

[1] https://dev.mysql.com/doc/refman/5.5/en/error-messages-server.html
[2] https://github.com/mongodb/mongo/blob/master/src/mongo/base/error_codes.err
[3] https://issues.apache.org/jira/browse/IGNITE-3690
[4] https://www.dynatrace.com/capabilities/log-analytics/
[5] https://www.nagios.com/solutions/log-monitoring/