www-infrastructure-dev mailing list archives

From Christofer Dutz <christofer.d...@c-ware.de>
Subject Tool proposal for helping run and monitor the ASF Infra Services
Date Thu, 18 Aug 2016 09:13:56 GMT

I have been on the Infra Hipchat for a few weeks now while trying to migrate the Flex project
to Maven and back to the ASF Infra build system. Thanks for your support in this and even
more thanks for the trust in granting me access and Admin rights on the windows1 build agent.

In the chat I observed the daily work of you guys, having to maintain quite a zoo of all sorts
of different systems on different platforms. Some problems you were having seem quite easy
to track down ... if the hard disk is full, you clean up. But not all problems are that easy
to track down. Thinking of the problems with repository.apache.org ... here the cause was
the proxy being flooded with connections (I think this was the case) ... regular restarts
of this helped temporarily, but I don't think that helps in the long term as no one had an
idea why those connections were hanging there in the first place.

A few years ago the company I work for - codecentric - founded a company called Instana.
They are developing an agent-based system for monitoring IT infrastructure. In contrast to
most established solutions, they use machine learning strategies to analyze the root cause
of problems. While you can probably achieve similar results with normal tools, the problem
is that you need very detailed domain knowledge to do so, and in a regularly changing environment
you need to keep adjusting your metrics continuously. Instana does this automatically. I think
you can imagine how tricky it is to trace the root cause of bad response times through a
network of interconnected services.

Investing almost all of my free time (and a lot of my paid time) for Apache, noticing a lot
of the problems you have to deal with every day, I asked Instana if they would be willing
to provide their service to the ASF for free and they agreed and immediately set up a dedicated

I wanted to try the thing out as I would prefer to grab a few beers with you at ApacheCon in
Seville and not get punched in the face for recommending something bad ;-) ... so I tried
this on my private server playground. I unpacked and started the agent and the host appeared
on the web console and reported the problems it was having (ones I didn't even know about)
as well as other systems it communicates with ... as soon as I added agents on these machines
the analytics started doing their work across systems and built up a map view of my services
and their correlation. So it's really a system that needs almost no configuration at all :-)

I uploaded the internal product presentation here: https://public.centerdevice.de/1a9dc4ed-515e-482e-9fd6-6d60a5562598
(please don't share this outside of the ASF)

Please use the password: 4p4cheR0cks (I'll remove that document in about two weeks)

By the way ... the screenshots in the presentation are real ... I was amazed to see a 3D
web UI in production for the first time ;-)

So if there is any interest in this offer, I would be more than happy to provide credentials
to you and assist you in getting started, so you could easily try it out. The guys at Instana
would also be delighted to give you guys an online demo and answer any questions you might
have. Feel free to contact Mirko directly for this: mirko.novakovic@codecentric.de

