incubator-general mailing list archives

From jan i <j...@apache.org>
Subject Re: Seeking interest and a champion for bifroest - a backend for graphite-web, on Apache Cassandra
Date Tue, 07 Oct 2014 10:07:06 GMT
Hi

Bifroest sounds like a very interesting project and within my field of
experience. I worked for about three quarters of a year on implementing
Circonus at the ASF (it was then decided, for good reasons, not to use it
for alerting), and before that I designed SCADA systems to monitor/control
electrical grids.

Today I live in southern Spain (I am Danish), so the time zone fits nicely.

I volunteer to be your champion, if the project wants it, but I suggest we
exchange some mails off-list to check your wishes against my possibilities.

rgds
jan I.




On 7 October 2014 10:59, Harald Kraemer <hkraemer@goodgamestudios.com>
wrote:

> Hi,
>
> we have been allowed to open-source one of our company-internal projects,
> currently called Bifroest. Bifroest is a storage backend for graphite-web,
> based on Apache Cassandra. I'm quite happy about this, and now I'm in the
> process of finding the best options and means to do so. This mail isn't a
> full proposal yet, but I will try to stick at least to the major points of
> a proposal.
>
> What does Bifroest do, and where does it come from?
>
> At Goodgame Studios, we used Munin for most of our monitoring, with a lot
> of custom plugins for our servers, pushing around 500-700 hosts.
> That's ambitious with Munin, and by now the munin-master is no longer able
> to take the stress.
> As such, we started to evaluate Graphite, since Graphite is the
> state-of-the-art solution for larger-scale monitoring. To start, we
> deployed Graphite with a Carbon backend on a virtual machine. Our senior
> monitoring admin (whom we didn't have back then) would probably just have
> giggled a bit: things didn't perform that well on a virtual machine. It
> could handle the important data, but the system didn't seem to scale.
> An admin would naturally have tossed hardware at this, SSD RAIDs and all
> that. But we are software engineers, not admins, so we tossed software at
> it (until we required hardware) :)
>
> Our intention was to have Graphite with its data stored in a distributed
> database. A distributed database would scale both in storage space and in
> the load the system can deal with, and it's all behind a well-defined
> interface. That seemed like a nifty property for a scalable monitoring
> system.
> Hence, we tried Cyanide, since Cyanide was just that. We tossed a lot of
> data into Apache Cassandra, clicked on the metric tree and... well, nothing
> happened, since Cyanide figured that a "select *" across several hundred
> thousand rows was a grand idea. After that, we looked at InfluxDB, but at
> the time we started developing this, InfluxDB didn't support data
> aggregation and seemed to be at a very, very early stage of development.
>
> Thus, the first thought of Bifroest was born: why don't we take the good
> parts of Cyanide, a solid distributed database such as Apache Cassandra,
> and the good parts of Carbon, and toss them all into a big stew?
>
> That's what we did, and that's what we are currently deploying as our
> production monitoring system: Graphite on Bifroest as a frontend for
> Apache Cassandra.
>
> Fun features of this system include:
>  - Existing Graphite and most Carbon APIs:
>  -- Full support of the Graphite REST API, since we are just a backend.
>  -- Support for Carbon's plaintext protocol.
>  -- Planned: an AMQP interface to handle globally distributed networks.
>  - Neat things which Graphite could do as well:
>  -- A fast key cache.
>  -- A fast value cache, fed by the data collection, to hit the database as
> little as possible.
>  - New things Graphite + Carbon + Whisper cannot do:
>  -- On-the-fly adjustable retention levels. You don't have the space to
> keep 6 weeks of 1m data? Just reduce it. Or increase it. Our system can do
> that on the fly.
>  -- Currently in progress: on-the-fly addition of new retention levels.
> Have an emergency and need data at greater resolution? Just add a retention
> level with 1 datapoint / 5s, keep the full data history, tell your data
> collection to collect more data, and delete the level again later without
> losing data.
>  -- High fault tolerance. We rely on Cassandra for persistent storage, and
> a properly deployed Cassandra cluster with redundancy just doesn't care.
> Add a new machine, tell everything to rebuild the cluster, and the frontend
> won't even notice the outage.
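>
> For readers unfamiliar with Carbon's plaintext protocol mentioned above:
> it is simply one datapoint per line, "<metric.path> <value>
> <unix-timestamp>", sent over TCP. A minimal Python sketch (the metric name
> is invented, and 2003 is Carbon's usual plaintext port; any backend
> speaking the protocol, such as Bifroest, should accept the same lines):

```python
import socket
import time

def format_metric(path, value, timestamp):
    # Carbon's plaintext protocol: one datapoint per line,
    # "<metric.path> <value> <unix-timestamp>\n"
    return f"{path} {value} {timestamp}\n"

def send_metric(path, value, host="localhost", port=2003):
    # host/port are placeholders; 2003 is the conventional
    # Carbon plaintext port.
    line = format_metric(path, value, int(time.time()))
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))

# e.g. send_metric("servers.game01.load", 0.42)
```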
>
> So, after this wall of text, I have two questions:
>
> a) Is this project interesting enough for everyone? :)
> b) Are there people who would volunteer to coach me and my team through the
> proposal and the incubator?
>
> Regards,
> Harald.
> --
>
> *Harald Krämer*
> Server Developer (Profiling first)
> *hkraemer@goodgamestudios.com <hkraemer@goodgamestudios.com>*
>
> Goodgame Studios
> Theodorstr. 42-90, House 9
> 22761 Hamburg, Germany
> Phone: +49 (0)40 219 880 -0
> *www.goodgamestudios.com <http://www.goodgamestudios.com>*
>
> Goodgame Studios is a branch of Altigi GmbH
> Altigi GmbH, District court Hamburg, HRB 99869
> Board of directors: Dr. Kai Wawrzinek, Dr. Christian Wawrzinek, Fabian
> Ritter
>
