chukwa-dev mailing list archives

From Jerome Boulon <jbou...@gmail.com>
Subject Re: What constitute a successful project?
Date Fri, 30 Nov 2012 09:04:17 GMT
Hi Eric,
Sorry to interfere at this point, but I cannot let you use my name and
Netflix together for Chukwa.
I designed Chukwa and I'm the main architect behind Chukwa, correct.

However Netflix is NOT running CHUKWA but HONU.
Honu is a stream-based data collection system that runs at scale at Netflix
and other places.
I designed Honu when I was at Netflix, and Honu does not use CHUKWA code
anymore.
The Honu code is a complete rewrite done by me and only me, and that's the
reason why Honu scales
to more than 60 billion events/day.
People are still using the name Chukwa because it was the name I used for
my first presentation.
I changed the name to Honu when I started the complete rewrite, and you
are aware of that.

I'm the architect of both, so there are some similarities, but
Chukwa will never be Honu and I cannot let people think that they are the same.

I'll ask Kurt to update his presentation and to use the correct name: HONU
and not CHUKWA.

You can read more about Honu:
Here: http://www.slideshare.net/jboulon/hadoop-summit-2010-honu
or here:
http://www.slideshare.net/jboulon/cloud-connect-2012-big-data-netflix

Sorry, Eric, and next time you use my work, please verify your sources or I'll
have to take
a more active role.

/Jerome Boulon
jboulon@apache.org


On Thu, Nov 29, 2012 at 10:39 PM, Eric Yang <eric818@gmail.com> wrote:

> Hi Jason,
>
> IBM is using the Chukwa agent as the base of the monitoring component for
> BigInsights.  The monitoring system shares the same design principles, but
> has been custom built for BigInsights.  We wrote some generic adaptors to
> collect data from SNMP, JMX, and REST, which we are currently seeking
> approval from IBM to contribute back to open source.   BigInsights is IBM's
> distribution of Apache Hadoop.  We use it to monitor Hadoop and related
> technologies, and Chukwa is reliable and works well for us.
>
> Because raw time series metrics and logs can be used to correlate events
> together, the Chukwa approach is definitely better than plain Ganglia and
> Nagios.  With the Nagios and Ganglia combination, you only get facts after
> irreversible events have happened, such as the jobtracker no longer
> responding or an HBase region server dying.  With raw data collected and
> analyzed, we can prevent irreversible events from happening.  For example,
> a problematic job can be terminated before it grows out of control.
>
> Netflix has a number of presentations talking about how they use Chukwa to
> stream data to EC2.  The most recent presentation is here:
>
> http://cdn.oreillystatic.com/en/assets/1/event/85/Netflix_s%20Evolving%20Data%20Science%20Architecture%20Presentation.pdf
>
> regards,
> Eric
>
> On Thu, Nov 29, 2012 at 5:54 AM, Dai, Jason <jason.dai@intel.com> wrote:
>
> > Eric and the team,
> >
> > First, let me provide a little background about us. We at Intel have been
> > using Chukwa for building HiTune (a Hadoop performance analyzer,
> > https://github.com/intel-hadoop/hitune), and one of our key team members,
> > Jie Huang, was recently accepted as a Chukwa committer (unfortunately she
> > has been out sick since late September and has not been as active in the
> > Chukwa community as we would like).
> >
> > IMO, a key question for the Chukwa project is how to grow the
> > community, and I believe an active developer community is driven by
> > active users.  It is unclear to me at this moment who is using Chukwa in
> > their daily work, what it is being used for, and how it can play an
> > important role in its target domain. I would suggest that people on the
> > list share their usage as the first step - How are you using Chukwa? Do
> > you think Chukwa is a good solution that can attract new users for that
> > specific problem?
> >
> > As a starter, I'll share our usage:
> > 1)      We have been using Chukwa to collect and aggregate performance
> > metrics from Hadoop clusters, so that our tool HiTune can analyze the
> > performance of Hadoop applications.
> > 2)      And as we outlined in CHUKWA-665, we have a prototype that uses
> > Chukwa to collect and aggregate cluster system metrics, which powers the
> > Ganglia web frontend for cluster monitoring.
> >
> > IMHO, at this moment Flume is winning mindshare for distributed data
> > collection (e.g., ETL), and Ganglia & Nagios are the cluster monitoring
> > tools of choice; I wonder what your takes are on how Chukwa can
> > differentiate itself in these domains, or maybe there are some other
> > domains Chukwa is good at.
> >
> > Thanks,
> > -Jason
> >
> > -----Original Message-----
> > From: Eric Yang [mailto:eric818@gmail.com]
> > Sent: Mon, 26 Nov 2012 03:33:12 GMT
> > Subject: What constitute a successful project?
> >
> > Hi IPMC,
> >
> > For the past two years, Chukwa has been labelled as a non-active project by
> > mentors, and has been put up for votes on retiring the project by mentors
> > and the IPMC.
> > In this year's stats, Chukwa has more activity in comparison to Apache
> > Wink in both mailing list traffic and resolved jiras.  Yet Chukwa has been
> > voted to be discontinued by mentors, but Wink was voted to graduate by the
> > same mentor. Here are the numbers of mails that showed up in the dev lists
> > of Apache Chukwa and Apache Wink:
> >
> > ...
> >
> >
> >
>



-- 
/Jerome
