incubator-chukwa-user mailing list archives

From Akshay Kumar <kumar.aks...@gmail.com>
Subject Re: Using Chukwa as monitoring tool.
Date Thu, 30 Dec 2010 10:25:21 GMT
Thanks so much Eric.
I will take some time to grasp all this and try out stuff. Will definitely
get back as and when I have some feedback to give.
Regards,
Akshay

On 29 December 2010 09:03, Eric Yang <eric818@gmail.com> wrote:

> Hi Akshay,
>
> In both options, data down sampling is required.  RRDtool performs
> down sampling when the data is written to the RRD files.  Chukwa
> 0.4 uses MySQL for down sampling.  The graph is then rendered
> using the flot (http://code.google.com/p/flot/) graphing library to
> serve the data.  There was also a prototype to render graphs on the
> server side with JFreeChart.  However, there was no clear interface to
> expose graph-able data.
>
> In Chukwa 0.5, we are decoupling the data from the graphing library.
> There is a REST API interface to get metrics data.  (See
> https://issues.apache.org/jira/browse/CHUKWA-520)  However, Chukwa 0.5
> is still under development; the data down sampling has shifted from
> SQL statements to MapReduce/Pig Latin scripts.  I have not determined
> what will be in the final framework.  It will most likely use Oozie
> as the workflow scheduling engine to run MapReduce/Pig Latin jobs that
> provide the down sampling and aggregation framework.
>
> You are welcome to try out the code from trunk (0.5).  The current
> limitations are that you should avoid querying large time ranges and
> there is no aggregation yet.  Hope this helps.
>
> regards,
> Eric
>
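For readers skimming the thread: "down sampling" here just means collapsing
raw samples into coarser time buckets (for example, per-5-minute averages).
A minimal, self-contained Java sketch of the idea - purely illustrative, not
Chukwa's actual MySQL or MapReduce/Pig Latin pipeline, and the class name is
made up:

    import java.util.Map;
    import java.util.TreeMap;

    /** Illustrative down sampling: average raw (timestampMillis -> value)
     *  samples into fixed-width buckets (e.g. 5 minutes). Not Chukwa code. */
    public class DownSampleSketch {

        /** Returns bucketStartMillis -> average of all samples in that bucket. */
        public static Map<Long, Double> average(Map<Long, Double> raw, long bucketMillis) {
            Map<Long, Double> sums = new TreeMap<Long, Double>();
            Map<Long, Integer> counts = new TreeMap<Long, Integer>();
            for (Map.Entry<Long, Double> e : raw.entrySet()) {
                // Align each timestamp to the start of its bucket.
                long bucket = (e.getKey() / bucketMillis) * bucketMillis;
                Double sum = sums.get(bucket);
                Integer count = counts.get(bucket);
                sums.put(bucket, sum == null ? e.getValue() : sum + e.getValue());
                counts.put(bucket, count == null ? 1 : count + 1);
            }
            Map<Long, Double> averages = new TreeMap<Long, Double>();
            for (Long bucket : sums.keySet()) {
                averages.put(bucket, sums.get(bucket) / counts.get(bucket));
            }
            return averages;
        }
    }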
> On Tue, Dec 28, 2010 at 1:18 PM, Akshay Kumar <kumar.akshay@gmail.com>
> wrote:
> > Hi,
> > I have GWT as the front-end, where I want to embed this information in one
> > of the following ways:
> > a) Simply embed RRDtool-style generated images. That means I will have to
> > run rrdtool (I am looking at rrd4j) on the server side and convert the data
> > to RRD format on the agent/server side.
> > b) Use some graphing library - like http://dygraphs.com/.
> > I am not expecting too much volume. To start with, simple CPU, memory and
> > Hadoop metrics collected from 20 or so machines at a rate of no more than
> > 10 samples per minute per metric.
> > Thanks,
> > Akshay
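On option (a) above: a rough sketch of pushing one metric into rrd4j, assuming
its usual RrdDef/RrdDb/Sample API. The data source name, step and archive
settings below are illustrative guesses, not a recommended configuration:

    import org.rrd4j.ConsolFun;
    import org.rrd4j.DsType;
    import org.rrd4j.core.RrdDb;
    import org.rrd4j.core.RrdDef;
    import org.rrd4j.core.Sample;

    /** Sketch: write one CPU sample into an RRD file; graphs can then be
     *  produced from it (e.g. with org.rrd4j.graph.RrdGraphDef). */
    public class CpuRrdWriterSketch {
        public static void main(String[] args) throws Exception {
            long step = 6;                                      // ~10 samples per minute
            RrdDef def = new RrdDef("/tmp/cpu.rrd", step);
            def.addDatasource("cpu", DsType.GAUGE, 2 * step, 0, 100);
            def.addArchive(ConsolFun.AVERAGE, 0.5, 1, 14400);   // raw samples, ~1 day
            def.addArchive(ConsolFun.AVERAGE, 0.5, 50, 8640);   // 5-min averages, ~1 month
            RrdDb db = new RrdDb(def);

            double cpuPercent = 42.0;                           // would come from the agent/collector
            Sample sample = db.createSample();
            sample.setTime(System.currentTimeMillis() / 1000);
            sample.setValue("cpu", cpuPercent);
            sample.update();

            db.close();
        }
    }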
> > On 27 December 2010 00:05, Ariel Rabkin <asrabkin@gmail.com> wrote:
> >>
> >> 16 GB isn't a hard limit, just a suggestion. And that's based on the
> >> assumption that you have a big cluster and are collecting a lot of
> >> data and using the older MySQL based infrastructure.
> >>
> >>  How much memory you need depends on what volume of data you're
> >> collecting and what you're doing with it. How do you intend to store
> >> the data and how will you be visualizing it?
> >>
> >>
> >>
> >> --Ari
> >>
> >> On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <kumar.akshay@gmail.com>
> >> wrote:
> >> > Thanks,
> >> > In my setup, I cannot afford (as of now) to have a machine with 16 GB
> >> > of memory.
> >> > So does that mean I cannot deploy Chukwa as a monitoring solution?  I
> >> > do not intend to do any log analysis / collection for now - just
> >> > simple OS and Hadoop metrics.
> >> >
> >> > I mean, I do not understand why 16 GB would be a hard limit even for
> >> > minimal functioning.
> >> > I imagine that should be for a high-performance system, not a
> >> > bare-bones setup. What am I missing here?
> >> >
> >> > -Akshay
> >> >
> >> > On 26 December 2010 23:38, Ariel Rabkin <asrabkin@gmail.com> wrote:
> >> >>
> >> >> Yes.  That 16 GB number is for the HICC server, not for the collection
> >> >> side. And even then, it's if you have a lot of data (a whole cluster's
> >> >> worth) living in a MySQL database with a web application serving the
> >> >> data.
> >> >>
> >> >> The monitoring agent and the collector are both fairly small-footprint.
> >> >>
> >> >> --Ari
> >> >>
> >> >> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <kumar.akshay@gmail.com>
> >> >> wrote:
> >> >> > Hi,
> >> >> > Thanks for the responses. A bit late to check this one.
> >> >> > I have one more query -
> >> >> > In the Chukwa administration guide:
> >> >> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
> >> >> > It says
> >> >> > Chukwa can also be installed on a single node, in which case the
> >> >> > machine
> >> >> > must have at least 16 GB of memory.
> >> >> >
> >> >> > Q) For my use case (for monitoring system metrics) - is it safe to
> >> >> > assume the memory requirement is not going to be that big?
> >> >> >
> >> >> > Thanks,
> >> >> > Akshay
> >> >> >
> >> >> >
> >> >> > On 17 December 2010 10:23, ZHOU Qi <zhouqi.jackson@gmail.com> wrote:
> >> >> >>
> >> >> >> Got it. Thanks.
> >> >> >>
> >> >> >> 2010/12/17 Eric Yang <eyang@yahoo-inc.com>:
> >> >> >> > Sure, here you go.
> >> >> >> >
> >> >> >> > Regards,
> >> >> >> > Eric
> >> >> >> >
> >> >> >> > On 12/16/10 6:21 PM, "ZHOU Qi" <zhouqi.jackson@gmail.com> wrote:
> >> >> >> >
> >> >> >> > Hi Eric,
> >> >> >> >
> >> >> >> > I read the Chukwa wiki, but there is little information about
> >> >> >> > HICC.
> >> >> >> > Where can I find a screenshot or demo of it?
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> > 2010/12/17 Eric Yang <eyang@yahoo-inc.com>:
> >> >> >> >> Hi Akshay,
> >> >> >> >>
> >> >> >> >> A) Yes.  You can use “add sigar.SystemMetrics SystemMetrics
> >> >> >> >> [interval] 0” to stream CPU state at the specified interval.
> >> >> >> >> For example:
> >> >> >> >>
> >> >> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” without quotes
> >> >> >> >> will stream CPU state every 5 seconds.
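As an aside, that "add" command is normally typed into the agent's telnet
control interface; below is a minimal Java sketch of sending it
programmatically. The port used (9093) is an assumption about the agent's
default control port - adjust to match your configuration:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.io.PrintWriter;
    import java.net.Socket;

    /** Sketch: send an "add" command to a local Chukwa agent's control socket.
     *  The port (9093) is assumed to be the default; check your agent config. */
    public class AddSystemMetricsSketch {
        public static void main(String[] args) throws Exception {
            Socket socket = new Socket("localhost", 9093);
            PrintWriter out = new PrintWriter(socket.getOutputStream(), true);
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream()));

            // Stream CPU/system metrics every 5 seconds, starting at offset 0.
            out.println("add sigar.SystemMetrics SystemMetrics 5 0");
            System.out.println(in.readLine());  // agent's acknowledgement line

            socket.close();
        }
    }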
> >> >> >> >>
> >> >> >> >> B) Chukwa has a built-in graphing tool called HICC.  It
> >> >> >> >> requires HBase to be deployed in order to use HICC.
> >> >> >> >>
> >> >> >> >> However, an agent is still required on the client machines.
> >> >> >> >>
> >> >> >> >> Regards,
> >> >> >> >> Eric
> >> >> >> >>
> >> >> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" <kumar.akshay@gmail.com>
> >> >> >> >> wrote:
> >> >> >> >>
> >> >> >> >> Hi,
> >> >> >> >> I have a Hadoop installation, and I want to collect some basic
> >> >> >> >> OS-level metrics like cpu, memory, disk usage, and Hadoop
> >> >> >> >> metrics.
> >> >> >> >>
> >> >> >> >> I have looked into Ganglia, but it requires installing agents
> >> >> >> >> on client machines, which is what I want to avoid.
> >> >> >> >>
> >> >> >> >> My queries:
> >> >> >> >> a) Is this a fair use case for using Chukwa? e.g. polling
> >> >> >> >> client machines for CPU stats a few times per minute?
> >> >> >> >> b) Is it possible to integrate the data collected from Chukwa
> >> >> >> >> collectors in a form readable by rrdtool-style graphing tools
> >> >> >> >> on the server side?
> >> >> >> >>
> >> >> >> >> Thanks,
> >> >> >> >> Akshay
> >> >> >> >>
> >> >> >> >>
> >> >> >> >
> >> >> >> >
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Ari Rabkin asrabkin@gmail.com
> >> >> UC Berkeley Computer Science Department
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Ari Rabkin asrabkin@gmail.com
> >> UC Berkeley Computer Science Department
> >
> >
>
