chukwa-user mailing list archives

From Eric Yang <eric...@gmail.com>
Subject Re: Using Chukwa as monitoring tool.
Date Sun, 26 Dec 2010 20:46:19 GMT
For development, I am running on Mac OS X 10.6 with only 2 GB of RAM.
Chukwa can run with a small memory footprint, but performance will not
be optimal.  The recommended memory size is for a production system
that needs to monitor thousands of nodes in a cluster.
Chukwa is designed with parallelism in mind.  Hence, there is a lot of
initial overhead for setting up parallelism, which is not necessary if
the data size is small.

Try to figure out whether any of these apply to you:

- You generate more than 1 TB of data per day (it takes a Hadoop
cluster of 2000+ nodes to produce this type of volume)
- The number of data sources saturates TCP connections, so the
monitoring system needs to do software load balancing
- You need raw data, because digested data can't support your
analysis use case

Use Chukwa if any of these items applies to you; otherwise Ganglia is
also a great way to monitor Hadoop at smaller scale.
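
The checklist above can be sketched roughly in Python. This is only an
illustration of the rule of thumb, not part of Chukwa: the Workload
class and both function names are hypothetical, and a decimal terabyte
(10**12 bytes) is assumed for "1 TB".

```python
from dataclasses import dataclass

TB = 10**12  # assumption: decimal terabyte, in bytes


@dataclass
class Workload:
    """Hypothetical description of a monitoring workload."""
    bytes_per_day: int            # total monitoring data generated per day
    connections_saturated: bool   # data sources saturate TCP connections
    needs_raw_data: bool          # digested data can't support the analysis


def suggest_tool(w: Workload) -> str:
    """Return "Chukwa" if any of the three criteria applies, else "Ganglia"."""
    if w.bytes_per_day > TB or w.connections_saturated or w.needs_raw_data:
        return "Chukwa"
    return "Ganglia"


def per_node_bytes_per_day(total_bytes_per_day: int, nodes: int) -> float:
    """Average daily volume per node for a given cluster-wide total."""
    return total_bytes_per_day / nodes


# A small cluster producing about 50 GB/day with no other constraints.
small = Workload(bytes_per_day=50 * 10**9,
                 connections_saturated=False,
                 needs_raw_data=False)
print(suggest_tool(small))

# 1 TB/day spread over 2000 nodes, expressed in MB per node per day.
print(per_node_bytes_per_day(TB, 2000) / 10**6)
```

For scale, 1 TB per day across 2000 nodes averages out to about 500 MB
per node per day.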

regards,
Eric

On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar <kumar.akshay@gmail.com> wrote:
> Thanks,
> In my setup, I can not afford ( as of now) to have a machine with 16GB
> memory.
> So that means, I can not deploy Chukwa as a monitoring solution ?  I do not
> intend to do any log analysis / collection for now - just simple OS and
> hadoop metrics.
>
> I mean, I do not understand why would one have 16GB has hard limit for
> minimal functioning too.
> I imagine it should be for a high performance system and not bare-bones
> structure. What am I missing here?
>
> -Akshay
>
> On 26 December 2010 23:38, Ariel Rabkin <asrabkin@gmail.com> wrote:
>>
>> Yes.  That 16 GB number is for the HICC server, not for the collection
>> side. And even then, it's if you have a lot of data (a whole cluster's
>> worth) living in a MySQL database with a web application serving the
>> data.
>>
>> The monitoring agent and the collector are both fairly small-footprint.
>>
>> --Ari
>>
>> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar <kumar.akshay@gmail.com>
>> wrote:
>> > Hi,
>> > Thanks for the responses. A bit late to check this one.
>> > I have one more query -
>> > In the Chukwa administration guide:
>> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
>> > It says
>> > Chukwa can also be installed on a single node, in which case the machine
>> > must have at least 16 GB of memory.
>> >
>> > Q) For my usecase ( for monitoring system metrics) - is it safe to
>> > assume it is not going to be that big a requirement for memory?
>> >
>> > Thanks,
>> > Akshay
>> >
>> >
>> > On 17 December 2010 10:23, ZHOU Qi <zhouqi.jackson@gmail.com> wrote:
>> >>
>> >> Got it. Thanks.
>> >>
>> >> 2010/12/17 Eric Yang <eyang@yahoo-inc.com>:
>> >> > Sure, here you go.
>> >> >
>> >> > Regards,
>> >> > Eric
>> >> >
>> >> > On 12/16/10 6:21 PM, "ZHOU Qi" <zhouqi.jackson@gmail.com> wrote:
>> >> >
>> >> > Hi Eric,
>> >> >
>> >> > I read the wiki of Chukwa, but there is less information about HICC.
>> >> > From where I can get its screen-shot or demo?
>> >> >
>> >> > Thanks,
>> >> > 2010/12/17 Eric Yang <eyang@yahoo-inc.com>:
>> >> >> Hi Akshay,
>> >> >>
>> >> >> A) Yes.  You can use “add sigar.SystemMetrics SystemMetrics
>> >> >> [interval] 0” to stream CPU state at specified interval.
>> >> >> For example:
>> >> >>
>> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” without quotes will
>> >> >> stream CPU state every 5 seconds.
>> >> >>
>> >> >> B) Chukwa has a graphing tool built in which is called HICC.  It
>> >> >> requires HBase deployed in order to use HICC.
>> >> >>
>> >> >> However, agent is still required on the client machines.
>> >> >>
>> >> >> Regards,
>> >> >> Eric
>> >> >>
>> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" <kumar.akshay@gmail.com> wrote:
>> >> >>
>> >> >> Hi,
>> >> >> I have a Hadoop installation, and I want to collect some basic OS
>> >> >> level metrics like  - cpu, memory, disk usage, and Hadoop metrics.
>> >> >>
>> >> >> I have looked into Ganglia, but it requires installing agents on
>> >> >> client machines, which is what I want to avoid.
>> >> >>
>> >> >> My queries:
>> >> >> a) Is this a fair use case for using chukwa? e.g. polling client
>> >> >> machines for CPU stats few times per minute?
>> >> >> b) Is it possible to integrate data collected from chukwa collectors
>> >> >> in a form readable by rrdtool kind of graphing tools on the server
>> >> >> side?
>> >> >>
>> >> >> Thanks,
>> >> >> Akshay
>> >> >>
>> >> >>
>> >> >
>> >> >
>> >
>> >
>>
>>
>>
>> --
>> Ari Rabkin asrabkin@gmail.com
>> UC Berkeley Computer Science Department
>
>

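As a follow-up to the agent command quoted earlier in this thread
("add sigar.SystemMetrics SystemMetrics [interval] 0"), here is a
minimal Python sketch that builds that command string and, optionally,
sends it to a running agent over TCP. The helper names are
hypothetical. The host and port are assumptions that the thread does
not state: 9093 is assumed here as the agent control port, so check
the agent configuration on your own install. The trailing 0 is kept
exactly as written in the thread.

```python
import socket


def build_add_command(interval_seconds: int, last_field: int = 0) -> str:
    """Build the 'add' command quoted in the thread for a given interval."""
    if interval_seconds <= 0:
        raise ValueError("interval must be a positive number of seconds")
    return f"add sigar.SystemMetrics SystemMetrics {interval_seconds} {last_field}"


def send_to_agent(command: str, host: str = "localhost",
                  port: int = 9093, timeout: float = 5.0) -> str:
    """Send one command line to the agent and return the first data received.

    Assumption: the agent accepts newline-terminated text commands on
    host:port. The first data received may be a greeting banner rather
    than the reply to the command.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall((command + "\n").encode("ascii"))
        return conn.recv(4096).decode("ascii", errors="replace")


# Build the 5-second example from the thread; nothing is sent here.
print(build_add_command(5))
```

Calling send_to_agent(build_add_command(5)) would only work against a
reachable agent, which is why the sketch does not call it by default.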