From: Eric Yang
To: chukwa-user@incubator.apache.org
Date: Tue, 28 Dec 2010 19:33:59 -0800
Subject: Re: Using Chukwa as monitoring tool.

Hi Akshay,

In both options, data downsampling is required. RRDtool downsamples
data as it is written to the RRD files. Chukwa 0.4 uses MySQL for
downsampling. The graph is then rendered with the flot
(http://code.google.com/p/flot/) graphing library. There was also a
prototype that rendered graphs on the server side with JFreeChart.
However, there was no clear interface for exposing graphable data.

In Chukwa 0.5, we are decoupling the data from the graphing library.
There is a REST API for retrieving metrics data (see
https://issues.apache.org/jira/browse/CHUKWA-520). However, Chukwa 0.5
is still under development, and downsampling has moved from SQL
statements to MapReduce/Pig Latin scripts. I have not determined what
will be in the final framework. It will most likely use Oozie as the
workflow scheduling engine to run MapReduce/Pig Latin jobs that provide
the downsampling and aggregation framework.

You are welcome to try out the code from trunk (0.5). The current
limitations are that you should avoid large time ranges and that there
is no aggregation.

Hope this helps.

regards,
Eric

On Tue, Dec 28, 2010 at 1:18 PM, Akshay Kumar wrote:
> Hi,
> I have GWT as the front-end, where I want to embed this information in one
> of the following ways:
> a) Simply embed RRDtool kind of generated images. That means, I will have to
> run rrdtool ( I am looking at rrd4j) on server side and convert the data to
> RRD format on agent/server side.
> b) Use some graphing library - like http://dygraphs.com/.
> I am not expecting too much of volume. To start with simple CPU, Memory and
> hadoop metrics collected from 20 or so machines collected at a rate not more
> than 10 per minute per metric.
> Thanks,
> Akshay
> On 27 December 2010 00:05, Ariel Rabkin wrote:
>>
>> 16 GB isn't a hard limit, just a suggestion. And that's based on the
>> assumption that you have a big cluster and are collecting a lot of
>> data and using the older MySQL based infrastructure.
>>
>> How much memory you need depends on what volume of data you're
>> collecting and what you're doing with it. How do you intend to store
>> the data and how will you be visualizing it?
>>
>> --Ari
>>
>> On Sun, Dec 26, 2010 at 10:29 AM, Akshay Kumar wrote:
>> > Thanks,
>> > In my setup, I can not afford ( as of now) to have a machine with 16GB
>> > memory.
>> > So that means, I can not deploy Chukwa as a monitoring solution ? I do not
>> > intend to do any log analysis / collection for now - just simple OS and
>> > hadoop metrics.
>> >
>> > I mean, I do not understand why would one have 16GB has hard limit for
>> > minimal functioning too.
>> > I imagine it should be for a high performance system and not bare-bones
>> > structure. What am I missing here?
>> >
>> > -Akshay
>> >
>> > On 26 December 2010 23:38, Ariel Rabkin wrote:
>> >>
>> >> Yes. That 16 GB number is for the HICC server, not for the collection
>> >> side.
>> >> And even then, it's if you have a lot of data (a whole cluster's
>> >> worth) living in a MySQL database with a web application serving the
>> >> data.
>> >>
>> >> The monitoring agent and the collector are both fairly small-footprint.
>> >>
>> >> --Ari
>> >>
>> >> On Sun, Dec 26, 2010 at 10:03 AM, Akshay Kumar wrote:
>> >> > Hi,
>> >> > Thanks for the responses. A bit late to check this one.
>> >> > I have one more query -
>> >> > In the Chukwa administration guide:
>> >> > http://people.apache.org/~eyang/docs/r0.1.2/admin.html
>> >> > It says
>> >> > Chukwa can also be installed on a single node, in which case the machine
>> >> > must have at least 16 GB of memory.
>> >> >
>> >> > Q) For my usecase ( for monitoring system metrics) - is it safe to assume it
>> >> > is not going to be that big a requirement for memory?
>> >> >
>> >> > Thanks,
>> >> > Akshay
>> >> >
>> >> >
>> >> > On 17 December 2010 10:23, ZHOU Qi wrote:
>> >> >>
>> >> >> Got it. Thanks.
>> >> >>
>> >> >> 2010/12/17 Eric Yang:
>> >> >> > Sure, here you go.
>> >> >> >
>> >> >> > Regards,
>> >> >> > Eric
>> >> >> >
>> >> >> > On 12/16/10 6:21 PM, "ZHOU Qi" wrote:
>> >> >> >
>> >> >> > Hi Eric,
>> >> >> >
>> >> >> > I read the wiki of Chukwa, but there is less information about HICC.
>> >> >> > From where I can get its screen-shot or demo?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > 2010/12/17 Eric Yang:
>> >> >> >> Hi Akshay,
>> >> >> >>
>> >> >> >> A) Yes. You can use “add sigar.SystemMetrics SystemMetrics [interval] 0” to
>> >> >> >> stream CPU state at specified interval. For example:
>> >> >> >>
>> >> >> >> “add sigar.SystemMetrics SystemMetrics 5 0” without quotes will stream CPU
>> >> >> >> state every 5 seconds.
>> >> >> >>
>> >> >> >> B) Chukwa has a graphing tool built in which is called HICC. It requires
>> >> >> >> Hbase deployed in order to use HICC.
>> >> >> >>
>> >> >> >> However, agent is still required on the client machines.
>> >> >> >>
>> >> >> >> Regards,
>> >> >> >> Eric
>> >> >> >>
>> >> >> >> On 12/16/10 4:34 AM, "Akshay Kumar" wrote:
>> >> >> >>
>> >> >> >> Hi,
>> >> >> >> I have a Hadoop installation, and I want to collect some basic OS level
>> >> >> >> metrics like - cpu, memory, disk usage, and Hadoop metrics.
>> >> >> >>
>> >> >> >> I have looked into Ganglia, but it requires installing agents on client
>> >> >> >> machines, which is what I want to avoid.
>> >> >> >>
>> >> >> >> My queries:
>> >> >> >> a) Is this a fair use case for using chukwa? e.g. polling client machines
>> >> >> >> for CPU stats few times per minute?
>> >> >> >> b) Is it possible to integrate data collected from chukwa collectors in a
>> >> >> >> form readable by rrdtool kind of graphing tools on the server side?
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Akshay
>> >> >> >>
>> >> >> >
>> >> >
>> >>
>> >> --
>> >> Ari Rabkin asrabkin@gmail.com
>> >> UC Berkeley Computer Science Department
>> >
>>
>> --
>> Ari Rabkin asrabkin@gmail.com
>> UC Berkeley Computer Science Department
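
For readers following the thread: the downsampling discussed in the top
reply (done by RRDtool on write, by MySQL in Chukwa 0.4, and by
MapReduce/Pig Latin jobs in 0.5) comes down to averaging raw samples into
coarser fixed-width time buckets. Below is a minimal Python sketch of that
idea only. It is not Chukwa or RRDtool code. The function name
`downsample`, the (timestamp, value) tuple layout, the 60-second bucket
width, and the sample readings are all assumptions made for illustration.
The 6-second spacing corresponds to the "10 per minute per metric" upper
rate mentioned in the thread.

```python
from collections import defaultdict


def downsample(samples, bucket_seconds=60):
    """Average (timestamp, value) samples into fixed-width time buckets.

    Hypothetical helper for illustration; not part of Chukwa or RRDtool.
    Returns a list of (bucket_start, mean_value) sorted by bucket_start.
    """
    if bucket_seconds <= 0:
        raise ValueError("bucket_seconds must be positive")
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each timestamp down to the start of its bucket.
        bucket_start = int(timestamp // bucket_seconds) * bucket_seconds
        buckets[bucket_start].append(value)
    # One averaged point per bucket, in time order.
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]


# Made-up readings: one sample every 6 seconds for two minutes
# (10 per minute, the upper rate mentioned in the thread).
raw = [(t, float(t % 60)) for t in range(0, 120, 6)]
per_minute = downsample(raw, bucket_seconds=60)
```

Here twenty raw samples collapse into two per-minute averages. A longer
retention tier could plausibly reuse the same function with a wider
bucket, which is roughly how graphing over a large time range stays cheap.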