Subject: Re: Does Hadoop 0.20.205 and Ganglia 3.1.7 compatible with each other ?
From: mete <efkarr@gmail.com>
Date: Mon, 6 Feb 2012 21:09:24 +0200
To: common-user@hadoop.apache.org

Hello,

I also face this issue when using GangliaContext31 with hadoop-1.0.0 and Ganglia 3.1.7 (I also tried 3.1.2). I continuously get buffer overflows as soon as I restart gmetad.

Regards,
Mete

On Mon, Feb 6, 2012 at 7:42 PM, Vitthal "Suhas" Gogate <gogate@hortonworks.com> wrote:

> I assume you have seen the following information on the Hadoop wiki:
> http://wiki.apache.org/hadoop/GangliaMetrics
>
> So do you use GangliaContext31 in hadoop-metrics2.properties?
>
> We use Ganglia 3.2 with Hadoop 0.20.205 and it works fine. (I remember
> gmetad sometimes going down due to a buffer-overflow problem when Hadoop
> starts pumping in the metrics, but restarting it works. Let me know if you
> face the same problem?)
>
> --Suhas
>
> Additionally, the Ganglia protocol changed significantly between Ganglia 3.0
> and Ganglia 3.1 (i.e., Ganglia 3.1 is not compatible with Ganglia 3.0
> clients). This caused Hadoop to not work with Ganglia 3.1; there is a patch
> available for this, HADOOP-4675. As of November 2010, this patch has been
> rolled into the mainline for 0.20.2 and later. To use the Ganglia 3.1
> protocol in place of the 3.0 protocol, substitute
> org.apache.hadoop.metrics.ganglia.GangliaContext31 for
> org.apache.hadoop.metrics.ganglia.GangliaContext in the
> hadoop-metrics.properties lines above.
>
> On Fri, Feb 3, 2012 at 1:07 PM, Merto Mertek wrote:
>
> > I spent a lot of time trying to figure this out, but I did not find a
> > solution.
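The substitution Suhas quotes from the wiki can be sketched as a minimal hadoop-metrics.properties fragment for the old (metrics1) system; the collector address `gmond-host:8649` below is a placeholder, not from the thread:

```properties
# hadoop-metrics.properties -- minimal sketch, assuming the Ganglia 3.1
# wire protocol (GangliaContext31 instead of GangliaContext).
# gmond-host:8649 is a placeholder for your gmond/gmetad collector.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=gmond-host:8649

mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=gmond-host:8649

jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=gmond-host:8649
```

With the 3.0 protocol you would keep GangliaContext here instead; mixing the two protocols between Hadoop and gmond is what produces unreadable packets.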
> > Problems from the logs pointed me to some bugs in the rrdupdate tool;
> > however, I tried different versions of ganglia and rrdtool, and the
> > error is the same. A segmentation fault appears after the following
> > lines if I run gmetad in debug mode:
> >
> > "Created rrd
> > /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.publish_max_time.rrd"
> > "Created rrd
> > /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.snapshot_max_time.rrd"
> >
> > which I suppose are generated from MetricsSystemImpl.java. (Is there any
> > way to disable just these two metrics?)
> >
> > In /var/log/messages there are a lot of errors:
> >
> > "xxx gmetad[15217]: RRD_update
> > (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.publish_imax_time.rrd):
> > converting '4.9E-324' to float: Numerical result out of range"
> > "xxx gmetad[15217]: RRD_update
> > (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.snapshot_imax_time.rrd):
> > converting '4.9E-324' to float: Numerical result out of range"
> >
> > So probably there are some conversion issues? Where should I look for
> > the solution? Would you rather suggest using ganglia 3.0.x with the old
> > protocol and leaving version >3.1 for later releases?
> >
> > Any help is really appreciated...
> >
> > On 1 February 2012 04:04, Merto Mertek wrote:
> >
> > > I would be glad to hear that too. I've set up the following:
> > >
> > > Hadoop 0.20.205
> > > Ganglia Front 3.1.7
> > > Ganglia Back (gmetad) 3.1.7
> > > RRDTool 1.4.5 -> I had some trouble installing 1.4.4
> > >
> > > Ganglia works only when Hadoop is not running, i.e. when metrics are
> > > not published to the gmetad node (configured with the new
> > > hadoop-metrics2.properties). When Hadoop is started, a segmentation
> > > fault appears in the gmetad daemon:
> > >
> > > sudo gmetad -d 2
> > > .......
> > > Updating host xxx, metric dfs.FSNamesystem.BlocksTotal
> > > Updating host xxx, metric bytes_in
> > > Updating host xxx, metric bytes_out
> > > Updating host xxx, metric metricssystem.MetricsSystem.publish_max_time
> > > Created rrd
> > > /var/lib/ganglia/rrds/hdcluster/hadoopmaster/metricssystem.MetricsSystem.publish_max_time.rrd
> > > Segmentation fault
> > >
> > > And some info from the apache log ..
> > >
> > > Can someone suggest a ganglia version that is tested with hadoop
> > > 0.20.205? I will try to sort it out, however it seems a not so trivial
> > > problem..
> > >
> > > Thank you
> > >
> > > On 2 December 2011 12:32, praveenesh kumar wrote:
> > >
> > >> or Do I have to apply some hadoop patch for this?
> > >>
> > >> Thanks,
> > >> Praveenesh
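Merto mentions configuring the new hadoop-metrics2.properties. For the metrics2 system, a Ganglia 3.1 sink configuration typically looks roughly like the sketch below; this assumes the GangliaSink31 class is present in your Hadoop build, and `gmond-host:8649` is a placeholder collector address:

```properties
# hadoop-metrics2.properties -- minimal sketch for the Ganglia 3.1 protocol.
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10

namenode.sink.ganglia.servers=gmond-host:8649
datanode.sink.ganglia.servers=gmond-host:8649
jobtracker.sink.ganglia.servers=gmond-host:8649
tasktracker.sink.ganglia.servers=gmond-host:8649
```

As with the older hadoop-metrics.properties, the sink's protocol version must match the gmond it reports to (3.1-series gmond for the 31 sink).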
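A note on the RRD_update errors quoted in the thread: '4.9E-324' is the smallest positive denormal 64-bit double (Java's Double.MIN_VALUE), which suggests an uninitialized/sentinel value coming out of Hadoop's metrics system. It is far below the smallest value a 32-bit float can represent (~1.4e-45), so a string-to-float conversion underflows and can report "Numerical result out of range" (ERANGE). A minimal Python sketch of the underflow, independent of gmetad itself:

```python
import struct

# 4.9e-324 is the smallest positive denormal double
# (Java's Double.MIN_VALUE), as seen in the gmetad log lines.
tiny = 4.9e-324
assert tiny > 0.0  # representable as a 64-bit double

# Round-tripping it through a 32-bit float underflows to zero: the
# smallest positive float32 denormal (~1.4e-45) is still vastly larger.
as_float32 = struct.unpack('<f', struct.pack('<f', tiny))[0]
print(as_float32)  # 0.0
```

This is only an illustration of why the value is unrepresentable as a float; whether gmetad/rrdtool then crashes on it is a separate bug in those tools.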