hadoop-common-user mailing list archives

From mete <efk...@gmail.com>
Subject Re: Does Hadoop 0.20.205 and Ganglia 3.1.7 compatible with each other ?
Date Mon, 06 Feb 2012 19:09:24 GMT
Hello,
I also face this issue when using GangliaContext31 with hadoop-1.0.0 and
ganglia 3.1.7 (also tried 3.1.2). I continuously get buffer overflows as
soon as I restart gmetad.
Regards
Mete

On Mon, Feb 6, 2012 at 7:42 PM, Vitthal "Suhas" Gogate <gogate@hortonworks.com> wrote:

> I assume you have seen the following information on Hadoop twiki,
> http://wiki.apache.org/hadoop/GangliaMetrics
>
> So do you use GangliaContext31 in hadoop-metrics2.properties?
>
> We use Ganglia 3.2 with Hadoop 0.20.205 and it works fine. (I remember seeing
> gmetad sometimes go down due to a buffer overflow problem when hadoop starts
> pumping in the metrics, but restarting works. Let me know if you face the
> same problem.)
>
> --Suhas
>
> Additionally, the Ganglia protocol changed significantly between Ganglia 3.0
> and Ganglia 3.1 (i.e., Ganglia 3.1 is not compatible with Ganglia 3.0
> clients). This caused Hadoop to not work with Ganglia 3.1; there is a patch
> available for this, HADOOP-4675. As of November 2010, this patch has been
> rolled into the mainline for 0.20.2 and later. To use the Ganglia 3.1
> protocol in place of the 3.0, substitute
> org.apache.hadoop.metrics.ganglia.GangliaContext31 for
> org.apache.hadoop.metrics.ganglia.GangliaContext in the
> hadoop-metrics.properties lines above.
>
> On Fri, Feb 3, 2012 at 1:07 PM, Merto Mertek <masmertoz@gmail.com> wrote:
>
> > I spent a lot of time trying to figure it out, but I did not find a solution.
> > Problems in the logs pointed me to some bugs in the rrdupdate tool; however,
> > I tried different versions of ganglia and rrdtool and the error is the
> > same. A segmentation fault appears after the following lines if I run
> > gmetad in debug mode...
> >
> > "Created rrd /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.publish_max_time.rrd"
> > "Created rrd /var/lib/ganglia/rrds/hdcluster/xxx/metricssystem.MetricsSystem.snapshot_max_time.rrd"
> >
> > which I suppose are generated from MetricsSystemImpl.java. (Is there any
> > way to disable just these two metrics?)
> >
> > In /var/log/messages there are a lot of errors:
> >
> > "xxx gmetad[15217]: RRD_update (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.publish_imax_time.rrd):
> > converting  '4.9E-324' to float: Numerical result out of range"
> > "xxx gmetad[15217]: RRD_update (/var/lib/ganglia/rrds/hdc/xxx/metricssystem.MetricsSystem.snapshot_imax_time.rrd):
> > converting  '4.9E-324' to float: Numerical result out of range"
> >
> > So there are probably some conversion issues? Where should I look for
> > the solution? Would you rather suggest using ganglia 3.0.x with the old
> > protocol and leaving versions >3.1 for later releases?
> >
> > Any help is really appreciated...
> >
> > On 1 February 2012 04:04, Merto Mertek <masmertoz@gmail.com> wrote:
> >
> > > I would be glad to hear that too. I've set up the following:
> > >
> > > Hadoop 0.20.205
> > > Ganglia Front  3.1.7
> > > Ganglia Back *(gmetad)* 3.1.7
> > > RRDTool <http://www.rrdtool.org/> 1.4.5 -> I had some trouble
> > > installing 1.4.4
> > >
> > > Ganglia works only while hadoop is not running, i.e. while no metrics are
> > > published to the gmetad node (configured with the new
> > > hadoop-metrics2.properties). When hadoop is started, a segmentation
> > > fault appears in the gmetad daemon:
> > >
> > > sudo gmetad -d 2
> > > .......
> > > Updating host xxx, metric dfs.FSNamesystem.BlocksTotal
> > > Updating host xxx, metric bytes_in
> > > Updating host xxx, metric bytes_out
> > > Updating host xxx, metric metricssystem.MetricsSystem.publish_max_time
> > > Created rrd /var/lib/ganglia/rrds/hdcluster/hadoopmaster/metricssystem.MetricsSystem.publish_max_time.rrd
> > > Segmentation fault
> > >
> > > And some info from the apache log <http://pastebin.com/nrqKRtKJ>..
> > >
> > > Can someone suggest a ganglia version that is tested with hadoop
> > > 0.20.205? I will try to sort it out, but it does not seem to be a
> > > trivial problem.
> > >
> > > Thank you
> > >
> > >
> > >
> > >
> > >
> > > On 2 December 2011 12:32, praveenesh kumar <praveenesh@gmail.com> wrote:
> > >
> > >> Or do I have to apply some hadoop patch for this?
> > >>
> > >> Thanks,
> > >> Praveenesh
> > >>
> > >
> > >
> >
>
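
A note on the '4.9E-324' value in the RRD_update errors quoted above: that
is exactly how Java prints Double.MIN_VALUE, the smallest positive double.
It is far below the smallest positive 32-bit float (about 1.4E-45), which
would be consistent with the "Numerical result out of range" message. The
sketch below only illustrates that range mismatch. It assumes, without
confirmation from this thread, that the max-time metrics are sent while
still holding such an initial value. The class and method names are
hypothetical and are not part of Hadoop or Ganglia.

```java
// Hypothetical helper, not Hadoop code: shows why the text "4.9E-324"
// cannot survive a conversion to a 32-bit float.
final class FloatRangeCheck {

    // True when a nonzero double collapses to zero once narrowed to float,
    // i.e. its magnitude is too small for a 32-bit float to represent.
    static boolean underflowsFloat(double value) {
        return value != 0.0 && (float) value == 0.0f;
    }

    // Builds a short human-readable verdict for one value.
    static String describe(double value) {
        String verdict = underflowsFloat(value) ? "underflows float" : "fits in float";
        return Double.toString(value) + " -> " + verdict;
    }

    public static void main(String[] args) {
        // Double.MIN_VALUE is rendered by Java as the string 4.9E-324.
        System.out.println(describe(Double.MIN_VALUE));
        // Float.MIN_VALUE is the smallest positive float, so it still fits.
        System.out.println(describe(Float.MIN_VALUE));
        // An ordinary timing value is unaffected.
        System.out.println(describe(12.5));
    }
}
```

If that assumption holds, the errors would point at the value being
published rather than at rrdtool itself, which would also fit with the
problem persisting across different ganglia and rrdtool versions.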
