hadoop-common-user mailing list archives

From Matt Massie <m...@cloudera.com>
Subject Re: Using Ganglia with hadoop 0.19.0 on Amazon EC2
Date Fri, 11 Sep 2009 00:30:42 GMT
Samprita-

I'm assuming at this point that you have gmond installed on all nodes in
your cluster.  Correct me if I'm assuming too much.

The next step is to configure gmond.  See the gmond.conf man page.

# man gmond.conf

In particular, you are going to need to change the udp_send_channel.  By
default, ganglia uses multicast to share metrics with the other gmond
daemons in a cluster, and multicast won't work on ec2.

For example, let's say that you pick the gmond running on one node (e.g.
ip-10-10-10-10.ec2.internal) to listen to every gmond in your cluster.  You
will need to add the following to gmond.conf on
ip-10-10-10-10.ec2.internal...

udp_recv_channel {
           port = 8666
           family = inet4
}

... which tells that gmond to listen on UDP port 8666.  You then need
to tell all the other nodes on ec2 to send their metrics to
ip-10-10-10-10.ec2.internal by adding the following to their configuration
file...

udp_send_channel {
           host = ip-10-10-10-10.ec2.internal
           port = 8666
}

You will need to restart gmond for the configuration to take effect.

# /etc/init.d/gmond restart

If the configuration worked, you should be able to log into
ip-10-10-10-10.ec2.internal and receive an XML response covering all nodes
in your ec2 cluster, e.g.

$ telnet localhost 8649 | grep "<HOST"

should output a list of hosts in your cluster.  Once you have a gmond
listening to the cluster, I'll help you get gmetad installed with the web
console.
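
If telnet is awkward to script, the same check can be sketched in Python.
This is only a rough sketch, not part of ganglia: the function names
fetch_gmond_xml and list_hosts are made up for illustration, and it assumes
gmond is serving its XML dump on TCP port 8649 of the node you run it on
and closes the connection once the dump is complete (as in the telnet
session quoted below).

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump that gmond writes on its TCP port.

    host/port are assumptions: 8649 is the port used with telnet above.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: the remote end closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def list_hosts(xml_bytes):
    """Return the NAME attribute of every HOST element in the dump."""
    root = ET.fromstring(xml_bytes)
    return [host.get("NAME") for host in root.iter("HOST")]
```

Calling list_hosts(fetch_gmond_xml()) on ip-10-10-10-10.ec2.internal should
then return one name per node that is reporting to that gmond.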

-Matt




On Wed, Sep 9, 2009 at 3:13 PM, Samprita Hegde <sampritavh@gmail.com> wrote:

>   I ran a two-node cluster. When I typed in the command "telnet localhost
> 8649" I got the following XML string:
> [root@domU-12-31-39-07-75-F4 ~]# telnet localhost 8649
> Trying 127.0.0.1...
> Connected to localhost.
> Escape character is '^]'.
> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> <!DOCTYPE GANGLIA_XML [
>   <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
>      <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
>      <!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
>   <!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
>      <!ATTLIST GRID NAME CDATA #REQUIRED>
>      <!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
>      <!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
>   <!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
>      <!ATTLIST CLUSTER NAME CDATA #REQUIRED>
>      <!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
>      <!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
>      <!ATTLIST CLUSTER URL CDATA #IMPLIED>
>      <!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
>   <!ELEMENT HOST (METRIC)*>
>      <!ATTLIST HOST NAME CDATA #REQUIRED>
>      <!ATTLIST HOST IP CDATA #REQUIRED>
>      <!ATTLIST HOST LOCATION CDATA #IMPLIED>
>      <!ATTLIST HOST REPORTED CDATA #REQUIRED>
>      <!ATTLIST HOST TN CDATA #IMPLIED>
>      <!ATTLIST HOST TMAX CDATA #IMPLIED>
>      <!ATTLIST HOST DMAX CDATA #IMPLIED>
>      <!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
>   <!ELEMENT METRIC EMPTY>
>      <!ATTLIST METRIC NAME CDATA #REQUIRED>
>      <!ATTLIST METRIC VAL CDATA #REQUIRED>
>      <!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 | int32
> | uint32 | float | double | timestamp) #REQUIRED>
>      <!ATTLIST METRIC UNITS CDATA #IMPLIED>
>      <!ATTLIST METRIC TN CDATA #IMPLIED>
>      <!ATTLIST METRIC TMAX CDATA #IMPLIED>
>      <!ATTLIST METRIC DMAX CDATA #IMPLIED>
>      <!ATTLIST METRIC SLOPE (zero | positive | negative | both |
> unspecified) #IMPLIED>
>      <!ATTLIST METRIC SOURCE (gmond | gmetric) #REQUIRED>
>   <!ELEMENT HOSTS EMPTY>
>      <!ATTLIST HOSTS UP CDATA #REQUIRED>
>      <!ATTLIST HOSTS DOWN CDATA #REQUIRED>
>      <!ATTLIST HOSTS SOURCE (gmond | gmetric | gmetad) #REQUIRED>
>   <!ELEMENT METRICS EMPTY>
>      <!ATTLIST METRICS NAME CDATA #REQUIRED>
>      <!ATTLIST METRICS SUM CDATA #REQUIRED>
>      <!ATTLIST METRICS NUM CDATA #REQUIRED>
>      <!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 | int32
> | uint32 | float | double | timestamp) #REQUIRED>
>      <!ATTLIST METRICS UNITS CDATA #IMPLIED>
>      <!ATTLIST METRICS SLOPE (zero | positive | negative | both |
> unspecified) #IMPLIED>
>      <!ATTLIST METRICS SOURCE (gmond | gmetric) #REQUIRED>
> ]>
> <GANGLIA_XML VERSION="3.0.5" SOURCE="gmond">
> <CLUSTER NAME="unspecified" LOCALTIME="1252533906" OWNER="unspecified"
> LATLONG="unspecified" URL="unspecified">
> <HOST NAME="domU-12-31-39-07-75-F4.compute-1.internal" IP="10.209.118.6"
> REPORTED="1252533905" TN="0" TMAX="20" DMAX="0" LOCATION="unspecified"
> GMOND_STARTED="0">
> <METRIC NAME="threadsRunnable" VAL="6" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="gcTimeMillis" VAL="207" TYPE="int32" UNITS="" TN="0"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="DeleteFileOps" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="memNonHeapUsedM" VAL="11.130989" TYPE="float" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="AddBlockOps" VAL="0" TYPE="int32" UNITS="" TN="228" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="CreateFileOps" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="reduces_completed" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="logFatal" VAL="0" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="jobs_completed" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="FilesCreated" VAL="4" TYPE="int32" UNITS="" TN="228"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="logError" VAL="0" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="GetListingOps" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="maps_completed" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="logInfo" VAL="31" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="reduces_launched" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="FilesRenamed" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="memHeapCommittedM" VAL="6.5390625" TYPE="float" UNITS=""
> TN="0" TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="threadsTimedWaiting" VAL="9" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="gcCount" VAL="32" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="SafemodeTime" VAL="1976" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="FilesAppended" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="maps_launched" VAL="0" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="threadsNew" VAL="0" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="GetBlockLocations" VAL="0" TYPE="int32" UNITS="" TN="228"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="threadsTerminated" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="logWarn" VAL="0" TYPE="int32" UNITS="" TN="0" TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="memNonHeapCommittedM" VAL="18.25" TYPE="float" UNITS=""
> TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="memHeapUsedM" VAL="3.979683" TYPE="float" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="threadsWaiting" VAL="15" TYPE="int32" UNITS="" TN="0"
> TMAX="60" DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="jobs_submitted" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> <METRIC NAME="threadsBlocked" VAL="0" TYPE="int32" UNITS="" TN="0"
> TMAX="60"
> DMAX="0" SLOPE="both" SOURCE="gmetric"/>
> </HOST>
> <HOST NAME="domU-12-31-39-07-84-A5.compute-1.internal" IP="10.209.135.83"
> REPORTED="1252533893" TN="13" TMAX="20" DMAX="0" LOCATION="unspecified"
> GMOND_STARTED="1252533892">
> </HOST>
> </CLUSTER>
> </GANGLIA_XML>
> Connection closed by foreign host.
>
> Thanks a lot!  Can I get this information in a browser on my local machine?
>
> Thanks and Regards,
> Samprita
>
>
> On Wed, Sep 9, 2009 at 12:21 AM, Matt Massie <matt@cloudera.com> wrote:
>
> > Ganglia doesn't need to be patched to work.  The patches are for Hadoop if
> > you are running ganglia 3.1.x (because of a breaking change in the ganglia
> > message format from 3.0.x to 3.1.x).
> >
> > Since you are running fedora, you should be able to bring ganglia up by
> > using the ganglia RPMs which are available in the fedora repo.
> >
> > Try the following commands on each node in the cluster you want to monitor
> >
> > # yum install ganglia-gmond
> > # service gmond start
> >
> > You will need to open TCP/UDP port 8649 as well to allow ganglia
> > communication (see your iptables configuration).  You can verify that gmond
> > is working by connecting to TCP port 8649
> >
> > $ telnet localhost 8649
> >
> > You should see an XML description of the state of the cluster/node.  Let me
> > know when you are this far in the installation and I'll help you through
> > the next steps.
> >
> > -Matt
> >
> >
> >
> > On Tue, Sep 8, 2009 at 7:49 PM, Samprita Hegde <sampritavh@gmail.com>
> > wrote:
> >
> > > The ami-id I am using is ami-fa6a8e93. It is a Fedora distribution for
> > > i-386.
> > > To be precise this is taken from the console output:
> > >
> > > Linux version 2.6.21.7-2.fc8xen (mockbuild@xenbuilder1.fedora.redhat.com)
> > > (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33))
> > >
> > > I was looking into the Ganglia documentation in the Hadoop wiki. It says
> > > we need to install a patch in order to get it working. I am not sure if
> > > this ami has all the patches installed for ganglia.
> > >
> > > Thanks!
> > > Samprita
> > >
> > >
> > > On Tue, Sep 8, 2009 at 10:22 PM, Matt Massie <matt@cloudera.com>
> wrote:
> > >
> > > > I should be able to help you out.
> > > >
> > > > What AMI are you using?  What linux distribution?
> > > >
> > > > -Matt
> > > >
> > > > On Tue, Sep 8, 2009 at 6:21 PM, Samprita Hegde <sampritavh@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hello All,
> > > > >    I am trying to study the performance of an ec2 cluster when it is
> > > > > running hadoop. But I am not able to get ganglia up and running. Can
> > > > > someone please guide me on how to use/configure Ganglia to run with
> > > > > hadoop? I am using a public ec2 image that has hadoop-0.19.0. I have
> > > > > used the ganglia configuration that comes with the hadoop-ec2 scripts
> > > > > in the hadoop-package.
> > > > >
> > > > >
> > > > > The ganglia version  that is running is ganglia 3.0.5.
> > > > >
> > > > > Thanks and Regards,
> > > > > Samprita
> > > > >
> > > >
> > >
> >
>
