Subject: Re: Configure Ganglia with Hadoop
From: Shuja Rehman <shujamughal@gmail.com>
To: common-user@hadoop.apache.org
Cc: general@hadoop.apache.org
Date: Mon, 8 Nov 2010 23:40:14 +0500

Hi,

I have followed the article, but I have one confusion: do I need to change
the gmond.conf file on each node?

host {
  location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like.
   Gmond used to only support having a single channel */
udp_send_channel {
  mcast_join = 239.2.11.71
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  mcast_join = 239.2.11.71
  port = 8649
  bind = 239.2.11.71
}

/* You can specify as many tcp_accept_channels as you like to share
   an xml description of the state of the cluster */
tcp_accept_channel {
  port = 8649
}

And do I need to replace 239.2.11.71 with each machine's own IP, e.g.
10.10.10.2 on the first machine, 10.10.10.3 on the second, and so on?
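(For comparison, a minimal unicast sketch, assuming a Ganglia 3.1-style
gmond.conf; 10.10.10.2 stands in for the collector machine mentioned later
in this thread, and the receive/accept channels would only be strictly
needed on that collector:)

/* Unicast sketch -- assumption: Ganglia 3.1-style gmond.conf.
   Every node sends to one collector instead of joining a
   multicast group; 10.10.10.2 is the collector host used
   elsewhere in this thread. */
udp_send_channel {
  host = 10.10.10.2
  port = 8649
  ttl = 1
}

/* Only needed on the collector (10.10.10.2) in this layout */
udp_recv_channel {
  port = 8649
}

/* Serves the XML cluster state that gmetad polls */
tcp_accept_channel {
  port = 8649
}

If that layout fits, the same gmond.conf could be copied to every node
unchanged, which would sidestep the per-machine-IP question above.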
On Mon, Nov 8, 2010 at 10:07 PM, Abhinay Mehta wrote:

> A colleague of mine (Ryan Greenhall) and I set up Ganglia on our hadoop
> cluster. He has written a summary of what we did to get it working; you
> might find it useful:
>
> http://forwardtechnology.co.uk/blog/4cc841609f4e6a021100004f
>
> Regards,
> Abhinay Mehta
>
>
> On 8 November 2010 15:31, Jonathan Creasy wrote:
>
> > This is the correct configuration, and there should be nothing more
> > needed. I don't think these configuration changes will take effect on
> > the fly, so you would need to restart the datanode and namenode
> > processes, if I understand correctly.
> >
> > When you browse your Ganglia web front end you will see some more
> > metrics:
> >
> > dfs.FSDirectory.files_deleted
> > dfs.FSNamesystem.BlockCapacity
> > dfs.FSNamesystem.BlocksTotal
> > dfs.FSNamesystem.CapacityRemainingGB
> > dfs.FSNamesystem.CapacityTotalGB
> > dfs.FSNamesystem.CapacityUsedGB
> > dfs.FSNamesystem.CorruptBlocks
> > dfs.FSNamesystem.ExcessBlocks
> > dfs.FSNamesystem.FilesTotal
> > dfs.FSNamesystem.MissingBlocks
> > dfs.FSNamesystem.PendingDeletionBlocks
> > dfs.FSNamesystem.PendingReplicationBlocks
> > dfs.FSNamesystem.ScheduledReplicationBlocks
> > dfs.FSNamesystem.TotalLoad
> > dfs.FSNamesystem.UnderReplicatedBlocks
> > dfs.datanode.blockChecksumOp_avg_time
> > dfs.datanode.blockChecksumOp_num_ops
> > dfs.datanode.blockReports_avg_time
> > dfs.datanode.blockReports_num_ops
> > dfs.datanode.block_verification_failures
> > dfs.datanode.blocks_read
> > dfs.datanode.blocks_removed
> > dfs.datanode.blocks_replicated
> > dfs.datanode.blocks_verified
> > dfs.datanode.blocks_written
> > dfs.datanode.bytes_read
> > dfs.datanode.bytes_written
> > dfs.datanode.copyBlockOp_avg_time
> > dfs.datanode.copyBlockOp_num_ops
> > dfs.datanode.heartBeats_avg_time
> > dfs.datanode.heartBeats_num_ops
> > dfs.datanode.readBlockOp_avg_time
> > dfs.datanode.readBlockOp_num_ops
> > dfs.datanode.readMetadataOp_avg_time
> > dfs.datanode.readMetadataOp_num_ops
> > dfs.datanode.reads_from_local_client
> > dfs.datanode.reads_from_remote_client
> > dfs.datanode.replaceBlockOp_avg_time
> > dfs.datanode.replaceBlockOp_num_ops
> > dfs.datanode.writeBlockOp_avg_time
> > dfs.datanode.writeBlockOp_num_ops
> > dfs.datanode.writes_from_local_client
> > dfs.datanode.writes_from_remote_client
> > dfs.namenode.AddBlockOps
> > dfs.namenode.CreateFileOps
> > dfs.namenode.DeleteFileOps
> > dfs.namenode.FileInfoOps
> > dfs.namenode.FilesAppended
> > dfs.namenode.FilesCreated
> > dfs.namenode.FilesRenamed
> > dfs.namenode.GetBlockLocations
> > dfs.namenode.GetListingOps
> > dfs.namenode.JournalTransactionsBatchedInSync
> > dfs.namenode.SafemodeTime
> > dfs.namenode.Syncs_avg_time
> > dfs.namenode.Syncs_num_ops
> > dfs.namenode.Transactions_avg_time
> > dfs.namenode.Transactions_num_ops
> > dfs.namenode.blockReport_avg_time
> > dfs.namenode.blockReport_num_ops
> > dfs.namenode.fsImageLoadTime
> > jvm.metrics.gcCount
> > jvm.metrics.gcTimeMillis
> > jvm.metrics.logError
> > jvm.metrics.logFatal
> > jvm.metrics.logInfo
> > jvm.metrics.logWarn
> > jvm.metrics.maxMemoryM
> > jvm.metrics.memHeapCommittedM
> > jvm.metrics.memHeapUsedM
> > jvm.metrics.memNonHeapCommittedM
> > jvm.metrics.memNonHeapUsedM
> > jvm.metrics.threadsBlocked
> > jvm.metrics.threadsNew
> > jvm.metrics.threadsRunnable
> > jvm.metrics.threadsTerminated
> > jvm.metrics.threadsTimedWaiting
> > jvm.metrics.threadsWaiting
> > rpc.metrics.NumOpenConnections
> > rpc.metrics.RpcProcessingTime_avg_time
> > rpc.metrics.RpcProcessingTime_num_ops
> > rpc.metrics.RpcQueueTime_avg_time
> > rpc.metrics.RpcQueueTime_num_ops
> > rpc.metrics.abandonBlock_avg_time
> > rpc.metrics.abandonBlock_num_ops
> > rpc.metrics.addBlock_avg_time
> > rpc.metrics.addBlock_num_ops
> > rpc.metrics.blockReceived_avg_time
> > rpc.metrics.blockReceived_num_ops
> > rpc.metrics.blockReport_avg_time
> > rpc.metrics.blockReport_num_ops
> > rpc.metrics.callQueueLen
> > rpc.metrics.complete_avg_time
> > rpc.metrics.complete_num_ops
> > rpc.metrics.create_avg_time
> > rpc.metrics.create_num_ops
> > rpc.metrics.getEditLogSize_avg_time
> > rpc.metrics.getEditLogSize_num_ops
> > rpc.metrics.getProtocolVersion_avg_time
> > rpc.metrics.getProtocolVersion_num_ops
> > rpc.metrics.register_avg_time
> > rpc.metrics.register_num_ops
> > rpc.metrics.rename_avg_time
> > rpc.metrics.rename_num_ops
> > rpc.metrics.renewLease_avg_time
> > rpc.metrics.renewLease_num_ops
> > rpc.metrics.rollEditLog_avg_time
> > rpc.metrics.rollEditLog_num_ops
> > rpc.metrics.rollFsImage_avg_time
> > rpc.metrics.rollFsImage_num_ops
> > rpc.metrics.sendHeartbeat_avg_time
> > rpc.metrics.sendHeartbeat_num_ops
> > rpc.metrics.versionRequest_avg_time
> > rpc.metrics.versionRequest_num_ops
> >
> > -Jonathan
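(On the collecting side, gmetad also needs a data_source entry pointing at
a gmond it can poll. A minimal sketch, assuming gmetad runs on 10.10.10.2
as described in this thread; the cluster name "hadoop-cluster" is just a
placeholder:)

# gmetad.conf sketch -- assumptions: gmetad runs on 10.10.10.2, the
# host named in this thread; "hadoop-cluster" is a placeholder name.
# gmetad polls this host:port (a gmond tcp_accept_channel) for the
# cluster's XML state.
data_source "hadoop-cluster" 10.10.10.2:8649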
> > On Nov 8, 2010, at 8:34 AM, Shuja Rehman wrote:
> >
> > > Hi,
> > > I have a cluster of 4 machines and want to configure Ganglia for
> > > monitoring purposes. I have read the wiki and added the following
> > > lines to hadoop-metrics.properties on each machine:
> > >
> > > dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > > dfs.period=10
> > > dfs.servers=10.10.10.2:8649
> > >
> > > mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > > mapred.period=10
> > > mapred.servers=10.10.10.2:8649
> > >
> > > jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > > jvm.period=10
> > > jvm.servers=10.10.10.2:8649
> > >
> > > rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > > rpc.period=10
> > > rpc.servers=10.10.10.2:8649
> > >
> > > where 10.10.10.2 is the machine where I am running gmetad and the
> > > web front end. Do I need to use the same IP on all machines, as I
> > > have done here, or should each machine's file point to its own IP?
> > > And is there anything more to do to set it up with Hadoop?
> > >
> > > --
> > > Regards
> > > Shuja-ur-Rehman Baig

--
Regards
Shuja-ur-Rehman Baig
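(One caveat on the properties above: the stock GangliaContext emits the
Ganglia 3.0 wire format. If the nodes run Ganglia 3.1 or newer, later
Hadoop releases ship a GangliaContext31 class for the newer protocol; a
sketch, assuming your Hadoop build includes that class:)

# hadoop-metrics.properties sketch.
# Assumption: this Hadoop build ships
# org.apache.hadoop.metrics.ganglia.GangliaContext31 (added for the
# Ganglia >= 3.1 wire protocol); verify it exists before relying on it.
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=10.10.10.2:8649

The same substitution would apply to the mapred, jvm, and rpc sections.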