hadoop-common-user mailing list archives

From Abhinay Mehta <abhinay.me...@gmail.com>
Subject Re: Configure Ganglia with Hadoop
Date Mon, 08 Nov 2010 17:07:45 GMT
A colleague of mine (Ryan Greenhall) and I set up Ganglia on our Hadoop
cluster. He has written a summary of what we did to get it working, which you
might find useful:

http://forwardtechnology.co.uk/blog/4cc841609f4e6a021100004f

Regards,
Abhinay Mehta


On 8 November 2010 15:31, Jonathan Creasy <jon.creasy@announcemedia.com> wrote:

> This is the correct configuration, and there should be nothing more needed.
> I don't think these configuration changes will take effect on the fly,
> so you would need to restart the datanode and namenode processes, if I
> understand correctly.
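>
> For example, restarting them with the stock daemon scripts might look like
> this (a minimal sketch; it assumes a tarball install with HADOOP_HOME set,
> so adjust the paths for your layout):
>
> # on the namenode host
> $HADOOP_HOME/bin/hadoop-daemon.sh stop namenode
> $HADOOP_HOME/bin/hadoop-daemon.sh start namenode
>
> # on each datanode host
> $HADOOP_HOME/bin/hadoop-daemon.sh stop datanode
> $HADOOP_HOME/bin/hadoop-daemon.sh start datanode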
>
> When you browse your Ganglia web front end you will see some more metrics
> (a quick way to check that they are arriving is sketched after this list):
>
> dfs.FSDirectory.files_deleted
> dfs.FSNamesystem.BlockCapacity
> dfs.FSNamesystem.BlocksTotal
> dfs.FSNamesystem.CapacityRemainingGB
> dfs.FSNamesystem.CapacityTotalGB
> dfs.FSNamesystem.CapacityUsedGB
> dfs.FSNamesystem.CorruptBlocks
> dfs.FSNamesystem.ExcessBlocks
> dfs.FSNamesystem.FilesTotal
> dfs.FSNamesystem.MissingBlocks
> dfs.FSNamesystem.PendingDeletionBlocks
> dfs.FSNamesystem.PendingReplicationBlocks
> dfs.FSNamesystem.ScheduledReplicationBlocks
> dfs.FSNamesystem.TotalLoad
> dfs.FSNamesystem.UnderReplicatedBlocks
> dfs.datanode.blockChecksumOp_avg_time
> dfs.datanode.blockChecksumOp_num_ops
> dfs.datanode.blockReports_avg_time
> dfs.datanode.blockReports_num_ops
> dfs.datanode.block_verification_failures
> dfs.datanode.blocks_read
> dfs.datanode.blocks_removed
> dfs.datanode.blocks_replicated
> dfs.datanode.blocks_verified
> dfs.datanode.blocks_written
> dfs.datanode.bytes_read
> dfs.datanode.bytes_written
> dfs.datanode.copyBlockOp_avg_time
> dfs.datanode.copyBlockOp_num_ops
> dfs.datanode.heartBeats_avg_time
> dfs.datanode.heartBeats_num_ops
> dfs.datanode.readBlockOp_avg_time
> dfs.datanode.readBlockOp_num_ops
> dfs.datanode.readMetadataOp_avg_time
> dfs.datanode.readMetadataOp_num_ops
> dfs.datanode.reads_from_local_client
> dfs.datanode.reads_from_remote_client
> dfs.datanode.replaceBlockOp_avg_time
> dfs.datanode.replaceBlockOp_num_ops
> dfs.datanode.writeBlockOp_avg_time
> dfs.datanode.writeBlockOp_num_ops
> dfs.datanode.writes_from_local_client
> dfs.datanode.writes_from_remote_client
> dfs.namenode.AddBlockOps
> dfs.namenode.CreateFileOps
> dfs.namenode.DeleteFileOps
> dfs.namenode.FileInfoOps
> dfs.namenode.FilesAppended
> dfs.namenode.FilesCreated
> dfs.namenode.FilesRenamed
> dfs.namenode.GetBlockLocations
> dfs.namenode.GetListingOps
> dfs.namenode.JournalTransactionsBatchedInSync
> dfs.namenode.SafemodeTime
> dfs.namenode.Syncs_avg_time
> dfs.namenode.Syncs_num_ops
> dfs.namenode.Transactions_avg_time
> dfs.namenode.Transactions_num_ops
> dfs.namenode.blockReport_avg_time
> dfs.namenode.blockReport_num_ops
> dfs.namenode.fsImageLoadTime
> jvm.metrics.gcCount
> jvm.metrics.gcTimeMillis
> jvm.metrics.logError
> jvm.metrics.logFatal
> jvm.metrics.logInfo
> jvm.metrics.logWarn
> jvm.metrics.maxMemoryM
> jvm.metrics.memHeapCommittedM
> jvm.metrics.memHeapUsedM
> jvm.metrics.memNonHeapCommittedM
> jvm.metrics.memNonHeapUsedM
> jvm.metrics.threadsBlocked
> jvm.metrics.threadsNew
> jvm.metrics.threadsRunnable
> jvm.metrics.threadsTerminated
> jvm.metrics.threadsTimedWaiting
> jvm.metrics.threadsWaiting
> rpc.metrics.NumOpenConnections
> rpc.metrics.RpcProcessingTime_avg_time
> rpc.metrics.RpcProcessingTime_num_ops
> rpc.metrics.RpcQueueTime_avg_time
> rpc.metrics.RpcQueueTime_num_ops
> rpc.metrics.abandonBlock_avg_time
> rpc.metrics.abandonBlock_num_ops
> rpc.metrics.addBlock_avg_time
> rpc.metrics.addBlock_num_ops
> rpc.metrics.blockReceived_avg_time
> rpc.metrics.blockReceived_num_ops
> rpc.metrics.blockReport_avg_time
> rpc.metrics.blockReport_num_ops
> rpc.metrics.callQueueLen
> rpc.metrics.complete_avg_time
> rpc.metrics.complete_num_ops
> rpc.metrics.create_avg_time
> rpc.metrics.create_num_ops
> rpc.metrics.getEditLogSize_avg_time
> rpc.metrics.getEditLogSize_num_ops
> rpc.metrics.getProtocolVersion_avg_time
> rpc.metrics.getProtocolVersion_num_ops
> rpc.metrics.register_avg_time
> rpc.metrics.register_num_ops
> rpc.metrics.rename_avg_time
> rpc.metrics.rename_num_ops
> rpc.metrics.renewLease_avg_time
> rpc.metrics.renewLease_num_ops
> rpc.metrics.rollEditLog_avg_time
> rpc.metrics.rollEditLog_num_ops
> rpc.metrics.rollFsImage_avg_time
> rpc.metrics.rollFsImage_num_ops
> rpc.metrics.sendHeartbeat_avg_time
> rpc.metrics.sendHeartbeat_num_ops
> rpc.metrics.versionRequest_avg_time
> rpc.metrics.versionRequest_num_ops
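>
> To confirm the metrics are actually reaching Ganglia, you can dump gmond's
> XML state directly (a sketch; it assumes gmond's default tcp_accept_channel
> on port 8649 and that netcat is installed):
>
> nc 10.10.10.2 8649 | grep 'dfs.FSNamesystem'
>
> Each dfs.FSNamesystem metric above should appear as a METRIC element in the
> output.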
>
> -Jonathan
>
> On Nov 8, 2010, at 8:34 AM, Shuja Rehman wrote:
>
> > Hi,
> > I have a cluster of 4 machines and want to configure Ganglia for
> > monitoring purposes. I have read the wiki and added the following lines to
> > hadoop-metrics.properties on each machine:
> >
> > dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > dfs.period=10
> > dfs.servers=10.10.10.2:8649
> >
> > mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > mapred.period=10
> > mapred.servers=10.10.10.2:8649
> >
> > jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > jvm.period=10
> > jvm.servers=10.10.10.2:8649
> >
> > rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> > rpc.period=10
> > rpc.servers=10.10.10.2:8649
> >
> >
> > where 10.10.10.2 is the machine where I am running gmetad and the web
> > front end. Do I need to use the same IP on every machine, as I have done
> > here, or should each file point to the machine's own IP? And is there
> > anything more to do to set this up with Hadoop?
> >
> >
> >
> > --
> > Regards
> > Shuja-ur-Rehman Baig
> > <http://pk.linkedin.com/in/shujamughal>
>
>
