incubator-hama-dev mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: [jira] [Commented] (HAMA-363) Add network condition monitoring function to BSPMaster
Date Tue, 10 May 2011 10:44:44 GMT
On 09/05/11 22:02, Thomas Jungblut (JIRA) wrote:
>
>      [ https://issues.apache.org/jira/browse/HAMA-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030895#comment-13030895 ]
>
> Thomas Jungblut commented on HAMA-363:
> --------------------------------------
>
> As far as I know, Hadoop only provides some JVM metrics and host metrics. I can't find
> the exact place in the source code, but I think we should implement our own metrics package,
> which we can later add to Ganglia. This is much more useful.
>
> We should define things we need to determine whether there are problems or not.
> Something like: "We ping every groom every 5 seconds and check the latency."
> This can be easily implemented in BSPMaster.
>
> To measure the IN and OUT rate or other fancy stuff, we need something like heartbeat
> communication that will transfer the local groom data to the master.
> This should be in the newer versions of Hadoop (> 0.21), shouldn't it? I don't have the
> source code hanging around here.
>

If you are doing perf work, I'd go for some pluggable monitoring
that can hook in before and after communications.

Why? I'm playing with sFlow monitoring of bits of Hadoop, and it's
tricky to retrofit this stuff deep into the code. If the hooks were
there, it would be easier.
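
To make the before/after hook idea concrete, here is a minimal sketch of
what it might look like. Everything in it is hypothetical: the names
CommunicationMonitor, Transport, MonitoredTransport and ByteCounter are
made up for illustration and are not part of Hama or Hadoop, and Transport
just stands in for whatever call actually moves the bytes.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical hook interface; not an existing Hama or Hadoop API. */
interface CommunicationMonitor {
    void beforeSend(String peer, int payloadBytes);
    void afterSend(String peer, int payloadBytes, long elapsedNanos);
}

/** Hypothetical stand-in for the real communication call. */
interface Transport {
    void send(String peer, byte[] payload);
}

/** Wraps a transport and notifies registered monitors around each send. */
final class MonitoredTransport implements Transport {
    private final Transport delegate;
    private final List<CommunicationMonitor> monitors = new CopyOnWriteArrayList<>();

    MonitoredTransport(Transport delegate) {
        this.delegate = delegate;
    }

    void addMonitor(CommunicationMonitor monitor) {
        monitors.add(monitor);
    }

    @Override
    public void send(String peer, byte[] payload) {
        for (CommunicationMonitor m : monitors) {
            m.beforeSend(peer, payload.length);
        }
        long start = System.nanoTime();
        try {
            delegate.send(peer, payload);
        } finally {
            // Runs even if the delegate throws, so failed sends are reported too.
            long elapsed = System.nanoTime() - start;
            for (CommunicationMonitor m : monitors) {
                m.afterSend(peer, payload.length, elapsed);
            }
        }
    }
}

/** Example monitor: totals the bytes handed to the transport per peer. */
final class ByteCounter implements CommunicationMonitor {
    private final Map<String, AtomicLong> bytesOut = new ConcurrentHashMap<>();

    @Override
    public void beforeSend(String peer, int payloadBytes) {
        // Nothing to do before the send in this example.
    }

    @Override
    public void afterSend(String peer, int payloadBytes, long elapsedNanos) {
        bytesOut.computeIfAbsent(peer, k -> new AtomicLong()).addAndGet(payloadBytes);
    }

    long bytesSentTo(String peer) {
        AtomicLong total = bytesOut.get(peer);
        return total == null ? 0L : total.get();
    }
}
```

With the hook points in place, an sFlow or Ganglia exporter would just be
another CommunicationMonitor registered via addMonitor, with no need to
touch the communication code itself.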
