Mailing-List: contact common-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-issues@hadoop.apache.org
Date: Thu, 14 Nov 2013 18:59:23 +0000 (UTC)
From: "Luke Lu (JIRA)" <jira@apache.org>
To: common-issues@hadoop.apache.org
Message-ID: <JIRA.12678588.1384181312613.75494.1384455563939@arcas>
In-Reply-To: <JIRA.12678588.1384181312613@arcas>
References: <JIRA.12678588.1384181312613@arcas>
Subject: [jira] [Commented] (HADOOP-10090) Jobtracker metrics not updated
 properly after execution of a mapreduce job
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HADOOP-10090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822772#comment-13822772 ] 

Luke Lu commented on HADOOP-10090:
----------------------------------

bq. I'm not sure where #3 differs from #2.

#3 is an improvement on #2 for the case where cache TTL > regular snapshot interval: JMX gets at least the same freshness as the sinks, even with a longer TTL. That said, #2 appears easier to understand and serves the typical use case (cache TTL < regular snapshot interval) well enough.

bq. JMX will always return complete result, but the sink might miss some changes

Your patch already introduces forceAllMetricsOnSource _after_ TTL expiry; it might be able to eliminate the problem with the following changes.

Comments on the patch:
# forceAllMetricsOnSource doesn't need to be volatile, as it's always read and written in synchronized sections.
# updateJmxCache now copies some of the logic of getMetrics and doesn't work with source metrics filtering (a feature regression). It seems to me that you can still reuse getMetrics by adding a check {{if (!calledWithAll)}} that resets forceAllMetricsOnSource to false, so that the next sink update will be consistent?
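
To illustrate the second point, here is a rough, self-contained sketch of the idea. It is not the actual MetricsSourceAdapter code or the attached patch: ToySource, ToyAdapter and the snapshot method are hypothetical stand-ins, and it assumes that the JMX refresh sets forceAllMetricsOnSource after pulling everything, which is only one reading of the patch. Only the names forceAllMetricsOnSource, calledWithAll, getMetrics and updateJmxCache come from the discussion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ForceAllSketch {
    /** Hypothetical source: reports only changed counters unless 'all' is set. */
    static class ToySource {
        private final Map<String, Long> values = new LinkedHashMap<>();
        private final Map<String, Boolean> changed = new LinkedHashMap<>();

        synchronized void incr(String name) {
            values.merge(name, 1L, Long::sum);
            changed.put(name, true);
        }

        /** Returns the requested metrics and clears every 'changed' flag. */
        synchronized Map<String, Long> snapshot(boolean all) {
            Map<String, Long> out = new LinkedHashMap<>();
            for (Map.Entry<String, Long> e : values.entrySet()) {
                if (all || changed.get(e.getKey())) {
                    out.put(e.getKey(), e.getValue());
                }
                changed.put(e.getKey(), false);
            }
            return out;
        }
    }

    /** Hypothetical adapter showing the suggested reuse of getMetrics. */
    static class ToyAdapter {
        private final ToySource source;
        // Only touched inside synchronized methods, so it is not volatile.
        private boolean forceAllMetricsOnSource = false;

        ToyAdapter(ToySource source) {
            this.source = source;
        }

        synchronized Map<String, Long> getMetrics(boolean all) {
            boolean calledWithAll = all;
            if (forceAllMetricsOnSource) {
                all = true; // a JMX refresh consumed the 'changed' flags
            }
            if (!calledWithAll) {
                // A sink-style call is about to see everything, so reset.
                forceAllMetricsOnSource = false;
            }
            return source.snapshot(all);
        }

        /** Assumed JMX path: reuse getMetrics, then force the next sink call. */
        synchronized Map<String, Long> updateJmxCache() {
            Map<String, Long> everything = getMetrics(true);
            forceAllMetricsOnSource = true;
            return everything;
        }
    }

    public static void main(String[] args) {
        ToySource src = new ToySource();
        ToyAdapter adapter = new ToyAdapter(src);
        src.incr("job_submitted");
        System.out.println("jmx:  " + adapter.updateJmxCache());
        System.out.println("sink: " + adapter.getMetrics(false));
        System.out.println("sink: " + adapter.getMetrics(false));
    }
}
```

In this toy model the first sink call after a JMX refresh still sees job_submitted, even though the refresh already cleared the changed flags; later sink calls go back to reporting only changes. Filtering is left out of the sketch, but because the JMX path goes through getMetrics, any filtering done there would apply to both paths.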


> Jobtracker metrics not updated properly after execution of a mapreduce job
> --------------------------------------------------------------------------
>
>                 Key: HADOOP-10090
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10090
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 1.2.1
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HADOOP-10090.branch-1.patch, OneBoxRepro.png
>
>
> After executing a wordcount mapreduce sample job, jobtracker metrics are not updated properly. Often the response from the jobtracker reports a higher number of job_completed than job_submitted (for example, 8 jobs completed and 7 jobs submitted).
> Issue reported by Toma Paunovic.

