From: "Doug Cutting (JIRA)"
To: hadoop-dev@lucene.apache.org
Date: Thu, 15 Feb 2007 10:09:05 -0800 (PST)
Subject: [jira] Commented: (HADOOP-492) Global counters
Message-ID: <30153863.1171562945829.JavaMail.jira@brutus>
In-Reply-To: <5641100.1156887324032.JavaMail.jira@brutus>

[ https://issues.apache.org/jira/browse/HADOOP-492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12473451 ]

Doug Cutting commented on HADOOP-492:
-------------------------------------

I talked to Owen about this last week. My concerns are:

1. We should only instrument code once, for both counters and monitoring metrics.
2. Users should be able to easily add new counters and metrics to their code that are visible in the JobTracker web UI and/or a separate metrics monitoring system.
3. Counters should be accessible programmatically through JobClient.

One way to implement this would be to implement counters through the metrics API, as I've promoted above. Another approach would be to add a new counter-only API (a subset of metrics features) that routes values to the JobTracker and can also be configured to talk to the metrics system. User code could then decide whether to use the metrics API directly (for non-counter metrics) or to use the counter-only API and get the benefit of the JobTracker-based aggregation built into the MapReduce runtime.

I don't have a strong preference about which implementation strategy is pursued.

> Global counters
> ---------------
>
>                 Key: HADOOP-492
>                 URL: https://issues.apache.org/jira/browse/HADOOP-492
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: arkady borkovsky
>         Assigned To: David Bowen
>
> It would be nice to have a map/reduce job keep aggregated counts for arbitrary events occurring in its tasks -- the number of records processed, the number of exceptions of a specific type, the number of sentences in passive voice, whatever the job finds useful.
> This can be implemented by tasks periodically sending pairs to the jobtracker (in some implementations such messages are piggy-backed on the heartbeats), so that the job tracker stores the latest values from each task and aggregates them on request. It should also make the aggregated values available at the job end. The value for a task would be flushed when the task fails.
> #491 and #490 may be related to this one.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
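
For readers following the thread, the reporting scheme in the quoted issue description (tasks periodically send their counter values, the job tracker keeps the latest values per task and aggregates on request) could be sketched roughly as below. This is only an illustration under stated assumptions, not Hadoop's API: the class names TaskCounters, CounterAggregator and GlobalCountersSketch, and all method names, are hypothetical. It also assumes that "flushed" means a failed task's values are discarded, and it models a heartbeat as a plain method call.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical task-side counter API: user code increments named counters. */
class TaskCounters {
    private final Map<String, Long> values = new HashMap<>();

    synchronized void increment(String name, long amount) {
        values.merge(name, amount, Long::sum);
    }

    /** Copy of the current values, e.g. to piggy-back on a heartbeat. */
    synchronized Map<String, Long> snapshot() {
        return new HashMap<>(values);
    }
}

/** Hypothetical tracker-side store: latest values per task, summed on request. */
class CounterAggregator {
    // Latest reported snapshot, keyed by task id, then by counter name.
    private final Map<String, Map<String, Long>> latestByTask = new HashMap<>();

    /** Replace the stored snapshot for a task with its newest report. */
    synchronized void report(String taskId, Map<String, Long> counters) {
        latestByTask.put(taskId, new HashMap<>(counters));
    }

    /** Assumption: a failed task's values are dropped from the totals. */
    synchronized void taskFailed(String taskId) {
        latestByTask.remove(taskId);
    }

    /** Sum each counter over the latest snapshot of every known task. */
    synchronized Map<String, Long> aggregate() {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> counters : latestByTask.values()) {
            for (Map.Entry<String, Long> e : counters.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }
}

class GlobalCountersSketch {
    public static void main(String[] args) {
        TaskCounters map1 = new TaskCounters();
        map1.increment("records", 3);
        map1.increment("exceptions", 1);

        TaskCounters map2 = new TaskCounters();
        map2.increment("records", 5);

        CounterAggregator tracker = new CounterAggregator();
        tracker.report("map_1", map1.snapshot());
        tracker.report("map_2", map2.snapshot());
        System.out.println(tracker.aggregate()); // {exceptions=1, records=8}

        tracker.taskFailed("map_2");
        System.out.println(tracker.aggregate()); // {exceptions=1, records=3}
    }
}
```

Storing whole snapshots rather than deltas keeps a repeated or late report harmless, since each report simply overwrites the previous one for that task. Routing the same values to a metrics system as well, as the comment suggests, would be a separate sink and is not shown here.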