From: Bart Vandewoestyne
To: user@hadoop.apache.org
Subject: Re: getting counters from specific hadoop jobs
Date: Thu, 23 Oct 2014 16:04:32 +0200
Message-ID: <54490AF0.1060809@gmail.com>
References: <5448E769.8090907@gmail.com>

On 10/23/2014 02:56 PM, Dieter De Witte wrote:
> Maybe you could use job -list or job -history to get a list of the
> jobids and extract it from there?

That was indeed one of the methods I was thinking of, but I cannot think of a reliable way of implementing it. Suppose I start a job with hadoop jar, and I wait until it is finished and then use `mapred job -list all` to somehow find out the job-id of my job that just finished.
Then how do I know which line in the output of `mapred job -list all` corresponds to the job I executed? Even if the job list were sorted by start time, I could not be sure that the most recently started job is mine, because another user could have started a job after me...

A mechanism that easily allows a user to get the job-id of a job he just started would be nice to have. Doesn't this exist?

Maybe grepping through the output of `mapred job -history all` would be the best way to get to the counter information? Unfortunately, I currently cannot test this approach, as I am experiencing the following error:

bart@sandy-quad-1:~$ mapred job -history all /user/bart/terasort/output/0050GB
14/10/23 16:03:12 INFO client.RMProxy: Connecting to ResourceManager at sandy-quad-1.sslab.lan/192.168.35.75:8032
Ignore unrecognized file: 0050GB
Exception in thread "main" java.io.IOException: Unable to initialize History Viewer
        at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.<init>(HistoryViewer.java:90)
        at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:470)
        at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1239)
Caused by: java.io.IOException: Unable to initialize History Viewer
        at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.<init>(HistoryViewer.java:84)
        ... 5 more

:-(

Kind regards,
Bart
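
P.S. One possible way around guessing from `mapred job -list all` is to take the job-id from the client log of the `hadoop jar` run itself, so jobs started by other users cannot be confused with mine. Below is a minimal sketch of that idea, not a tested solution. It assumes the client prints a line of the form "Running job: job_<timestamp>_<sequence>" and that this output has been captured to a string (it may go to stderr). The helper names and the job-id in the sample log are made up for illustration; the counter group and counter name are only examples.

```python
import re

# Assumption: the MapReduce client announces the job with a log line such as
# "... INFO mapreduce.Job: Running job: job_<cluster-timestamp>_<sequence>".
RUNNING_JOB = re.compile(r"Running job: (job_\d+_\d+)")


def extract_job_ids(client_log):
    """Return the job-ids announced in the client log, in order, without duplicates."""
    job_ids = []
    for match in RUNNING_JOB.finditer(client_log):
        job_id = match.group(1)
        if job_id not in job_ids:
            job_ids.append(job_id)
    return job_ids


def counter_command(job_id, group, counter):
    """Build the argument list for `mapred job -counter <job-id> <group> <counter>`."""
    return ["mapred", "job", "-counter", job_id, group, counter]


# Hypothetical captured output; the job-id below is invented for this example.
sample_log = (
    "14/10/23 16:03:12 INFO client.RMProxy: Connecting to ResourceManager at "
    "sandy-quad-1.sslab.lan/192.168.35.75:8032\n"
    "14/10/23 16:03:14 INFO mapreduce.Job: Running job: job_1414000000000_0042\n"
)

job_ids = extract_job_ids(sample_log)
commands = [
    counter_command(j, "org.apache.hadoop.mapreduce.TaskCounter", "MAP_INPUT_RECORDS")
    for j in job_ids
]
```

Each entry in `commands` could then be handed to `subprocess.run` on a machine with the Hadoop client installed. A driver that launches several jobs would yield several job-ids, which is why the helper returns a list.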