Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Sender: Bart Vandewoestyne <bart.vandewoestyne@gmail.com>
From: Bart Vandewoestyne <Bart.Vandewoestyne@telenet.be>
Message-ID: <54490AF0.1060809@gmail.com>
Date: Thu, 23 Oct 2014 16:04:32 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
 rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: user@hadoop.apache.org
Subject: Re: getting counters from specific hadoop jobs
References: <5448E769.8090907@gmail.com>
 <CALSJUsQxSTd0y9QVkciEq+Gf80HB_CugAQTfB=JEH4b5cYbLig@mail.gmail.com>
In-Reply-To: 
 <CALSJUsQxSTd0y9QVkciEq+Gf80HB_CugAQTfB=JEH4b5cYbLig@mail.gmail.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit

On 10/23/2014 02:56 PM, Dieter De Witte wrote:
> Maybe you could use job -list or job -history to get a list of the
> jobids and extract it from there?

That was indeed one of the methods I was thinking of, but I cannot think 
of a reliable way of implementing it.

Suppose I start a job with `hadoop jar`, wait until it has finished, and 
then use `mapred job -list all` to find the job-id of the job that just 
finished.  How do I know which line in the output of 
`mapred job -list all` corresponds to the job I executed?  Even if the 
list were sorted by start time, I could not be sure that the most 
recently started job is mine, because another user could have started a 
job after me...

A mechanism that lets a user easily get the job-id of a job they have 
just started would be nice to have.  Doesn't something like this exist?
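
One possible workaround, sketched here under assumptions rather than 
tested on a cluster: the MapReduce client normally logs a line of the 
form "Running job: job_<timestamp>_<sequence>" when it submits a job, so 
the job-id could be taken from the captured output of `hadoop jar` itself 
instead of from the shared job list.  The function name 
`extract_job_id` and the sample job-id below are hypothetical, and the 
exact log wording may differ between Hadoop versions.

```python
import re
from typing import Optional

# Assumption: the client logs "Running job: job_<timestamp>_<sequence>"
# on submission. Adjust the pattern if your Hadoop version words it differently.
JOB_ID_PATTERN = re.compile(r"Running job: (job_\d+_\d+)")


def extract_job_id(client_output: str) -> Optional[str]:
    """Return the first job-id found in captured `hadoop jar` output, or None."""
    match = JOB_ID_PATTERN.search(client_output)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical sample line; the job-id is made up for illustration.
    sample = (
        "14/10/23 16:03:12 INFO mapreduce.Job: "
        "Running job: job_1414069345432_0007\n"
    )
    print(extract_job_id(sample))
```

Because the id comes from the output of my own submission, jobs started 
by other users in the meantime cannot be confused with it.  The id could 
then be passed to `mapred job -status <job-id>` or 
`mapred job -counter <job-id> <group-name> <counter-name>` to look up the 
counters.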

Maybe grepping through the output of `mapred job -history all` would be 
the best solution to get to the counter information?  Unfortunately, I 
currently cannot test this approach as I am experiencing the following 
error:

bart@sandy-quad-1:~$ mapred job -history all /user/bart/terasort/output/0050GB
14/10/23 16:03:12 INFO client.RMProxy: Connecting to ResourceManager at sandy-quad-1.sslab.lan/192.168.35.75:8032
Ignore unrecognized file: 0050GB
Exception in thread "main" java.io.IOException: Unable to initialize History Viewer
	at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.<init>(HistoryViewer.java:90)
	at org.apache.hadoop.mapreduce.tools.CLI.viewHistory(CLI.java:470)
	at org.apache.hadoop.mapreduce.tools.CLI.run(CLI.java:313)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
	at org.apache.hadoop.mapred.JobClient.main(JobClient.java:1239)
Caused by: java.io.IOException: Unable to initialize History Viewer
	at org.apache.hadoop.mapreduce.jobhistory.HistoryViewer.<init>(HistoryViewer.java:84)
	... 5 more

:-(

Kind regards,
Bart