hadoop-user mailing list archives

From "Joshi, Rekha" <Rekha_Jo...@intuit.com>
Subject Re: How get information messages when a JobControl is used ?
Date Wed, 12 Sep 2012 11:41:21 GMT
Good that webhdfs is sufficient for now, Piter!
The counters are part of o.a.h.mapreduce.Job, so you can get them with job.getCounters() etc.,
or via JobInProgress. Counters are not a JobControl feature as such, so they are not exposed
directly in the JobControl/ControlledJob API.

However, Bertrand's point is an important one: if there are known synchronization or concurrency
issues in JobControl, the values on webhdfs will reflect them too, unless they are unaffected
because the counters are collected within the job's own context. Hmm, well...
Please do keep us updated if the values you see seem incorrect!

Thanks
Rekha

From: Piter85 Piter85 <piter1485@hotmail.it>
Reply-To: <user@hadoop.apache.org>
Date: Wed, 12 Sep 2012 13:12:23 +0200
To: <user@hadoop.apache.org>
Subject: RE: How get information messages when a JobControl is used ?

Hi Rekha and Bertrand! Thanks for the answers! OK, I see that the web interface (_logs->history->job_.....)
has information about the job executions.
I hope this information will be enough for me.

As I said before, after scanning the APIs, the only method I found was ControlledJob.toString().

Bye! :)

Piter

________________________________
Date: Wed, 12 Sep 2012 12:21:24 +0200
Subject: Re: How get information messages when a JobControl is used ?
From: dechouxb@gmail.com
To: user@hadoop.apache.org

But as far as I know there is no way to have a snapshot of the JobControl state.
https://issues.apache.org/jira/browse/MAPREDUCE-3562

I was only trying to get the state of all jobs, and it is not possible to get a consistent
view.
For Map/Reduce progress, I guess you could do the same by digging into the APIs.
But I am afraid you will have the same problems.
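
To illustrate why reading job states piece by piece cannot give a consistent view, here is a minimal Java sketch. ToyJobControl is a hypothetical stand-in written only for this example, not Hadoop's JobControl. It assumes the controller keeps one list per state and that each getter takes the lock separately. The snapshot() method is also an assumption, showing what copying every list under a single lock acquisition would look like.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a job controller; NOT Hadoop's JobControl.
final class ToyJobControl {
    private final List<String> waiting = new ArrayList<>();
    private final List<String> running = new ArrayList<>();

    synchronized void submit(String job) {
        waiting.add(job);
    }

    // Moves a job from WAITING to RUNNING atomically.
    synchronized void start(String job) {
        if (waiting.remove(job)) {
            running.add(job);
        }
    }

    // Each getter is individually thread-safe, but two calls are not atomic together.
    synchronized List<String> getWaiting() {
        return new ArrayList<>(waiting);
    }

    synchronized List<String> getRunning() {
        return new ArrayList<>(running);
    }

    // Assumed API: copies both lists under one lock. Index 0 = waiting, 1 = running.
    synchronized List<List<String>> snapshot() {
        List<List<String>> copy = new ArrayList<>();
        copy.add(new ArrayList<>(waiting));
        copy.add(new ArrayList<>(running));
        return copy;
    }

    // Reads the lists one after another; 'between' simulates a state change
    // that happens after the first read and before the second.
    static int jobsSeenPiecewise(ToyJobControl control, Runnable between) {
        int runningCount = control.getRunning().size();
        between.run();
        int waitingCount = control.getWaiting().size();
        return runningCount + waitingCount;
    }

    // Counts jobs from a single consistent copy.
    static int jobsSeenInSnapshot(ToyJobControl control) {
        List<List<String>> copy = control.snapshot();
        return copy.get(0).size() + copy.get(1).size();
    }

    public static void main(String[] args) {
        ToyJobControl control = new ToyJobControl();
        control.submit("songsort");
        int piecewise = jobsSeenPiecewise(control, () -> control.start("songsort"));
        System.out.println("piecewise reads saw " + piecewise + " job(s)");
        System.out.println("snapshot saw " + jobsSeenInSnapshot(control) + " job(s)");
    }
}
```

In this sketch the job moves from waiting to running between the two reads, so the piecewise view misses it entirely, while the single-lock copy always accounts for it exactly once.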

Regards

Bertrand

On Wed, Sep 12, 2012 at 12:09 PM, Joshi, Rekha <Rekha_Joshi@intuit.com> wrote:
Hi Piter,

JobControl just means there are multiple complex jobs, but you will still see the information for
each job on your Hadoop web interface (webhdfs), wouldn't you?
Or, if that does not work, you might need to use Reporters/Counters to get the log data
in a custom format as needed.

Thanks
Rekha


From: Piter85 Piter85 <piter1485@hotmail.it>
Reply-To: <user@hadoop.apache.org>
Date: Wed, 12 Sep 2012 11:34:27 +0200
To: <user@hadoop.apache.org>
Subject: How get information messages when a JobControl is used ?

Hi! I'm using JobControl (v. 1.0.3) to chain two MapReduce applications. It works and creates
output data, but it doesn't give me back informational messages such as the number of mappers,
the number of input or output records, etc.

It only returns messages like this:
12/09/12 09:56:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/09/12 09:56:38 INFO input.FileInputFormat: Total input paths to process : 4

I tried to use ControlledJob's toString() method, but it returns only this kind of message:

job name:    songsort
job id:    jobctrl1
job state:    RUNNING
job mapred id:    job_201209120942_0005
job message:    just initialized
job has 1 dependeng jobs:
     depending job 0:    canzoni
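
As a stopgap, the text above can be split into fields, for example to pick out the state and the mapred id for a lookup in the web interface. The following Java sketch is only an illustration: ControlledJobText and parse are hypothetical names, and the "key: value" layout is inferred solely from the sample output above, which is not a documented or stable format.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; the line layout is an assumption based on the sample output.
final class ControlledJobText {
    private ControlledJobText() {}

    // Splits each line at its first colon. Lines without a value after the
    // colon (such as the "job has 1 dependeng jobs:" header) are skipped.
    static Map<String, String> parse(String text) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : text.split("\\r?\\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String key = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            if (!key.isEmpty() && !value.isEmpty()) {
                fields.put(key, value);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = String.join("\n",
                "job name:    songsort",
                "job id:    jobctrl1",
                "job state:    RUNNING",
                "job mapred id:    job_201209120942_0005",
                "job message:    just initialized",
                "job has 1 dependeng jobs:",
                "     depending job 0:    canzoni");
        Map<String, String> fields = parse(sample);
        System.out.println(fields.get("job state") + " " + fields.get("job mapred id"));
    }
}
```

Parsing display text like this is fragile, so it is best treated as a temporary workaround until the values are read from the job objects themselves.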

Any idea how to get the remaining information?

Bye!



--
Bertrand Dechoux
