But as far as I know there is no way to have a snapshot of the JobControl state.
I was only trying to get the state of all jobs, and it is not possible to get a consistent view.
For Map/Reduce progress, I guess you could do the same by digging into the APIs.
But I am afraid you will have the same problems.
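For what it's worth, here is a rough sketch of what digging into the APIs for progress could look like. It assumes the new-API classes in org.apache.hadoop.mapreduce.lib.jobcontrol (where ControlledJob lives). The names JobControlProgress and runWithProgress are made up for illustration, and the 5-second polling interval is arbitrary. The list of running jobs can change between two calls, so the consistency caveat still applies.

```java
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class JobControlProgress {

    // Hypothetical helper: runs an already populated JobControl in a
    // background thread and prints map/reduce progress until all jobs finish.
    public static void runWithProgress(JobControl control) throws Exception {
        Thread runner = new Thread(control, "jobcontrol-runner");
        runner.setDaemon(true);
        runner.start();

        while (!control.allFinished()) {
            // Not a consistent snapshot: a job may change state between
            // fetching this list and asking it for its progress.
            for (ControlledJob cj : control.getRunningJobList()) {
                Job job = cj.getJob();
                System.out.printf("%s: map %.0f%%, reduce %.0f%%%n",
                        cj.getJobName(),
                        job.mapProgress() * 100,
                        job.reduceProgress() * 100);
            }
            Thread.sleep(5000L); // arbitrary polling interval
        }
        control.stop(); // lets the JobControl thread exit
    }
}
```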
JobControl just means there are multiple chained jobs, but you will still see the information for each job on your Hadoop web interface (webhdfs), won't you?
Or, if that does not work, you might need to use Reporters/Counters to get the information in a custom format as needed.
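A minimal sketch of the Counters route, assuming the new-API JobControl and that it has already finished running. The names JobControlCounters and printCounters are hypothetical. As far as I know, the built-in counters include things like map input records and reduce output records, but which groups you actually see depends on your version and jobs.

```java
import org.apache.hadoop.mapreduce.Counter;
import org.apache.hadoop.mapreduce.CounterGroup;
import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class JobControlCounters {

    // Hypothetical helper: prints every counter of each successful job and
    // the failure message of each failed job in a finished JobControl.
    public static void printCounters(JobControl control) throws Exception {
        for (ControlledJob cj : control.getSuccessfulJobList()) {
            System.out.println("Counters for " + cj.getJobName());
            Counters counters = cj.getJob().getCounters();
            for (CounterGroup group : counters) {
                for (Counter counter : group) {
                    System.out.printf("  %s / %s = %d%n",
                            group.getDisplayName(),
                            counter.getDisplayName(),
                            counter.getValue());
                }
            }
        }
        for (ControlledJob cj : control.getFailedJobList()) {
            System.out.println(cj.getJobName() + " failed: " + cj.getMessage());
        }
    }
}
```

You would call printCounters(control) once control.allFinished() returns true.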
Hi! I'm using JobControl (v. 1.0.3) to chain two MapReduce applications. It works and creates output data, but it doesn't give me back informational messages such as the number of mappers, the number of input or output records, etc.
It only returns messages like this:
12/09/12 09:56:38 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/09/12 09:56:38 INFO input.FileInputFormat: Total input paths to process : 4
I tried to use ControlledJob's toString() method, but it only returns this kind of message:
job name: songsort
job id: jobctrl1
job state: RUNNING
job mapred id: job_201209120942_0005
job message: just initialized
job has 1 dependeng jobs:
depending job 0: canzoni
Any idea how to get the rest of the information?