hadoop-common-user mailing list archives

From Amareshwari Sri Ramadasu <amar...@yahoo-inc.com>
Subject Re: log
Date Thu, 01 Apr 2010 04:26:36 GMT
In addition to the history that the JobTracker maintains in ${hadoop.log.dir}/logs/history,
in branch 0.20 the job history is also available in a user location. The user location can
be set with the configuration property "hadoop.job.history.user.location". By default, if
nothing is specified for this property, the history is created in the output directory of
the job. The user history can be disabled by setting the property to "none".
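
The rules above can be sketched as a small decision function. This is only an illustration of the behaviour described in this message, not Hadoop's own code: the function name, the use of a plain dict in place of a job configuration, and the example paths are all hypothetical.

```python
from typing import Optional

# Property name taken from the message above.
HISTORY_KEY = "hadoop.job.history.user.location"


def user_history_base(conf: dict, output_dir: Optional[str]) -> Optional[str]:
    """Return the directory under which user job history would be written,
    or None if no user history is expected (hypothetical helper)."""
    location = conf.get(HISTORY_KEY)
    if location == "none":
        # User history explicitly disabled.
        return None
    if location:
        # An explicit user location takes precedence over the output directory.
        return location
    # Nothing configured: fall back to the job's output directory,
    # which may itself be missing (None) for jobs without output.
    return output_dir


# Example paths are made up for illustration.
print(user_history_base({}, "/user/gang/out"))
print(user_history_base({HISTORY_KEY: "none"}, "/user/gang/out"))
print(user_history_base({}, None))
```

Under these assumptions, a job with neither an output directory nor an explicit location ends up with no user history at all.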

Gang, if you are not seeing the history for some of your jobs, there could be a couple of
reasons:

 1.  Your job does not have an output directory. In that case you can specify a different
location for the user history.
 2.  Job history got disabled because of a problem with the job's configuration. You can
check the JobTracker logs to verify whether the history got disabled.


On 4/1/10 12:13 AM, "Gang Luo" <lgpublic@yahoo.com.cn> wrote:

Thanks Abhishek.
But I see that some of my job outputs have no such _log directory. Actually, I run a script
that launches 100+ jobs, and I didn't find the log in any of the outputs. Any ideas?


----- Original Message ----
From: abhishek sharma <absharma@usc.edu>
To: common-user@hadoop.apache.org
Sent: 2010/3/31 (Wed) 1:15:48 PM
Subject: Re: log


In the log/history directory, two files are created for each job: an
XML file that records the configuration, and a second file that holds
the log entries. These log entries have all the information about the
individual map and reduce tasks related to a job--which nodes they ran
on, duration, input size, etc.
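
As a rough sketch of how those log entries could be mined for node names, the snippet below assumes each entry is a record type followed by KEY="value" pairs, and that task attempt records carry TASK_ATTEMPT_ID and HOSTNAME fields. That layout, the record type names, and the sample lines are assumptions for illustration and are not confirmed anywhere in this thread; check them against a real history file first.

```python
import re

# Matches KEY="value" pairs, allowing backslash-escaped characters in the value.
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_line(line):
    """Split one history line into (record_type, {key: value})."""
    record_type, _, rest = line.strip().partition(" ")
    return record_type, dict(PAIR.findall(rest))


def hosts_by_attempt(lines):
    """Map each task attempt id to the host it ran on.

    The record and field names used here are assumed, not taken from the thread.
    """
    hosts = {}
    for line in lines:
        record_type, fields = parse_line(line)
        if record_type not in ("MapAttempt", "ReduceAttempt"):
            continue
        attempt = fields.get("TASK_ATTEMPT_ID")
        host = fields.get("HOSTNAME")
        if attempt and host:
            hosts[attempt] = host
    return hosts


# Made-up sample lines in the assumed format.
sample = [
    'Job JOBID="job_1" .',
    'MapAttempt TASK_TYPE="MAP" TASK_ATTEMPT_ID="attempt_1" HOSTNAME="node-a" .',
    'ReduceAttempt TASK_TYPE="REDUCE" TASK_ATTEMPT_ID="attempt_2" HOSTNAME="node-b" .',
]
print(hosts_by_attempt(sample))
```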

A single log/history directory is created by Hadoop and files related
to all the jobs executed are stored there.


On Tue, Mar 30, 2010 at 8:50 PM, Gang Luo <lgpublic@yahoo.com.cn> wrote:
> Hi all,
> I find there is a directory "_log/history/..." under the output directory of a mapreduce
> job. Is the file in that directory a log file? Is the information there sufficient to allow
> me to figure out what nodes the job runs on? Besides, not every job has such a directory.
> Are there settings controlling this? Or are there other ways to get the nodes my job runs
> on?
> Thanks,
> -Gang
