hadoop-common-user mailing list archives

From Hongbin Huang <huang_hong_...@yahoo.com.cn>
Subject Re: logging on Hadoop
Date Sun, 22 Apr 2007 04:51:22 GMT
Hi all,
I am evaluating a 5-node Nutch 0.9/Hadoop 0.12.2 cluster (4GB memory, 2x AMD 64-bit CPUs, 1.5TB storage)
in China; we are starting a vertical search engine project. After some more testing, we plan
to scale it to 50 PC-based nodes shortly.

We have been running into similar logging issues.

My fix is to add one line, "hadoop.log.file=hadoop.log.file", to the log4j.properties file,
near the "# Daily Rolling File Appender" section.
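For reference, here is a sketch of where that line could go. The appender names and the File
setting below are taken from what I believe a stock Hadoop 0.12 log4j.properties looks like;
your copy may differ:

```properties
# Give hadoop.log.file a default so the substitution below never
# resolves to a bare directory path when the property is unset.
hadoop.log.file=hadoop.log.file

# Daily Rolling File Appender
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
```

With the default in place, the appender writes to a real file (named "hadoop.log.file" in
this case) instead of failing on a directory.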

After this, the following errors were gone:
java.io.FileNotFoundException: / (Is a directory)
at java.io.FileOutputStream.openAppend(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:177)
at java.io.FileOutputStream.<init>(FileOutputStream.java:102)
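As far as I can tell, the cause is that log4j expands an undefined ${hadoop.log.dir} and
${hadoop.log.file} to empty strings in the child JVM, so the appender's file path collapses
to "/" and log4j tries to append to a directory. A minimal sketch reproducing the error
(the variable names here are just illustrative):

```java
import java.io.FileOutputStream;
import java.io.IOException;

public class DirAppendDemo {
    public static void main(String[] args) {
        // With hadoop.log.dir and hadoop.log.file both undefined in the
        // child JVM, "${hadoop.log.dir}/${hadoop.log.file}" collapses to
        // just "/", and the appender then tries to open a directory.
        String logDir = "";   // undefined ${hadoop.log.dir} -> empty string
        String logFile = "";  // undefined ${hadoop.log.file} -> empty string
        String path = logDir + "/" + logFile;  // "/"
        try (FileOutputStream out = new FileOutputStream(path, true)) {
            out.write(0);
        } catch (IOException e) {
            // On Linux this prints:
            // java.io.FileNotFoundException: / (Is a directory)
            System.out.println(e);
        }
    }
}
```

Defining hadoop.log.file in log4j.properties gives the substitution a fallback, which is why
the one-line fix above makes the exception go away.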

Please try this. Hope this works for you also.


Hi Mathijs,

On Fri, Apr 20, 2007 at 09:01:58PM +0200, Mathijs Homminga wrote:
>Hi all,
>I'm a bit confused by the way logging works on Hadoop.
>In short, my question is: where does the log from my Nutch plugins end 
>up when running on Hadoop?
>I'm running Nutch 0.9 on Hadoop 0.12.2.
>When I run my code on a single machine I can see that the log ends up in 
>${hadoop.log.dir}/${hadoop.log.file}, as defined in the log4j.properties 
>file (Nutch and my plugins use commons-logging).
>But when I use Hadoop, I can't find any logfile which contains log 
>entries generated by the (map|reduce) tasks. However, I do find logfiles 
>which contain log from Hadoop-related classes (like the tasktracker, 
>jobtracker etc).

The logs from the map/reduce tasks go into the ${hadoop.log.dir}/userlogs/${taskid} directory.
They are in a specific format that aids browsing through the web UI (steps: JobTracker ->
job -> task -> task logs).


>I first thought it had something to do with HADOOP-406 
>(http://issues.apache.org/jira/browse/HADOOP-406) which is about the 
>fact that environment parameters passed to the parent JVM (like 
>'hadoop.log.file') are not passed to the child JVM's. But even when I 
>specify my log file explicitly (without the use of environment vars) in 
>the log4j.properties, I still see no log entries other than the Hadoop
>ones.
>Any clues?
