hadoop-common-dev mailing list archives

From "Raghu Angadi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3110) NameNode does logging in critical sections just about everywhere
Date Thu, 27 Mar 2008 22:26:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582814#action_12582814 ]

Raghu Angadi commented on HADOOP-3110:
--------------------------------------

> When there are no DFS clients, it's lightening fast, but add a few Map/Red jobs and the
> thing really, really slows down.

Are you implying this was traced to log messages on the NameNode? Of course, there are a lot of
improvements to other parts of the NameNode all the time. I thought _this_ jira was about log
messages. There is so much low-hanging fruit in Hadoop/HDFS w.r.t. performance :)

Could you try an experiment where log4j is configured not to write anywhere, to see if there
is any noticeable improvement?

Log messages are included because they are often the only way to debug a problem. We obviously
need to strike a balance between complexity, maintainability, and benefits. So the question
here is: how much does this save?
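
The experiment suggested above could be sketched roughly as follows. This is only an
illustration: it uses java.util.logging from the JDK as a stand-in for log4j so that it runs
without extra jars, and LoggingExperiment, DiscardingHandler, and this addStoredBlock are
hypothetical names, not Hadoop code. It times the same synchronized method with a handler
that writes to a file and with one that writes nowhere.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

final class LoggingExperiment {
    private static final Logger LOG = Logger.getLogger("experiment.namenode");
    private static long blocks = 0;

    /** Handler that accepts every record and writes it nowhere. */
    static final class DiscardingHandler extends Handler {
        @Override public void publish(LogRecord record) { }
        @Override public void flush() { }
        @Override public void close() { }
    }

    // Hypothetical stand-in for a synchronized method that logs once per block.
    private static synchronized void addStoredBlock(long blockId) {
        blocks++;
        LOG.info("addStoredBlock: block " + blockId + " stored");
    }

    /** Runs the synchronized method and returns how many blocks were stored. */
    static long run(int iterations, boolean writeToFile) {
        Path file = null;
        Handler handler;
        try {
            if (writeToFile) {
                file = Files.createTempFile("namenode-experiment", ".log");
                handler = new FileHandler(file.toString());
                handler.setFormatter(new SimpleFormatter());
            } else {
                handler = new DiscardingHandler();
            }
            LOG.setUseParentHandlers(false); // keep output off the console
            LOG.addHandler(handler);
            blocks = 0;
            try {
                for (int i = 0; i < iterations; i++) {
                    addStoredBlock(i);
                }
            } finally {
                LOG.removeHandler(handler);
                handler.close();
                if (file != null) {
                    Files.deleteIfExists(file);
                }
            }
            return blocks;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (boolean writeToFile : new boolean[] {true, false}) {
            long start = System.nanoTime();
            long stored = run(100_000, writeToFile);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println("writeToFile=" + writeToFile
                    + " stored=" + stored + " ms=" + millis);
        }
    }
}
```

Note that the message string is still built in both runs, so the difference between the two
timings reflects only the cost of formatting and writing the record, not of constructing it.
A single-threaded loop also says nothing about lock contention, so a real measurement would
need concurrent callers.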


> NameNode does logging in critical sections just about everywhere
> ----------------------------------------------------------------
>
>                 Key: HADOOP-3110
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3110
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.14.0, 0.14.1, 0.14.2, 0.14.3, 0.14.4, 0.15.0, 0.15.1, 0.15.2,
>                      0.15.3, 0.16.0, 0.16.1
>         Environment: All
>            Reporter: Pete Wyckoff
>
> e.g., FSNameSystem.addStoredBlock (but almost every method has logging in its critical
> sections)
> This method is synchronized and it's spitting something out to Log.info every block stored.
> Normally not a big deal, but since this is in the name node and these are critical sections...
> We shouldn't even do any logging at all in critical sections, so even the info and warn
> are bad. But, in many places in the code, it would be hard to tease these out (although eventually
> they really should be), but the system could start using something like an AsyncAppender and
> see how it improves performance.
> Even though the log may have a buffer, the writing and doing the formatting and stuff
> cause a drag on performance with 100s/1000s of machines trying to talk to the name node.
> At a minimum, the most often triggered Log.info could be changed to Log.debug.
> for reference: http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/AsyncAppender.html
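
The two ideas in the description above (keep logging out of the critical section, and hand
messages to a background writer) can be sketched as below. This is a minimal illustration
under stated assumptions, not log4j's AsyncAppender and not Hadoop code: AsyncLogSketch,
AsyncLog, and BlockMap are hypothetical names, and the bounded queue simply drops messages
when it is full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

final class AsyncLogSketch {

    /** Hands messages to a background thread so callers never wait on I/O. */
    static final class AsyncLog implements AutoCloseable {
        // Unique sentinel object, compared by identity, that stops the writer.
        private static final String STOP = new String("STOP");
        private final BlockingQueue<String> queue;
        private final Thread writer;

        AsyncLog(int capacity, Consumer<String> sink) {
            BlockingQueue<String> q = new ArrayBlockingQueue<>(capacity);
            this.queue = q;
            this.writer = new Thread(() -> {
                try {
                    for (String m = q.take(); m != STOP; m = q.take()) {
                        sink.accept(m); // the slow part runs on this thread only
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "async-log-writer");
            this.writer.setDaemon(true);
            this.writer.start();
        }

        /** Non-blocking; returns false (message dropped) if the buffer is full. */
        boolean info(String message) {
            return queue.offer(message);
        }

        /** Drains the remaining messages and waits for the writer to finish. */
        @Override public void close() {
            try {
                queue.put(STOP);
                writer.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Hypothetical stand-in for a class with a hot synchronized method. */
    static final class BlockMap {
        private final AsyncLog log;
        private long stored = 0;

        BlockMap(AsyncLog log) {
            this.log = log;
        }

        void addStoredBlock(long blockId) {
            long total;
            synchronized (this) { // critical section: state change only
                total = ++stored;
            }
            // Formatting and enqueueing happen after the lock is released.
            log.info("block " + blockId + " stored, total=" + total);
        }
    }

    /** Stores n blocks and returns the lines the background writer received. */
    static List<String> demo(int n) {
        List<String> lines = new ArrayList<>();
        try (AsyncLog log = new AsyncLog(1024, lines::add)) {
            BlockMap map = new BlockMap(log);
            for (int i = 0; i < n; i++) {
                map.addStoredBlock(i);
            }
        }
        return lines; // safe to read: close() joined the writer thread
    }

    public static void main(String[] args) {
        demo(5).forEach(System.out::println);
    }
}
```

The trade-off in this sketch is that a full queue loses messages rather than stalling
callers; blocking instead would preserve every message at the cost of reintroducing a wait.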

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

