hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-6884) Add LOG.isDebugEnabled() guard for each LOG.debug("...")
Date Wed, 25 Aug 2010 23:55:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-6884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902709#action_12902709

Konstantin Shvachko commented on HADOOP-6884:

h3. This issue

Based on what is said in this jira and discussions with developers I'd like to make the following
There are 3 main goals the change should be targeting for:
# *debug() statement optimization*: do not calculate the arguments if they are not going to
be logged.
# *code readability*: avoid if-debug-enabled statement before logging.
# *new code verification*: provide an automatic procedure or simple rules to verify that the
newly introduced code does not introduces that same inefficiencies.

My approach is a variant of Doug's proposal and SLF4J formatting.
# Let's create a new class {{org.apache.hadoop.log.Logger}}
We will have only static methods in the class for now:
class Logger {
  // Optimized logging
  static void logDebug(Log log, Object a1);
  static void logDebug(Log log, Object a1, Object a2);
  static void logDebug(Log log, Object a1, Object a2, Object a3);
  static void logDebug(Log log, Object a1, Object a2, Object a3, Object a4);

  // not efficient, do not use on performance critical paths
  static void logDebug(Log log, Object... args);
  static void logDebug(Log log, String formatter, Object... args);
  static void logDebug(Log log, Throwable e, Object... args);
# Each statement
{{Log.debug(a1 + a2 + ...)}} 
in current code is replaced by one of the methods above:
{{Logger.logDebug(Log, a1, a2, ...)}}
Looking at our code, about 90% of debug calls will be replaced by the first 4 methods,
which do not compute the arguments and do not instantiate an array object.
Most of the rest debug calls will be converted into a {{logDebug(Log log, Object... args)}}
Which is less efficient, but still faster than building a string.
The remaining calls can be transformed into "formatter" version. We should make sure these
are not
performance-critical parts of the code, or otherwise rewrite the messages to be able to call
of the optimized versions of {{logDebug()}}.
Autoboxing of primitive types should be ok with these less efficient versions of {{logDebug()}}.
# The implementation of {{logDebug()}} should use optimized string building tools:
  static void logDebug(Log log, Object a1, Object a2) {
    StringBuilder str = new StringBuilder().append(a1).append(a2);
# The JavaDoc for {{logDebug()}} methods should state which methods are optimized, and 
what to avoid when calling them, like primitive type autoboxing, string concatenation, etc.

This will give us all three goals above, up to a point. Efficiency and readability should
be obvious. 
The tool to verify should be checking that there are no "naked" calls of ".debug" anywhere
in the Logger. And that calls of {{logDebug()}} do not contain "+" operations.

This is not the ideal solution, but it is practical imo.

h3. Future directions

In the future we can turn {{org.apache.hadoop.log.Logger}} into a log wrapper by converting
the static methods into non static, and adding methods logInfo(), logError(), logWarn().
This should optimize logging in Hadoop in general. Non optimal string building in info-level
logging hearts the performance right now as I write it.
The idea is to replace {{org.apache.commons.logging.Log}} with {{org.apache.hadoop.log.Logger}}.
{{logging.Log}} will become a member of the {{Logger}}.
This is a bigger change and should be done in a different jira.

> Add LOG.isDebugEnabled() guard for each LOG.debug("...")
> --------------------------------------------------------
>                 Key: HADOOP-6884
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6884
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Erik Steffl
>            Assignee: Erik Steffl
>             Fix For: 0.22.0
>         Attachments: FunAgain.java, FunAgain.java, HADOOP-6884-0.22-1.patch, HADOOP-6884-0.22.patch
> Each LOG.debug("...") should be executed only if LOG.isDebugEnabled() is true, in some
cases it's expensive to construct the string that is being printed to log. It's much easier
to always use LOG.isDebugEnabled() because it's easier to check (rather than in each case
reason whether it's necessary or not).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message