hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3439) TaskTracker.addDiagnostics(String file, int num, String tag) could exit early if num==0
Date Fri, 23 May 2008 17:35:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12599442#action_12599442 ]

Doug Cutting commented on HADOOP-3439:
--------------------------------------

> loads in a conf option (that is not in hadoop-default, incidentally) 

The rule for whether a parameter belongs in hadoop-default.xml is whether it is intended
to be overridden in hadoop-site.xml.  Many parameters are intended to be set only by code;
adding these to hadoop-default.xml just clutters what's primarily meant to be documentation.
Parameters meant to be set only by code should have static accessor methods on a relevant
class, e.g., Foo#setFoo(Configuration c, String value).  It's also reasonable to leave
debugging parameters out of hadoop-default.xml when they are intended only for use by
developers, not by end users.

That's been the (unwritten?) policy.  Does it make sense?  If so, perhaps we should record
it somewhere...
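
As an illustration of the accessor convention described above, here is a minimal sketch for the mapred.debug.out.lines parameter. The class name DebugOutConfig and the two accessor names are hypothetical, and the nested Configuration class is only a stand-in so the example compiles on its own; real code would use org.apache.hadoop.conf.Configuration. The key name and the default of -1 are taken from the issue description quoted below.

```java
import java.util.HashMap;
import java.util.Map;

public class DebugOutConfig {

    /** Stand-in for Hadoop's Configuration (assumption: only setInt/getInt are needed here). */
    static class Configuration {
        private final Map<String, String> props = new HashMap<>();

        void setInt(String name, int value) {
            props.put(name, Integer.toString(value));
        }

        int getInt(String name, int defaultValue) {
            String v = props.get(name);
            return v == null ? defaultValue : Integer.parseInt(v);
        }
    }

    // Key and default as given in the issue description.
    static final String DEBUG_OUT_LINES = "mapred.debug.out.lines";
    static final int DEFAULT_DEBUG_OUT_LINES = -1;

    /** Hypothetical static setter: callers never spell out the key themselves. */
    static void setDebugOutLines(Configuration conf, int lines) {
        conf.setInt(DEBUG_OUT_LINES, lines);
    }

    /** Hypothetical static getter: the default lives in code, not in hadoop-default.xml. */
    static int getDebugOutLines(Configuration conf) {
        return conf.getInt(DEBUG_OUT_LINES, DEFAULT_DEBUG_OUT_LINES);
    }

    public static void main(String[] args) {
        Configuration conf = new Configuration();
        System.out.println(getDebugOutLines(conf)); // unset, so the default is returned
        setDebugOutLines(conf, 0);
        System.out.println(getDebugOutLines(conf)); // value set by code
    }
}
```

Keeping the key string and its default next to the accessors means the parameter is documented where it is used, without adding an entry to hadoop-default.xml.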

> TaskTracker.addDiagnostics(String file, int num, String tag) could exit early if num==0
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3439
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3439
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0
>            Reporter: Steve Loughran
>            Priority: Minor
>
> When a TaskTracker job finishes, taskFinished() is invoked. As part of its work it
>  1. loads in a conf option (that is not in hadoop-default, incidentally), mapred.debug.out.lines,
>     default value -1;
>  2. calls addDiagnostics, passing in that line count.
> addDiagnostics either builds a string buffer of all the output, or creates a linear array
> of lines and adds them, shuffling them up if there are more lines than expected.
> This is all unneeded if the number of lines to print == 0; the entire read of the
> output file can be skipped. This may speed up termination slightly on a run with a large
> output file and mapred.debug.out.lines == 0.
> Note also that a circular buffer would handle the lines > 0 problem without having to
> copy all the strings around.
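
The two suggestions in the description above (return early when the line count is 0, and use a circular buffer when it is positive) could look roughly like the sketch below. This is not the TaskTracker code: the class TailDiagnostics and the method tail are hypothetical, and treating a negative count as "keep all lines" is an assumption based on the default of -1.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class TailDiagnostics {

    /**
     * Returns the last num lines of source, each terminated by '\n'.
     * num == 0 returns "" without reading anything;
     * num < 0 returns every line (assumed meaning of the -1 default).
     */
    static String tail(Reader source, int num) {
        if (num == 0) {
            return ""; // early exit: the input is never touched
        }
        try {
            BufferedReader in = new BufferedReader(source);
            StringBuilder out = new StringBuilder();
            String line;
            if (num < 0) {
                while ((line = in.readLine()) != null) {
                    out.append(line).append('\n');
                }
                return out.toString();
            }
            // Circular buffer: each new line overwrites the oldest slot,
            // so no existing entries are shifted or copied.
            String[] ring = new String[num];
            long seen = 0;
            while ((line = in.readLine()) != null) {
                ring[(int) (seen % num)] = line;
                seen++;
            }
            // Emit the retained lines from oldest to newest.
            for (long i = Math.max(0, seen - num); i < seen; i++) {
                out.append(ring[(int) (i % num)]).append('\n');
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Keeps only the last two of four lines.
        System.out.print(tail(new StringReader("a\nb\nc\nd\n"), 2));
    }
}
```

The sketch allocates the ring up front, so a very large positive count would need a cap or a growable structure in real code.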

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

