hadoop-common-dev mailing list archives

From "Amareshwari Sri Ramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1857) Ability to run a script when a task fails to capture stack traces
Date Wed, 10 Oct 2007 05:30:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533606 ]

Amareshwari Sri Ramadasu commented on HADOOP-1857:
--------------------------------------------------


Both stdout and stderr of the debug script can be redirected to debugout.
And we don't need $jobconf in the command; we should have $syslog.
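
For illustration, here is a minimal job-side sketch, assuming the setMapDebugScript/setReduceDebugScript JobConf methods and the variable substitution ($stdout, $stderr, $syslog) being discussed in this issue; the exact names and substitution behavior may differ in the committed patch.

{code}
// Minimal sketch, assuming the JobConf debug-script methods and the
// $stderr/$stdout/$syslog substitution discussed in this issue.
import org.apache.hadoop.mapred.JobConf;

public class DebugScriptExample {
  public static void main(String[] args) {
    JobConf conf = new JobConf(DebugScriptExample.class);
    // Run when a map/reduce task fails; the framework substitutes the
    // task's stderr path for $stderr, and both streams of the command
    // itself are captured in debugout.
    conf.setMapDebugScript("tail -20 $stderr");
    conf.setReduceDebugScript("tail -20 $stderr");
    // ... set mapper/reducer classes, input/output paths, and submit.
  }
}
{code}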

bq. Doesn't the "file" command when run on a core file give the executable name? Why does
the executable need to be in the current working directory? That doesn't sound right.

Here, the executable has a symlink in the current working directory. We need a
symlink in the current working directory; otherwise we would need to get the
executable's path from the framework and substitute it wherever the command
needs it. I feel a symlink is better than finding and substituting the path,
since that would require a placeholder like @program@ to replace. The '#' is
added in the config files for the pipes examples to create the symlink.
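
To make the '#' fragment concrete, here is a sketch of how a pipes executable could be cached with a symlink in the task's working directory via the DistributedCache API; the HDFS path and the "program" link name are illustrative, not taken from the patch.

{code}
// Sketch: the text after '#' in the cache URI becomes the name of a
// symlink created in the task's current working directory.
import java.net.URI;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.mapred.JobConf;

public class CacheSymlinkExample {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf();
    // Enable symlink creation in the task's working directory.
    DistributedCache.createSymlink(conf);
    // Path and link name are illustrative; "#program" makes the cached
    // executable reachable as ./program from the task cwd.
    DistributedCache.addCacheFile(
        new URI("hdfs://host:9000/apps/pipes/wordcount#program"), conf);
    // A debug command like gdb can then refer to ./program without
    // knowing the executable's real path.
  }
}
{code}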

bq. In terms of the parameters, it just seems like the script should have a single interface
rather than supporting a bunch of variables that the user can put together. 

The interface we have now supports both submitting a plain command (without
adding any cache file) and submitting a script file if the user wants.
For pipes, we are adding the default debug command as 'gdb <program> -c <core>
-x <cmd-file>'. If we move this into a script, the script file has no way to
know <program>, so <program> would have to be passed as an argument to the
script. I feel the interface we have now gives the user more flexibility than a
single interface would.
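
As a sketch of that flexibility, under the same assumptions as above (the setMapDebugScript method, the variable substitution described in this comment, and the "program" symlink from the '#' fragment), a pipes user could either rely on the default gdb command or compose their own; "debug.sh" and its argument order here are hypothetical.

{code}
import org.apache.hadoop.mapred.JobConf;

public class PipesDebugExample {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    // Default per this comment: the framework itself runs
    //   gdb <program> -c <core> -x <cmd-file>
    // where "program" is the symlink created via the '#' fragment.
    // Custom alternative: ship debug.sh via the distributed cache and
    // pass the program name explicitly, since a standalone script
    // cannot know it (script name and arguments are illustrative).
    conf.setMapDebugScript("debug.sh program $stderr $syslog");
  }
}
{code}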


> Ability to run a script when a task fails to capture stack traces
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1857
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Amareshwari Sri Ramadasu
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.15.0
>
>         Attachments: patch-1857.txt, patch-1857.txt, patch-1857.txt, patch-1857.txt, patch1857.txt, tt-no-warn.patch
>
>
> This basically is for providing a better user interface for debugging failed
> jobs. Today we see stack traces for failed tasks on the job UI if the job
> happens to be a Java MR job. For non-Java jobs like Streaming and Pipes, the
> diagnostic info on the job UI is not helpful enough to debug what might have
> gone wrong; it usually consists of framework traces, not app traces.
> We want to provide a facility, via user-provided scripts, for doing
> post-processing on task logs, input, output, etc. There should be some default
> scripts, such as running core dumps under gdb to locate illegal instructions,
> printing the last few lines from stderr, etc. These outputs could be sent to
> the tasktracker and in turn to the jobtracker, which would then display them
> on the job UI on demand.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

