hadoop-common-dev mailing list archives

From "Amareshwari Sri Ramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1857) Ability to run a script when a task fails to capture stack traces
Date Mon, 08 Oct 2007 12:16:52 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533110 ]

Amareshwari Sri Ramadasu commented on HADOOP-1857:
--------------------------------------------------

bq. You need to fix the new find bugs warning.
The warning is harmless. Maybe we will suppress it.

bq. I can't see the need to have different debug scripts for mappers and reducers.
We need two scripts, since the mapper and reducer code are entirely different. Often we
need to debug only one of them. For example, streaming will have two different scripts for
the mapper and the reducer, and users would like to debug them separately.

bq. I think all of the output (stdout and stderr) from the debug script should be put together
when it is stored on the task tracker.
This can be done by concatenating the files if we want. But redirection in the command is
not possible, since we don't know the order.
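As a rough illustration of the concatenation option, here is a minimal Python sketch. It is not code from the patch: the function name and the stdout-then-stderr ordering are assumptions made only for this example.

```python
import shutil

def concatenate_outputs(stdout_path, stderr_path, combined_path):
    """Write the contents of stdout_path, then stderr_path, into combined_path.

    The stdout-first ordering is an arbitrary choice for this sketch.
    """
    with open(combined_path, "wb") as combined:
        for part in (stdout_path, stderr_path):
            with open(part, "rb") as source:
                # Stream the file in chunks so large outputs are not held in memory.
                shutil.copyfileobj(source, combined)
```

Note that concatenating after the fact keeps each stream intact but does not recover how the two streams were interleaved while the script ran.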

bq. I don't think adding the concept of executable to the file cache is appropriate. It is
basically compensating for the lack of permissions in hdfs, which will be addressed more directly.
In the mean time, I think that all files coming out of the cache should have the "x" permission
set. Note that pipes and streaming already do this...
OK, this can be done. Should we then create symlinks for all files?
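For illustration only, setting the "x" permission on a localized file could look like the Python sketch below. The function name is hypothetical, and adding execute for user, group and others (rather than only the owner) is an assumption of this example, not something decided in this issue.

```python
import os
import stat

def make_executable(path):
    """Add the execute bit for user, group and others, keeping all other bits."""
    mode = os.stat(path).st_mode
    # Assumption: "x" is granted to everyone; a real implementation might
    # restrict it to the owner.
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```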

bq. Why were the config files for the pipes examples changed to add the "#" part of the url?
To run the gdb script by default, we need the C++ executable to be present in the current
working directory, so we need a symlink to the executable.
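To make the role of the "#" part concrete, the sketch below treats the URL fragment as the name of a symlink to create in the working directory. This is a simplified Python illustration, not the file cache code; the function names, the host name and the paths in the usage note are made up.

```python
import os
from urllib.parse import urlparse

def link_name_from_cache_uri(uri):
    """Return the part after '#', or None if the URI has no fragment."""
    fragment = urlparse(uri).fragment
    return fragment or None

def symlink_into_workdir(local_path, uri, workdir):
    """Create workdir/<fragment> pointing at local_path; return the link path.

    Returns None, and creates nothing, when the URI has no '#' part.
    """
    name = link_name_from_cache_uri(uri)
    if name is None:
        return None
    link = os.path.join(workdir, name)
    os.symlink(local_path, link)
    return link
```

With a hypothetical URI such as hdfs://namenode:9000/apps/wordcount#wordcount, the link would be named "wordcount", which is where a default gdb script could look for the executable.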

bq. Rather than let the user specify a command line that has a bunch of (undocumented) @variables,
I think it would be better to always use the same parameters: basically, something like: $script
@stdout@ @stderr@ @jobconf@ and let the script find the core file if it cares.

We now use @stdout@, @stderr@, @syslog@ and @core@ in the command.
Since pipes has a default gdb script that needs the core file, we can keep that code. It is
a convenience for the user. If you insist, we can remove it.
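As a sketch of how such a command line might be expanded, the Python below replaces the four placeholders named above with file paths. It is only an illustration under assumptions: the function name, the script name and the paths are hypothetical, and shell-quoting the paths is a choice made for this example rather than a description of the patch.

```python
import re
import shlex

# The four placeholder names come from the comment above.
TOKEN = re.compile(r"@(stdout|stderr|syslog|core)@")

def expand_command(template, paths):
    """Replace each @name@ token in template with the shell-quoted path for name.

    Substitution happens in a single pass, so a path that itself contains
    something like "@core@" is not expanded a second time.
    """
    return TOKEN.sub(lambda match: shlex.quote(paths[match.group(1)]), template)
```

For example, expanding "./debug.sh @stdout@ @stderr@ @core@" with made-up paths under /tmp/task would give "./debug.sh /tmp/task/stdout /tmp/task/stderr /tmp/task/core". A placeholder that the template does not use needs no entry in paths.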

bq. By default the entire output of the script should be added to diagnostic and 5 is much
much too small.
OK, this will be done.



> Ability to run a script when a task fails to capture stack traces
> -----------------------------------------------------------------
>
>                 Key: HADOOP-1857
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1857
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.14.0
>            Reporter: Amareshwari Sri Ramadasu
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.15.0
>
>         Attachments: patch-1857.txt, patch-1857.txt, patch-1857.txt, patch-1857.txt, patch1857.txt
>
>
> This basically is for providing a better user interface for debugging failed
> jobs. Today we see stack traces for failed tasks on the job ui if the job
> happened to be a Java MR job. For non-Java jobs like Streaming, Pipes, the
> diagnostic info on the job UI is not helpful enough to debug what might have
> gone wrong. They are usually framework traces and not app traces.
> We want to be able to provide a facility, via user-provided scripts, for doing
> post-processing on task logs, input, output, etc. There should be some default
> scripts like running core dumps under gdb for locating illegal instructions,
> the last few lines from stderr, etc.  These outputs could be sent to the
> tasktracker and in turn to the jobtracker which would then display it on the
> job UI on demand.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

