hadoop-common-dev mailing list archives

From "stack@archive.org (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1199) want InputFormat for task logs
Date Tue, 10 Apr 2007 23:46:32 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack@archive.org updated HADOOP-1199:

    Attachment: hadoop1199-v2.patch

Version 2.  Keys are no longer LongWritable line numbers. Each key is now a compound of host,
task id, and line number, e.g. debord.archive.org:task_0023_m_000000_0:11223.
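
To illustrate the compound key format, here is a minimal Java sketch of a key that parses and orders strings of the form host:taskid:linenumber. The class name TaskLogKey, its fields, and its methods are hypothetical and are not taken from the attached patch. It assumes the host part contains no colon (no port suffix) and that line numbers should compare numerically rather than as text.

```java
import java.util.Objects;

/**
 * Hypothetical compound key of the form host:taskid:linenumber.
 * Illustration only; not the class used in hadoop1199-v2.patch.
 */
final class TaskLogKey implements Comparable<TaskLogKey> {
    final String host;
    final String taskId;
    final long lineNumber;

    TaskLogKey(String host, String taskId, long lineNumber) {
        this.host = Objects.requireNonNull(host);
        this.taskId = Objects.requireNonNull(taskId);
        this.lineNumber = lineNumber;
    }

    /** Parses "host:taskid:linenumber"; assumes the host contains no ':'. */
    static TaskLogKey parse(String s) {
        int first = s.indexOf(':');
        int last = s.lastIndexOf(':');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("expected host:taskid:line, got: " + s);
        }
        String host = s.substring(0, first);
        String taskId = s.substring(first + 1, last);
        long line = Long.parseLong(s.substring(last + 1));
        return new TaskLogKey(host, taskId, line);
    }

    /** Orders by host, then task id, then line number (numerically). */
    @Override
    public int compareTo(TaskLogKey other) {
        int c = host.compareTo(other.host);
        if (c != 0) return c;
        c = taskId.compareTo(other.taskId);
        if (c != 0) return c;
        return Long.compare(lineNumber, other.lineNumber);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TaskLogKey)) return false;
        TaskLogKey k = (TaskLogKey) o;
        return host.equals(k.host) && taskId.equals(k.taskId) && lineNumber == k.lineNumber;
    }

    @Override
    public int hashCode() {
        return Objects.hash(host, taskId, lineNumber);
    }

    @Override
    public String toString() {
        return host + ":" + taskId + ":" + lineNumber;
    }
}
```

With this sketch, TaskLogKey.parse("debord.archive.org:task_0023_m_000000_0:11223") yields a key whose toString() returns the same string. Comparing line numbers as longs keeps line 9 ahead of line 10 within one task log, which a plain string comparison of the whole key would not do.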

> want InputFormat for task logs
> ------------------------------
>                 Key: HADOOP-1199
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1199
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: mapred
>            Reporter: Doug Cutting
>         Attachments: hadoop1199-v2.patch, hadoop1199.patch
> We should provide an InputFormat implementation that includes all the task logs from
> a job. Folks should be able to do something like:
> job = new JobConf();
> job.setInputFormatClass(TaskLogInputFormat.class);
> TaskLogInputFormat.setJobId(jobId);
> ...
> Tasks should ideally be localized to the node that each log is on.
> Examining logs should be as lightweight as possible, to facilitate debugging. It should
> not require a copy to HDFS. A faster debug loop is like a faster search engine: it makes people
> more productive. The sooner one can find that, e.g., most tasks failed with a NullPointerException
> on line 723, the better.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
