hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1181) userlogs reader
Date Tue, 03 Apr 2007 18:08:32 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12486448 ]

Doug Cutting commented on HADOOP-1181:

> Get all tasklogs for a given jobid
> $ hadoop job <id> -tasklogs

The natural use of this is for debugging, right?  So one might pipe this into
'grep | sort | uniq -c'.  Except, for a big job, that's not scalable.  So personally
I'd prioritize HADOOP-1199 ahead of this.  Or am I misunderstanding how this will be used?
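
To make the scalability point concrete, here is a minimal sketch (not part of the issue or its patches) of what 'grep | sort | uniq -c' computes, split into a per-task counting step and a merge step. The function names and the sample log lines are hypothetical. The assumption is that each task's log can be counted independently, which is what would let the counting run where the logs live instead of pulling every line through one pipe.

```python
import re
from collections import Counter
from typing import Iterable


def count_matches(lines: Iterable[str], pattern: str) -> Counter:
    """Count identical matching lines in one stream.

    Roughly what `grep PATTERN | sort | uniq -c` yields for a single log.
    """
    regex = re.compile(pattern)
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if regex.search(line):
            counts[line] += 1
    return counts


def merge_counts(partials: Iterable[Counter]) -> Counter:
    """Sum per-task counts into one total (the reduce-like step)."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical sample data standing in for the logs of two tasks.
task_logs = [
    ["INFO start", "WARN slow read", "WARN slow read"],
    ["WARN slow read", "ERROR disk full", "INFO done"],
]

partials = [count_matches(log, r"WARN|ERROR") for log in task_logs]
totals = merge_counts(partials)

# Print in the same "count line" layout that `uniq -c` uses.
for line, n in sorted(totals.items()):
    print(f"{n:7d} {line}")
```

Only the small per-task Counter objects need to be brought together in the merge step, so the amount of data moved depends on the number of distinct matching lines rather than on total log volume.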

> userlogs reader
> ---------------
>                 Key: HADOOP-1181
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1181
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: stack@archive.org
>         Attachments: hadoop1181-v2.patch, hadoop1181.patch
> My jobs output a lot of logging.  I want to be able to quickly parse the logs across the
> cluster for anomalies.  org.apache.hadoop.tool.Logalyzer looks promising at first, but it does
> not know how to deal with the userlog format, and it wants to first copy all logs locally.
> On digging, there does not currently seem to be a reader for the hadoop userlog format.
> TaskLog$Reader is not generally accessible, and it too expects logs to be on the local
> filesystem (which is of little use if I want to run the analysis as a mapreduce job).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
