hadoop-common-dev mailing list archives

From "stack@archive.org (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-1181) userlogs reader
Date Wed, 28 Mar 2007 22:52:25 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack@archive.org updated HADOOP-1181:
--------------------------------------

    Attachment: hadoop1181.patch

Attached is a patch that changes TaskLog$Reader so it uses URLs instead of the file system.
It also:

+ Adds a constructor that takes a userlog subdirectory URL.
+ Adds a public getInputStream method that streams over all userlog parts.
+ Makes TaskLog and TaskLog$Reader public rather than default access.
+ Adds a main method that takes a URL and prints the concatenated logs to stdout.
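
To illustrate the streaming idea in the list above, here is a minimal standalone sketch. It is not the code in hadoop1181.patch: the class name UserLogConcat is hypothetical, and it assumes the caller already knows the URL of each log part, whereas the patch takes a single userlog subdirectory URL and finds the parts itself. It uses only the JDK, not the Hadoop API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for illustration; not the patched TaskLog$Reader. */
class UserLogConcat {

  /**
   * Returns a single stream over all the given parts, in list order.
   * Parts are opened one at a time as the previous one is exhausted.
   */
  static InputStream getInputStream(List<URL> parts) {
    final Iterator<URL> it = parts.iterator();
    Enumeration<InputStream> streams = new Enumeration<InputStream>() {
      @Override
      public boolean hasMoreElements() {
        return it.hasNext();
      }

      @Override
      public InputStream nextElement() {
        try {
          return it.next().openStream();
        } catch (IOException e) {
          // Enumeration cannot throw checked exceptions, so wrap it.
          throw new UncheckedIOException(e);
        }
      }
    };
    return new SequenceInputStream(streams);
  }

  /** Assumption: each argument is the URL of one log part. */
  public static void main(String[] args) throws IOException {
    List<URL> parts = new ArrayList<>();
    for (String arg : args) {
      parts.add(URI.create(arg).toURL());
    }
    byte[] buf = new byte[8192];
    try (InputStream in = getInputStream(parts)) {
      int n;
      while ((n = in.read(buf)) != -1) {
        System.out.write(buf, 0, n);
      }
    }
    System.out.flush();
  }
}
```

Because the parts are addressed by URL, the same reader works against file:, http:, or any other scheme the JVM has a handler for, which is what makes it usable from a task that is not running on the node holding the logs.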

I'll not mark this issue as 'patch ready' until others have had a gander. It would be great
if Arun C Murthy could review, since he wrote the original. In particular, it would be nice
to know if the calculation of totalLogSize in the TaskLog$Reader#fetchAll method (around
line 384 in r523437) is important. If not, then some near-duplicate code could be replaced
with a call to the new getInputStream in a version 2 of this patch.

> userlogs reader
> ---------------
>
>                 Key: HADOOP-1181
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1181
>             Project: Hadoop
>          Issue Type: Improvement
>            Reporter: stack@archive.org
>         Attachments: hadoop1181.patch
>
>
> My jobs output lots of logging. I want to be able to quickly parse the logs across the
> cluster for anomalies. org.apache.hadoop.tool.Logalyzer looks promising at first but it does
> not know how to deal with the userlog format and it wants to first copy all logs local.
> Digging, there does not seem to currently be a reader for hadoop userlog format. TaskLog$Reader
> is not generally accessible and it too expects logs to be on the local filesystem (the latter
> is of little good if I want to run the analysis as a mapreduce job).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

